
Commit cb6c8b1 (1 parent: 91b44eb)

Add support for GPT OSS

File tree: 6 files changed (+80 −2 lines)

README.md

Lines changed: 1 addition & 0 deletions
@@ -320,6 +320,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te
 1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (from KAIST) released with the paper [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://huggingface.co/papers/2201.07436) by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
 1. **[GPT Neo](https://huggingface.co/docs/transformers/model_doc/gpt_neo)** (from EleutherAI) released in the repository [EleutherAI/gpt-neo](https://github.com/EleutherAI/gpt-neo) by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
 1. **[GPT NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)** (from EleutherAI) released with the paper [GPT-NeoX-20B: An Open-Source Autoregressive Language Model](https://huggingface.co/papers/2204.06745) by Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach
+1. **[GPT OSS](https://huggingface.co/docs/transformers/model_doc/gpt_oss)** (from OpenAI) released with the blog [Introducing gpt-oss](https://openai.com/index/introducing-gpt-oss/) by Sandhini Agarwal, Lama Ahmad, Jason Ai, Sam Altman, Andy Applebaum, Edwin Arbus, Rahul K. Arora, Yu Bai, Bowen Baker, Haiming Bao, Boaz Barak, Ally Bennett, Tyler Bertao, Nivedita Brett, Eugene Brevdo, Greg Brockman, Sebastien Bubeck, Che Chang, Kai Chen, Mark Chen, Enoch Cheung, Aidan Clark, Dan Cook, Marat Dukhan, Casey Dvorak, Kevin Fives, Vlad Fomenko, Timur Garipov, Kristian Georgiev, Mia Glaese, Tarun Gogineni, Adam Goucher, Lukas Gross, Katia Gil Guzman, John Hallman, Jackie Hehir, Johannes Heidecke, Alec Helyar, Haitang Hu, Romain Huet, Jacob Huh, Saachi Jain, Zach Johnson, Chris Koch, Irina Kofman, Dominik Kundel, Jason Kwon, Volodymyr Kyrylov, Elaine Ya Le, Guillaume Leclerc, James Park Lennon, Scott Lessans, Mario Lezcano-Casado, Yuanzhi Li, Zhuohan Li, Ji Lin, Jordan Liss, Lily (Xiaoxuan) Liu, Jiancheng Liu, Kevin Lu, Chris Lu, Zoran Martinovic, Lindsay McCallum, Josh McGrath, Scott McKinney, Aidan McLaughlin, Song Mei, Steve Mostovoy, Tong Mu, Gideon Myles, Alexander Neitz, Alex Nichol, Jakub Pachocki, Alex Paino, Dana Palmie, Ashley Pantuliano, Giambattista Parascandolo, Jongsoo Park, Leher Pathak, Carolina Paz, Ludovic Peran, Dmitry Pimenov, Michelle Pokrass, Elizabeth Proehl, Huida Qiu, Gaby Raila, Filippo Raso, Hongyu Ren, Kimmy Richardson, David Robinson, Bob Rotsted, Hadi Salman, Suvansh Sanjeev, Max Schwarzer, D. Sculley, Harshit Sikchi, Kendal Simon, Karan Singhal, Yang Song, Dane Stuckey, Zhiqing Sun, Philippe Tillet, Sam Toizer, Foivos Tsimpourlas, Nikhil Vyas, Eric Wallace, Xin Wang, Miles Wang, Olivia Watkins, Kevin Weil, Amy Wendling, Kevin Whinnery, Cedric Whitney, Hannah Wong, Lin Yang, Yu Yang, Michihiro Yasunaga, Kristen Ying, Wojciech Zaremba, Wenting Zhan, Cyril Zhang, Brian Zhang, Eddie Zhang, Shengjia Zhao.
 1. **[GPT-2](https://huggingface.co/docs/transformers/model_doc/gpt2)** (from OpenAI) released with the paper [Language Models are Unsupervised Multitask Learners](https://blog.openai.com/better-language-models/) by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
 1. **[GPT-J](https://huggingface.co/docs/transformers/model_doc/gptj)** (from EleutherAI) released in the repository [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/) by Ben Wang and Aran Komatsuzaki.
 1. **[GPTBigCode](https://huggingface.co/docs/transformers/model_doc/gpt_bigcode)** (from BigCode) released with the paper [SantaCoder: don't reach for the stars!](https://huggingface.co/papers/2301.03988) by Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra.

docs/snippets/5_supported-models.snippet

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@
 1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (from KAIST) released with the paper [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://huggingface.co/papers/2201.07436) by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
 1. **[GPT Neo](https://huggingface.co/docs/transformers/model_doc/gpt_neo)** (from EleutherAI) released in the repository [EleutherAI/gpt-neo](https://github.com/EleutherAI/gpt-neo) by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
 1. **[GPT NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)** (from EleutherAI) released with the paper [GPT-NeoX-20B: An Open-Source Autoregressive Language Model](https://huggingface.co/papers/2204.06745) by Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach
+1. **[GPT OSS](https://huggingface.co/docs/transformers/model_doc/gpt_oss)** (from OpenAI) released with the blog [Introducing gpt-oss](https://openai.com/index/introducing-gpt-oss/) by Sandhini Agarwal, Lama Ahmad, Jason Ai, Sam Altman, Andy Applebaum, Edwin Arbus, Rahul K. Arora, Yu Bai, Bowen Baker, Haiming Bao, Boaz Barak, Ally Bennett, Tyler Bertao, Nivedita Brett, Eugene Brevdo, Greg Brockman, Sebastien Bubeck, Che Chang, Kai Chen, Mark Chen, Enoch Cheung, Aidan Clark, Dan Cook, Marat Dukhan, Casey Dvorak, Kevin Fives, Vlad Fomenko, Timur Garipov, Kristian Georgiev, Mia Glaese, Tarun Gogineni, Adam Goucher, Lukas Gross, Katia Gil Guzman, John Hallman, Jackie Hehir, Johannes Heidecke, Alec Helyar, Haitang Hu, Romain Huet, Jacob Huh, Saachi Jain, Zach Johnson, Chris Koch, Irina Kofman, Dominik Kundel, Jason Kwon, Volodymyr Kyrylov, Elaine Ya Le, Guillaume Leclerc, James Park Lennon, Scott Lessans, Mario Lezcano-Casado, Yuanzhi Li, Zhuohan Li, Ji Lin, Jordan Liss, Lily (Xiaoxuan) Liu, Jiancheng Liu, Kevin Lu, Chris Lu, Zoran Martinovic, Lindsay McCallum, Josh McGrath, Scott McKinney, Aidan McLaughlin, Song Mei, Steve Mostovoy, Tong Mu, Gideon Myles, Alexander Neitz, Alex Nichol, Jakub Pachocki, Alex Paino, Dana Palmie, Ashley Pantuliano, Giambattista Parascandolo, Jongsoo Park, Leher Pathak, Carolina Paz, Ludovic Peran, Dmitry Pimenov, Michelle Pokrass, Elizabeth Proehl, Huida Qiu, Gaby Raila, Filippo Raso, Hongyu Ren, Kimmy Richardson, David Robinson, Bob Rotsted, Hadi Salman, Suvansh Sanjeev, Max Schwarzer, D. Sculley, Harshit Sikchi, Kendal Simon, Karan Singhal, Yang Song, Dane Stuckey, Zhiqing Sun, Philippe Tillet, Sam Toizer, Foivos Tsimpourlas, Nikhil Vyas, Eric Wallace, Xin Wang, Miles Wang, Olivia Watkins, Kevin Weil, Amy Wendling, Kevin Whinnery, Cedric Whitney, Hannah Wong, Lin Yang, Yu Yang, Michihiro Yasunaga, Kristen Ying, Wojciech Zaremba, Wenting Zhan, Cyril Zhang, Brian Zhang, Eddie Zhang, Shengjia Zhao.
 1. **[GPT-2](https://huggingface.co/docs/transformers/model_doc/gpt2)** (from OpenAI) released with the paper [Language Models are Unsupervised Multitask Learners](https://blog.openai.com/better-language-models/) by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
 1. **[GPT-J](https://huggingface.co/docs/transformers/model_doc/gptj)** (from EleutherAI) released in the repository [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/) by Ben Wang and Aran Komatsuzaki.
 1. **[GPTBigCode](https://huggingface.co/docs/transformers/model_doc/gpt_bigcode)** (from BigCode) released with the paper [SantaCoder: don't reach for the stars!](https://huggingface.co/papers/2301.03988) by Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra.

src/configs.js

Lines changed: 1 addition & 0 deletions
@@ -108,6 +108,7 @@ function getNormalizedConfig(config) {
             mapping['num_layers'] = 'num_hidden_layers';
             mapping['hidden_size'] = 'hidden_size';
             break;
+        case 'gpt_oss':
         case 'llama':
         case 'llama4_text':
         case 'nanochat':

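The one-line configs.js change works because of JavaScript `switch` fall-through: an empty `case 'gpt_oss':` label shares the branch below it, so GPT OSS configs are normalized with the same key mapping as Llama-family models. A minimal sketch of that pattern — the key names in the `'llama'` branch and the `normalizeConfig` helper are illustrative assumptions, not copied from the repository:

```javascript
// Sketch of the fall-through dispatch in getNormalizedConfig.
// Only the structure mirrors the actual source; the llama-branch
// key names below are assumptions for illustration.
function normalizeConfig(config) {
    const mapping = {};
    switch (config.model_type) {
        case 'gpt_oss': // new: falls through to the llama-style branch
        case 'llama':
            mapping['num_heads'] = 'num_attention_heads';
            mapping['num_layers'] = 'num_hidden_layers';
            mapping['hidden_size'] = 'hidden_size';
            break;
        default:
            throw new Error(`Unsupported model type: ${config.model_type}`);
    }
    // Copy config values under the normalized key names
    const normalized = {};
    for (const [to, from] of Object.entries(mapping)) {
        normalized[to] = config[from];
    }
    return normalized;
}
```

Because the new case carries no body of its own, GPT OSS automatically picks up any future changes to the shared branch.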
src/models.js

Lines changed: 15 additions & 1 deletion
@@ -2051,7 +2051,10 @@ export class PreTrainedModel extends Callable {
         // In most cases, this will be [batch_size, 1, vocab_size]
         // So, we select the last token's logits:
         // (equivalent to `logits = outputs.logits[:, -1, :]`)
-        const logits = outputs.logits.slice(null, -1, null);
+        // The `.to('float32')` is necessary for models with float16 logits,
+        // and is a no-op for float32 logits.
+        // TODO: Support float16 sampling in the sampler directly
+        const logits = outputs.logits.slice(null, -1, null).to('float32');

         const next_tokens_scores = prepared_logits_processor(all_input_ids, logits);

@@ -4676,6 +4679,15 @@ export class GPT2LMHeadModel extends GPT2PreTrainedModel {}
 // }
 //////////////////////////////////////////////////

+
+//////////////////////////////////////////////////
+// GPT OSS models
+export class GptOssPreTrainedModel extends PreTrainedModel {}
+export class GptOssModel extends GptOssPreTrainedModel {}
+export class GptOssForCausalLM extends GptOssPreTrainedModel {}
+//////////////////////////////////////////////////
+
+
 //////////////////////////////////////////////////
 // JAIS models
 export class JAISPreTrainedModel extends PreTrainedModel {}
@@ -8267,6 +8279,7 @@ const MODEL_MAPPING_NAMES_DECODER_ONLY = new Map([
     ['bloom', ['BloomModel', BloomModel]],
     ['jais', ['JAISModel', JAISModel]],
     ['gpt2', ['GPT2Model', GPT2Model]],
+    ['gpt_oss', ['GptOssModel', GptOssModel]],
     ['gptj', ['GPTJModel', GPTJModel]],
     ['gpt_bigcode', ['GPTBigCodeModel', GPTBigCodeModel]],
     ['gpt_neo', ['GPTNeoModel', GPTNeoModel]],
@@ -8380,6 +8393,7 @@ const MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING_NAMES = new Map([
 const MODEL_FOR_CAUSAL_LM_MAPPING_NAMES = new Map([
     ['bloom', ['BloomForCausalLM', BloomForCausalLM]],
     ['gpt2', ['GPT2LMHeadModel', GPT2LMHeadModel]],
+    ['gpt_oss', ['GptOssForCausalLM', GptOssForCausalLM]],
     ['jais', ['JAISLMHeadModel', JAISLMHeadModel]],
     ['gptj', ['GPTJForCausalLM', GPTJForCausalLM]],
     ['gpt_bigcode', ['GPTBigCodeForCausalLM', GPTBigCodeForCausalLM]],

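The models.js registrations follow the library's Map-based dispatch: a config's `model_type` string is looked up in a mapping table to find the class to instantiate, so exposing GPT OSS only requires the three new (empty) classes plus two Map entries. A self-contained sketch of that pattern, using stub classes in place of the real `PreTrainedModel` hierarchy:

```javascript
// Minimal sketch of the model_type → class dispatch that the two new
// Map entries hook into. The stub classes below stand in for the real
// ones; only the lookup pattern mirrors the library.
class PreTrainedModel {}
class GPT2LMHeadModel extends PreTrainedModel {}
class GptOssForCausalLM extends PreTrainedModel {}

const MODEL_FOR_CAUSAL_LM_MAPPING = new Map([
    ['gpt2', GPT2LMHeadModel],
    ['gpt_oss', GptOssForCausalLM], // the kind of entry this commit adds
]);

function modelClassFor(modelType) {
    const cls = MODEL_FOR_CAUSAL_LM_MAPPING.get(modelType);
    if (!cls) throw new Error(`Unsupported model type: ${modelType}`);
    return cls;
}
```

Since the architecture-specific behavior lives in the exported ONNX graph and the normalized config, the new classes can stay empty: registration alone is enough to route `gpt_oss` checkpoints through the shared causal-LM code path.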
src/utils/maths.js

Lines changed: 58 additions & 0 deletions
@@ -1044,3 +1044,61 @@ export function dynamic_time_warping(matrix) {

     return [text_indices, time_indices];
 }
+
+/**
+ * Efficiently converts a Uint16Array of float16 values to a Float32Array.
+ * This implementation uses a lazily initialized lookup table (LUT) for fast conversion.
+ */
+export const uint16_to_float32 = (function () {
+    let float16LUT = null; // The Lookup Table
+
+    return function (/** @type {Uint16Array} */ u16Array) {
+        if (!float16LUT) {
+            // Lazily initialize LUT
+            float16LUT = new Float32Array(65536);
+            const buffer = new ArrayBuffer(4);
+            const u32 = new Uint32Array(buffer);
+            const f32 = new Float32Array(buffer);
+
+            for (let i = 0; i < float16LUT.length; ++i) {
+                let outBits = 0;
+                const sign = (i & 0x8000) << 16;
+                const exp = (i & 0x7c00) >> 10;
+                let mantissa = i & 0x03ff;
+
+                if (exp === 0x1f) {
+                    // Infinity or NaN
+                    outBits = sign | 0x7f800000 | (mantissa << 13);
+                } else if (exp === 0) {
+                    // Zero or Subnormal
+                    if (mantissa === 0) {
+                        outBits = sign;
+                    } else {
+                        let renormExp = 113;
+                        while ((mantissa & 0x0400) === 0) {
+                            mantissa <<= 1;
+                            --renormExp;
+                        }
+                        mantissa &= ~0x0400;
+                        outBits = sign | (renormExp << 23) | (mantissa << 13);
+                    }
+                } else {
+                    // Normal
+                    outBits = sign | ((exp + 112) << 23) | (mantissa << 13);
+                }
+
+                u32[0] = outBits;
+                float16LUT[i] = f32[0];
+            }
+        }
+
+        const length = u16Array.length;
+        const lut = float16LUT;
+        const out = new Float32Array(length);
+        for (let i = 0; i < length; ++i) {
+            out[i] = lut[u16Array[i]];
+        }
+
+        return out;
+    };
+})();

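The LUT above simply precomputes, for every 16-bit pattern, the IEEE 754 binary16 → binary32 widening: copy the sign, rebias the exponent (bias 15 → bias 127, hence the `+ 112`), shift the 10-bit mantissa into the 23-bit field, and treat zero/subnormal/Inf/NaN specially. A scalar version of the same bit manipulation, useful for checking individual values (`float16BitsToFloat32` is a hypothetical helper name, not part of the library):

```javascript
// Scalar float16 → float32 conversion using the same bit logic as the
// LUT initializer above: sign copy, exponent rebias by 112, mantissa
// shift by 13, with renormalization for subnormals.
function float16BitsToFloat32(h) {
    const sign = (h & 0x8000) << 16;
    const exp = (h & 0x7c00) >> 10;
    let mantissa = h & 0x03ff;
    let outBits;
    if (exp === 0x1f) {
        // Infinity (mantissa 0) or NaN (mantissa non-zero)
        outBits = sign | 0x7f800000 | (mantissa << 13);
    } else if (exp === 0) {
        if (mantissa === 0) {
            outBits = sign; // signed zero
        } else {
            // Subnormal: shift the mantissa up until its implicit bit appears
            let renormExp = 113;
            while ((mantissa & 0x0400) === 0) {
                mantissa <<= 1;
                --renormExp;
            }
            mantissa &= ~0x0400;
            outBits = sign | (renormExp << 23) | (mantissa << 13);
        }
    } else {
        // Normal number: rebias exponent from 15 to 127
        outBits = sign | ((exp + 112) << 23) | (mantissa << 13);
    }
    // Reinterpret the assembled 32-bit pattern as a float
    const buffer = new ArrayBuffer(4);
    new Uint32Array(buffer)[0] = outBits;
    return new Float32Array(buffer)[0];
}
```

For example, `0x3C00` decodes to `1.0` and `0xC000` to `-2.0`. Every float16 value is exactly representable in float32, so the widening is lossless, which is why a 65536-entry table can cover the whole type.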
src/utils/tensor.js

Lines changed: 4 additions & 1 deletion
@@ -7,7 +7,7 @@
  * @module utils/tensor
  */

-import { interpolate_data, max, min, permute_data } from './maths.js';
+import { interpolate_data, max, min, permute_data, uint16_to_float32 } from './maths.js';

 import { Tensor as ONNXTensor, isONNXTensor } from '../backends/onnx.js';

@@ -835,6 +835,9 @@ export class Tensor {
         } else {
             map_fn = BigInt;
         }
+    } else if (this.type === 'float16' && type == 'float32' && this.data instanceof Uint16Array) {
+        // Certain runtimes do not support Float16Array, so the values are stored in Uint16Array
+        return new Tensor(type, uint16_to_float32(this.data), this.dims);
     }

     // @ts-ignore

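The tensor.js branch is what makes the `.to('float32')` call in the generation loop work for float16 models: when the runtime lacks `Float16Array`, the half-precision output arrives as raw bits in a `Uint16Array`, and the cast decodes those bits in bulk instead of applying an element-wise `map_fn`. A stub sketch of that dispatch — `FakeTensor` and the inline arithmetic decoder are illustrative stand-ins, not the library's real classes:

```javascript
// Decode one float16 bit pattern arithmetically (a slower but compact
// alternative to the bit-twiddling/LUT approach used by the library).
function decodeHalf(h) {
    const sign = h & 0x8000 ? -1 : 1;
    const exp = (h & 0x7c00) >> 10;
    const mantissa = h & 0x03ff;
    if (exp === 0) return sign * (mantissa / 1024) * 2 ** -14; // zero/subnormal
    if (exp === 0x1f) return mantissa ? NaN : sign * Infinity; // NaN/Inf
    return sign * (1 + mantissa / 1024) * 2 ** (exp - 15);     // normal
}

// Sketch of the new cast branch in Tensor.to: float16 data stored as
// raw Uint16Array bits is decoded wholesale into a Float32Array.
class FakeTensor {
    constructor(type, data, dims) {
        this.type = type;
        this.data = data;
        this.dims = dims;
    }
    to(type) {
        if (this.type === type) return this; // no-op, as for float32 logits
        if (this.type === 'float16' && type === 'float32' && this.data instanceof Uint16Array) {
            return new FakeTensor(type, Float32Array.from(this.data, decodeHalf), this.dims);
        }
        throw new Error(`Unsupported cast: ${this.type} -> ${type}`);
    }
}
```

For instance, `new FakeTensor('float16', new Uint16Array([0x3c00, 0x4000]), [2]).to('float32')` yields float32 data `[1, 2]` while preserving the dims.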