Commit 04d35ac

[CE]add wint4 ep (#5355)
1 parent d5a9b75 commit 04d35ac

1 file changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+num_gpu_blocks_override: 1024
+max_model_len: 8192
+max_num_seqs: 64
+data_parallel_size: 4
+tensor_parallel_size: 1
+enable_expert_parallel: True
+quantization: wint4
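
The added file is a flat list of `key: value` settings. As a quick sanity check, the sketch below parses those seven lines and derives how many devices the layout would need. This is an illustration only: `parse_flat_config` and `devices_needed` are hypothetical helpers, not part of the project, and the rule "devices = data_parallel_size * tensor_parallel_size" is an assumption about how the two sizes combine rather than something this commit states.

```python
# The seven settings added by this commit, copied verbatim from the diff.
CONFIG_TEXT = """\
num_gpu_blocks_override: 1024
max_model_len: 8192
max_num_seqs: 64
data_parallel_size: 4
tensor_parallel_size: 1
enable_expert_parallel: True
quantization: wint4
"""


def parse_flat_config(text):
    """Parse flat `key: value` lines into a dict.

    Hypothetical helper: handles only the booleans, integers, and bare
    strings seen in this file. It is not a general YAML parser.
    """
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, raw = line.partition(":")
        raw = raw.strip()
        if raw in ("True", "true"):
            value = True
        elif raw in ("False", "false"):
            value = False
        elif raw.isdigit():
            value = int(raw)
        else:
            value = raw  # e.g. the quantization name
        config[key.strip()] = value
    return config


def devices_needed(config):
    """Assumption: each data-parallel replica spans tensor_parallel_size devices."""
    return config["data_parallel_size"] * config["tensor_parallel_size"]


config = parse_flat_config(CONFIG_TEXT)
for key, value in config.items():
    print(f"{key} = {value!r}")
print("devices needed (assumed dp * tp):", devices_needed(config))
```

Under that assumption, the layout in this file (`data_parallel_size: 4`, `tensor_parallel_size: 1`) would occupy four devices, with expert parallelism enabled on top of the data-parallel replicas.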
