Commit 576b1be
Implement speculative decoding for batch_size > 1. Applicable when the target and draft models share the same tokenizer.
1 parent: 47e795b
4 files changed (+449, -112 lines):
- src/transformers/generation
- tests/generation
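
The commit message concerns speculative decoding with more than one sequence per batch. As a rough illustration of why batching complicates things, and not the code from this commit, the sketch below counts how many draft tokens each row would accept under a simple greedy rule (a draft token is accepted while it matches the target model's choice at that position). The function name `count_accepted`, the greedy rule, and the token values are assumptions made for this example. Comparing token ids directly like this only makes sense when both models share a tokenizer, as the commit message requires.

```python
from typing import List


def count_accepted(
    draft_tokens: List[List[int]], target_tokens: List[List[int]]
) -> List[int]:
    """Hypothetical helper: per-row length of the matching token prefix.

    draft_tokens[i][j] is the j-th token proposed by the draft model for
    row i; target_tokens[i][j] is the token the target model would pick
    at the same position. Both use the same vocabulary.
    """
    accepted = []
    for draft_row, target_row in zip(draft_tokens, target_tokens):
        n = 0
        for d, t in zip(draft_row, target_row):
            if d != t:
                break  # first mismatch ends acceptance for this row
            n += 1
        accepted.append(n)
    return accepted


# Three sequences, four drafted tokens each (made-up token ids).
draft = [[5, 7, 9, 2], [5, 8, 9, 2], [1, 1, 1, 1]]
target = [[5, 7, 9, 2], [5, 7, 9, 2], [0, 1, 1, 1]]
accepted = count_accepted(draft, target)
print(accepted)
```

Each row can accept a different number of tokens, so a batched implementation has to cope with sequences that advance by different amounts in the same step, for example through padding or per-row bookkeeping. How this commit handles that is not shown here.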