Commit 72fd468
support skip atten in export (#16104)
Summary:
Support export for Llama model variants with attention layer skipping. Only the attention skip pattern needs to be specified in the `layer_types` field of config.json, e.g.:
"layer_types": [
    "full_attention",
    "full_attention",
    "full_attention",
    "skip_attention",
    "skip_attention",
    "skip_attention"
]
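A minimal sketch of how the per-layer decision could be derived from this config. The field name `layer_types` and the values `full_attention`/`skip_attention` mirror the snippet above; the helper function and defaulting behavior are illustrative assumptions, not the actual export implementation.

```python
import json

# Config mirroring the snippet above (hypothetical stand-in for config.json).
config = json.loads("""
{
  "layer_types": [
    "full_attention",
    "full_attention",
    "full_attention",
    "skip_attention",
    "skip_attention",
    "skip_attention"
  ]
}
""")

def attention_is_skipped(layer_idx: int, cfg: dict = config) -> bool:
    """Return True if this layer's attention block should be bypassed.

    Assumption: layers without a corresponding entry default to full
    attention. This defaulting rule is illustrative, not from the commit.
    """
    layer_types = cfg.get("layer_types", [])
    if layer_idx >= len(layer_types):
        return False
    return layer_types[layer_idx] == "skip_attention"

# For the six-layer example above, the last three layers skip attention.
skipped = [i for i in range(6) if attention_is_skipped(i)]
# skipped == [3, 4, 5]
```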
Differential Revision: D883995331
1 file changed: +7 −0 lines changed