Commit dca1f2c
Support skip attention in export
Summary:
Support export for Llama model variants with attention-layer skipping. We only need to specify the attention skip pattern in config.json via the layer_types field, e.g.,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"skip_attention",
"skip_attention",
"skip_attention"
]
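
A minimal sketch of how a model builder could consume this config during export, assuming a PyTorch-style stack; the "layer_types" key and its two values come from the summary above, while every class, function, and parameter name below is hypothetical:

    import json
    import torch.nn as nn

    class Block(nn.Module):
        """One transformer block; skip-attention variants keep only the FFN."""
        def __init__(self, dim: int, use_attention: bool):
            super().__init__()
            # "skip_attention" layers are built without an attention sublayer.
            self.attn = nn.MultiheadAttention(dim, num_heads=4) if use_attention else None
            self.ffn = nn.Sequential(
                nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
            )

        def forward(self, x):
            if self.attn is not None:
                attn_out, _ = self.attn(x, x, x)
                x = x + attn_out
            return x + self.ffn(x)

    def build_blocks(config_path: str, dim: int = 64) -> nn.Sequential:
        """Read layer_types from config.json and build one block per entry."""
        with open(config_path) as f:
            layer_types = json.load(f)["layer_types"]
        return nn.Sequential(*(Block(dim, t == "full_attention") for t in layer_types))

Under this reading, a skipped layer contributes no attention ops to the traced graph, so the exporter presumably needs no handling beyond reading the pattern.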
Differential Revision: D883995331
parent 4014597
commit dca1f2c
1 file changed: +7 −0
[Diff content not captured in this extraction; the seven added lines fall at new-file lines 135-137 and 278-281.]