Commit d2ff109
Add offline dataset generation (#39)
* add offline dataset gen
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
* remove type hints for dynamic classes
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
* add evaluation logic for predictors
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
* add batch processing in dataset gen
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
* random sampling for sequence length
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
* add dataset append for memory efficiency
Signed-off-by: Varun Khare <varun.khare@niimbleedgehq.ai>
* add streaming support for training
Signed-off-by: Varun Khare <varun.khare@niimbleedgehq.ai>
* increase defaults for dataset generation
Signed-off-by: Varun Khare <varun.khare@niimbleedgehq.ai>
* rename files and enable full dataset loading
Signed-off-by: Varun Khare <varun.khare@niimbleedgehq.ai>
* add resume checkpointing
Signed-off-by: Varun Khare <varun.khare@niimbleedgehq.ai>
---------
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varun.khare@niimbleedgehq.ai>
Co-authored-by: Varun Khare <varun.khare@niimbleedgehq.ai>
1 parent d70d2b6 commit d2ff109
File tree
18 files changed: +1555 -1092 lines changed
- src
- models
- llama
- mistral
- phi3
- qwen2
File renamed without changes.