Commit 9e91412

[Cria][Llama runner] Use caching temp allocator
Use of the caching allocator improves TITO model performance by 6+%. Repro instructions will be added here, but the next diff is required to see the impact.

Differential Revision: [D85532078](https://our.internmc.facebook.com/intern/diff/D85532078/)
ghstack-source-id: 327095993
Pull Request resolved: #16080
1 parent 15dc0f1 commit 9e91412
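
For readers unfamiliar with the term in the commit title, the following is a minimal sketch of what a caching temp allocator does: it keeps freed scratch buffers in a size-indexed cache and hands them back on later requests instead of calling `malloc` each time. The class name `CachingTempAllocator` and all of its methods are hypothetical and are not the ExecuTorch API. The byte cap is an assumption suggested by the `max_cached_memory_size_bytes_` identifier visible in the diff context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical illustration only; not the ExecuTorch allocator API.
class CachingTempAllocator {
 public:
  // max_cached_bytes caps how much freed memory is kept for reuse (assumed).
  explicit CachingTempAllocator(std::size_t max_cached_bytes)
      : max_cached_bytes_(max_cached_bytes) {}

  ~CachingTempAllocator() {
    reset();  // Move or free every live block first.
    for (auto& entry : free_blocks_) {
      std::free(entry.second);
    }
  }

  CachingTempAllocator(const CachingTempAllocator&) = delete;
  CachingTempAllocator& operator=(const CachingTempAllocator&) = delete;

  // Returns a block of at least `size` bytes. A cached block is reused when
  // one of sufficient size exists; otherwise the request falls back to malloc.
  void* allocate(std::size_t size) {
    assert(size > 0);
    auto it = free_blocks_.lower_bound(size);  // Smallest cached block >= size.
    if (it != free_blocks_.end()) {
      Block block{it->second, it->first};
      cached_bytes_ -= block.size;
      free_blocks_.erase(it);
      live_blocks_.push_back(block);
      ++cache_hits_;
      return block.ptr;
    }
    void* ptr = std::malloc(size);
    if (ptr == nullptr) {
      return nullptr;
    }
    live_blocks_.push_back({ptr, size});
    return ptr;
  }

  // Ends the lifetime of all temporaries (for example after one inference
  // step). Blocks go back to the cache until the byte cap is reached; the
  // rest are released to the system.
  void reset() {
    for (const Block& block : live_blocks_) {
      if (cached_bytes_ + block.size <= max_cached_bytes_) {
        free_blocks_.emplace(block.size, block.ptr);
        cached_bytes_ += block.size;
      } else {
        std::free(block.ptr);
      }
    }
    live_blocks_.clear();
  }

  std::size_t cache_hits() const { return cache_hits_; }
  std::size_t cached_bytes() const { return cached_bytes_; }

 private:
  struct Block {
    void* ptr;
    std::size_t size;
  };

  std::size_t max_cached_bytes_;
  std::size_t cached_bytes_ = 0;
  std::size_t cache_hits_ = 0;
  std::multimap<std::size_t, void*> free_blocks_;  // size -> cached block
  std::vector<Block> live_blocks_;                 // blocks handed out
};
```

In this sketch a caller would invoke `allocate` for scratch buffers during a step and `reset` once the step finishes, so repeated steps with similar buffer sizes mostly hit the cache. Whether the real allocator uses the same reuse policy is not stated in the commit.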

1 file changed: +0 -1 lines changed

extension/llm/runner/llm_runner_helper.cpp

Lines changed: 0 additions & 1 deletion
@@ -225,7 +225,6 @@ std::unique_ptr<TextLLMRunner> create_text_llm_runner(
           max_cached_memory_size_bytes_));
   } else {
     module = std::make_unique<Module>(
-        model_path,
         model_path,
         Module::LoadMode::File,
         std::move(event_tracer), // event tracer
