
Commit 3826f44

Fix uninitialized aggregate_sampling_time_ms in Stats struct (#15820)
## Summary

Fixes a critical bug where `aggregate_sampling_time_ms` in the `Stats` struct was not initialized, causing it to contain garbage data from uninitialized memory.

## Problem

The `aggregate_sampling_time_ms` member variable was declared without initialization:

```cpp
long aggregate_sampling_time_ms; // uninitialized!
```

This resulted in absurd sampling time reports like:

```
Sampling time over 68 tokens: 8433599.048000 (seconds) // ~97.5 days!
```

The actual sampling time should have been milliseconds, not millions of seconds. Since the code accumulates timing data into this variable (`stats_.aggregate_sampling_time_ms += ...`), the garbage initial value propagated through all subsequent calculations.

## Solution

Initialize the variable to zero in both locations: `long aggregate_sampling_time_ms = 0;`

## Impact

After this fix, sampling time metrics will report realistic values (e.g., 0.010-0.100 seconds for typical token generation) instead of garbage values.
1 parent 8e33788 commit 3826f44
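
For context, the following is a minimal, self-contained C++ sketch of the bug class described above; it is not the ExecuTorch sources, and the `StatsBuggy`/`StatsFixed` names are hypothetical. It shows why a plain `long` member without an in-class initializer starts with an indeterminate value when the enclosing object is default-constructed, so any `+=` accumulation builds on garbage, while the `= 0` default member initializer gives a well-defined zero start.

```cpp
#include <cstdio>

// Hypothetical names, not the ExecuTorch structs.
struct StatsBuggy {
  long aggregate_sampling_time_ms;      // indeterminate after default-init
};

struct StatsFixed {
  long aggregate_sampling_time_ms = 0;  // well-defined starting value
};

int main() {
  // `new StatsBuggy` default-initializes the object, leaving the built-in
  // member with an indeterminate ("garbage") value; the same happens for a
  // non-static local `StatsBuggy s;`.
  StatsBuggy* buggy = new StatsBuggy;
  StatsFixed fixed;  // member is 0 thanks to the in-class initializer

  for (int token = 0; token < 68; ++token) {
    long sample_ms = 1;  // stand-in for a measured per-token duration
    // buggy->aggregate_sampling_time_ms += sample_ms;  // would read an
    //                                                  // indeterminate value
    fixed.aggregate_sampling_time_ms += sample_ms;
  }

  std::printf("fixed total: %ld ms\n", fixed.aggregate_sampling_time_ms);  // 68
  delete buggy;
  return 0;
}
```

Because the initializer is part of the struct definition, every default-constructed instance starts the running total at zero, regardless of where the object is created.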

File tree

  • examples/qualcomm/qaihub_scripts/llama/runner
  • extension/llm/runner

2 files changed (+2, -2 lines)

examples/qualcomm/qaihub_scripts/llama/runner/runner.h

Lines changed: 1 addition & 1 deletion
```diff
@@ -55,7 +55,7 @@ class Runner {
   // inference_end_ms: End of inference/generation.
   long inference_end_ms;
   // Keep a running total of the time spent in sampling.
-  long aggregate_sampling_time_ms;
+  long aggregate_sampling_time_ms = 0;
   // Token count from prompt
   int64_t num_prompt_tokens;
   // Token count from generated (total - prompt)
```

extension/llm/runner/stats.h

Lines changed: 1 addition & 1 deletion
```diff
@@ -44,7 +44,7 @@ struct ET_EXPERIMENTAL Stats {
   // inference_end_ms: End of inference/generation.
   long inference_end_ms;
   // Keep a running total of the time spent in sampling.
-  long aggregate_sampling_time_ms;
+  long aggregate_sampling_time_ms = 0;
   // Token count from prompt
   int64_t num_prompt_tokens;
   // Token count from generated (total - prompt)
```
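
With the fix in place, accumulation like the `stats_.aggregate_sampling_time_ms += ...` pattern quoted in the commit message starts from a well-defined zero. The sketch below illustrates that usage; it uses `std::chrono` and a hypothetical `sample_one_token()` as stand-ins for the runner's real timing utilities and sampler, which this page does not show.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Stand-in for the fixed struct: the initializer guarantees a zero start.
struct Stats {
  long aggregate_sampling_time_ms = 0;
};

// Hypothetical sampling step; the real runner would pick the next token.
static void sample_one_token() {
  std::this_thread::sleep_for(std::chrono::milliseconds(1));
}

int main() {
  Stats stats;
  for (int token = 0; token < 68; ++token) {
    auto start = std::chrono::steady_clock::now();
    sample_one_token();
    auto end = std::chrono::steady_clock::now();
    // Mirrors the accumulation pattern from the commit message, but with
    // std::chrono as the timing source.
    stats.aggregate_sampling_time_ms += static_cast<long>(
        std::chrono::duration_cast<std::chrono::milliseconds>(end - start)
            .count());
  }
  // Reports a realistic total (tens of milliseconds here), not ~97.5 days.
  std::printf("Sampling time over %d tokens: %f (seconds)\n", 68,
              stats.aggregate_sampling_time_ms / 1000.0);
  return 0;
}
```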
