Skip to content

Conversation

@morrisonlevi
Copy link
Collaborator

Description

Gathers time samples during allocation samples if interrupt_count > 0. This has two benefits:

  1. A slight performance improvement in latency, CPU, and memory.
  2. A slight accuracy improvement.

When I say a slight performance benefit, here's a comparison to v1.15.2:

library_version latency_p99 cpu% memory
1.15.2 115.45 ms 132.9% 122.32 MB
local 114.88 ms 131.5% 122.27 MB

The accuracy benefit comes from the fact that when an interrupt is set, on master it won't be handled until the next VM interrupt. If we're gathering a memory sample and that count is non-zero, that means that we're gathering it sooner than it would have otherwise been handled, so less VM code can change. Notably, this should help with frameless functions (added in PHP 8.4) which allocate.

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

When allocation profiler takes a sample and interrupt_count > 0,
collect both allocation and time samples in a single stack walk.

Benefits:
- Eliminates redundant stack walks when samples coincide.
- Improves time sample accuracy (closer to timer expiry).
Let's do this in a dedicated PR.
@morrisonlevi morrisonlevi requested a review from a team as a code owner January 7, 2026 19:00
@github-actions github-actions bot added profiling Relates to the Continuous Profiler tracing labels Jan 7, 2026
@codecov-commenter
Copy link

codecov-commenter commented Jan 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.82%. Comparing base (4a26042) to head (a3e3e4e).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3559      +/-   ##
==========================================
- Coverage   61.90%   61.82%   -0.08%     
==========================================
  Files         140      140              
  Lines       13281    13281              
  Branches     1758     1758              
==========================================
- Hits         8221     8211      -10     
- Misses       4272     4279       +7     
- Partials      788      791       +3     

see 3 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4a26042...a3e3e4e. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@pr-commenter
Copy link

pr-commenter bot commented Jan 7, 2026

Benchmarks [ profiler ]

Benchmark execution time: 2026-01-14 05:26:20

Comparing candidate commit a3e3e4e in PR branch levi/perf-time-alloc-piggyback with baseline commit 4a26042 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 29 metrics, 7 unstable metrics.


function allocate_large() {
// Large allocation to trigger sampling
$data = str_repeat("a", 1024 * 200_000); // ~200MB
Copy link
Member

@realFlowControl realFlowControl Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str_repeat() is not a frameless function, substr() or str_replace() are frameless function for example and do allocate memory.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

@morrisonlevi morrisonlevi requested a review from a team as a code owner January 7, 2026 22:27
Copy link
Member

@realFlowControl realFlowControl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created an upstream PR to fix the empty/malformed expected profile JSON problem that this PR showed. Can you fix the tests, especially the allocation_time_combined.json file?

Comment on lines 16 to 17
$s = str_repeat("x", 10_000_000);
strlen($s); // Use the string to prevent optimization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still uses str_repeat() which is not a frameless function, strlen() is, but it does not allocate and as such will not trigger a combined sample.
Can we add a text, maybe some Lorem Ipsum (or more nerdy https://future-lives.com/wp-content/uploads/2014/09/TheSentinel.pdf) and replace random words in that text? I'd think using an array as the needle to replace should take time and allocate enough to trigger sampling. Setting DD_PROFILING_ALLOCATION_SAMPLING_DISTANCE=1 should also help taking more allocation samples and increase the probability of triggering a combined sample.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire JSON has nothing to do with how the JSON's look that the https://github.com/DataDog/prof-correctness action accepts. As such prof-correctness seems to not match anything against the pprof which means it does not fail and as such succeeds (which is something we should fix in the upstream action, it should fail if the JSON is in an invalid structure).

Have a look at the other examples, like https://github.com/DataDog/dd-trace-php/blob/4a3a98716843640b4a8d6894da8ef0072500c729/profiling/tests/correctness/timeline.json how the JSON is supposed to look.

Copy link
Collaborator Author

@morrisonlevi morrisonlevi Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've sort of fixed it. How can I make this only run for PHP 8.4+? Because the % will be determined by whether it's frameless or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can I make this only run for PHP 8.4+?

Not in this file, but in the workflow run itself (the .github/workflows/prof_correctness.yml), have a look how it is done for ZTS tests. You should be able to apply the same pattern for PHP 8.4+

But this brings me to a wider topic (one we should probably move to another discussion): str_repeat() as a frameless function should have less CPU time on PHP 8.4+ because of it being a frameless function, so depending on how big this effect is, we could "burry" this (but document it) in the error_margin or create two JSON files, one to match for PHP < 8.4 and one >= 8.4 with different percent values.

Also: we should lower the error_margin from 30.

export DD_PROFILING_ENDPOINT_COLLECTION_ENABLED=no
export EXECUTION_TIME=5
php -d extension=$PWD/target/profiler-release/libdatadog_php_profiling.so profiling/tests/correctness/allocation_time_combined.php
unset DD_PROFILING_ALLOCATION_ENABLED DD_PROFILING_WALL_TIME_ENABLED DD_PROFILING_EXPERIMENTAL_CPU_TIME_ENABLED DD_PROFILING_ENDPOINT_COLLECTION_ENABLED EXECUTION_TIME
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unset DD_PROFILING_ALLOCATION_ENABLED DD_PROFILING_WALL_TIME_ENABLED DD_PROFILING_EXPERIMENTAL_CPU_TIME_ENABLED DD_PROFILING_ENDPOINT_COLLECTION_ENABLED EXECUTION_TIME
unset DD_PROFILING_ALLOCATION_ENABLED DD_PROFILING_WALL_TIME_ENABLED DD_PROFILING_EXPERIMENTAL_CPU_TIME_ENABLED DD_PROFILING_ENDPOINT_COLLECTION_ENABLED EXECUTION_TIME DD_PROFILING_ALLOCATION_SAMPLING_DISTANCE

Just if you accept the change above

@realFlowControl
Copy link
Member

I've created an upstream PR to fix the empty/malformed expected profile JSON [...]

This got merged and I re-run the correctness tests, which are now failing.

{
"regular_expression": "<?php;main;standard\\|str_replace$",
"percent": 1.0,
"error_margin": 0.9
Copy link
Member

@realFlowControl realFlowControl Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An error margin of 0.9 is interestingly specific ... and from the prof correctness test ...

    pprof_analysis.go:667: Error opening file /github/workspace/profiling/tests/correctness/allocation_time_combined.json: JSON schema validation failed for /github/workspace/profiling/tests/correctness/allocation_time_combined.json:
          - stacks.2.stack-content.1.error_margin: Invalid type. Expected: integer, given: number

... not allowed, this needs to be an integer, not a float.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because I want a non-zero value, and the true rate is around 0.5-1.0%. Setting a percent of 1 with margin of 1 would allow for 0, which is NOT good.

Comment on lines 38 to 42
{
"regular_expression": "<?php;main$",
"percent": 80,
"error_margin": 30
},
Copy link
Member

@realFlowControl realFlowControl Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we are not interested in the main function itself.

Suggested change
{
"regular_expression": "<?php;main$",
"percent": 80,
"error_margin": 30
},

I think we could add the str_repeat() function though.

Comment on lines 8 to 17
{
"regular_expression": "<?php;main;standard\\|str_replace$",
"percent": 40,
"error_margin": 30
},
{
"regular_expression": "<?php;main;standard\\|str_repeat$",
"percent": 40,
"error_margin": 30
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you choose those percent values?
Just asking, because I guess that str_replace() and str_repeat() are the only (let's say at least 99%) functions that do allocations in that test script, so shouldn't it be safe to do:

Suggested change
{
"regular_expression": "<?php;main;standard\\|str_replace$",
"percent": 40,
"error_margin": 30
},
{
"regular_expression": "<?php;main;standard\\|str_repeat$",
"percent": 40,
"error_margin": 30
}
{
"regular_expression": "<?php;main;standard\\|str_replace$",
"percent": 50,
"error_margin": 1
},
{
"regular_expression": "<?php;main;standard\\|str_repeat$",
"percent": 50,
"error_margin": 1
}

The same might be true for alloc-samples bellow.

},
{
"regular_expression": "<?php;main;standard\\|str_replace$",
"percent": 1.0,
Copy link
Member

@realFlowControl realFlowControl Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an integer, not a float. I suspect it will fail once the error_margin line below is fixed. This is just not failing in the JSON validation, as the percent value (and others) are validated in Go code after the JSON schema validation succeeded.

The value of 1 might be too little as well.

@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented Jan 12, 2026

⚠️ Tests

Fix all issues with Cursor

⚠️ Warnings

🧪 2063 Tests failed

    testSearchPhpBinaries from integration.DDTrace\Tests\Integration\PHPInstallerTest (Fix with Cursor)

    testSimplePushAndProcess from laravel-58-test.DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest (Fix with Cursor)

testSimplePushAndProcess from laravel-8x-test.DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest (Datadog) (Fix with Cursor)
DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 696727fc000000003b9688b247ea41f3
tid: 696727fc00000000
hexProcessTraceId: 3b9688b247ea41f3
hexProcessSpanId: de34d484bc391137
processTraceId: 4293769594036437491
processSpanId: 16011656241767584055
View all

ℹ️ Info

❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: a3e3e4e | Docs | Datadog PR Page | Was this helpful? Give us feedback!

@morrisonlevi morrisonlevi force-pushed the levi/perf-time-alloc-piggyback branch from 03f3c8e to f52cf55 Compare January 13, 2026 23:28
@morrisonlevi morrisonlevi force-pushed the levi/perf-time-alloc-piggyback branch from 978ef91 to fc62464 Compare January 14, 2026 00:05
@morrisonlevi morrisonlevi force-pushed the levi/perf-time-alloc-piggyback branch from fc62464 to ee910ea Compare January 14, 2026 00:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

profiling Relates to the Continuous Profiler tracing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants