
Conversation


Copilot AI commented Nov 17, 2025

Addresses review feedback on #15610 regarding an incorrect alignment calculation in the scratch_for_quant_dequant buffer allocation.

Bug

The alignment mask -(kAlignment - 1) produces incorrect results:

  • For kAlignment = 32: mask is -31 = 0xFFFFFFE1 (bit 0 remains set)
  • Results in off-by-one alignment: align(0) = 1, align(32) = 33, align(64) = 65
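A minimal standalone sketch (not the ExecuTorch source; names here are illustrative) that reproduces these values, assuming kAlignment = 32 and 32-bit signed arithmetic:

#include <cstdint>
#include <cstdio>

int main() {
  constexpr int32_t kAlignment = 32;
  const int32_t sizes[] = {0, 32, 64};
  // Buggy mask: -(kAlignment - 1) = -31 = 0xFFFFFFE1, which leaves bit 0 set.
  // Output: align(0) = 1, align(32) = 33, align(64) = 65
  for (int32_t x : sizes) {
    int32_t aligned = (x + kAlignment - 1) & (-(kAlignment - 1));
    printf("align(%d) = %d\n", x, aligned);
  }
  return 0;
}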

Fix

Changed mask to -kAlignment and added inline test case documentation:

// Before
size_per_thread_qdq_vec = (size_per_thread_qdq_vec + kAlignment - 1) & (-(kAlignment - 1));

// After  
size_per_thread_qdq_vec = (size_per_thread_qdq_vec + kAlignment - 1) & (-kAlignment);

For kAlignment = 32, the mask is now -32 = 0xFFFFFFE0, which properly zeroes the lower 5 bits and aligns sizes to 32-byte boundaries.
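As a quick sanity check (a sketch, not part of the PR diff; align_up is a hypothetical helper), the corrected formula can be verified at compile time:

#include <cstdint>

constexpr int32_t kAlignment = 32;

// Round x up to the next multiple of kAlignment using the corrected mask.
constexpr int32_t align_up(int32_t x) {
  return (x + kAlignment - 1) & (-kAlignment);
}

static_assert(align_up(0) == 0, "already aligned");
static_assert(align_up(1) == 32, "rounds up");
static_assert(align_up(32) == 32, "already aligned");
static_assert(align_up(64) == 64, "already aligned");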

Documentation

Added minimal test case in code comments demonstrating:

  • Correct formula: -kAlignment = -32 = 0xFFFFFFE0 (zeros bits 0-4)
  • Wrong formula: -(kAlignment-1) = -31 = 0xFFFFFFE1 (bit 0 still set)
  • Concrete examples showing the off-by-one bug with the wrong mask

The inline comments provide clear examples with hex values illustrating why the incorrect mask fails to properly align values.




pytorch-bot bot commented Nov 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15852

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 118 Pending

As of commit 37bfa7c with merge base 7600df8:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.


meta-cla bot commented Nov 17, 2025

Hi @Copilot!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Copilot AI changed the title [WIP] Address feedback on allocation overhead in quantized sdpa PR Fix incorrect alignment mask in quantized SDPA allocation Nov 17, 2025
Copilot AI requested a review from mergennachin November 17, 2025 16:50
Copilot finished work on behalf of mergennachin November 17, 2025 16:50
@mergennachin
Contributor

@copilot Create a test case in the comments

@mergennachin
Contributor

@copilot create a minimal test case in the comments


Copilot AI commented Nov 17, 2025

> @copilot Create a test case in the comments

Added a minimal test case in the code comments at lines 782-789 demonstrating:

  • Correct formula: (x + 32 - 1) & 0xFFFFFFE0 properly aligns to 32-byte boundaries
  • Wrong formula: (x + 32 - 1) & 0xFFFFFFE1 causes off-by-one errors (e.g., 32→33 instead of 32→32)

The test case shows concrete examples with hex values to illustrate why the wrong mask fails.

Fixed in commit 1411727.

> @copilot create a minimal test case in the comments

Added a minimal test case in the code comments at lines 782-789. The inline test case shows:

  • -kAlignment = -32 = 0xFFFFFFE0 (correct - zeros bits 0-4)
  • -(kAlignment-1) = -31 = 0xFFFFFFE1 (wrong - bit 0 remains set)

With concrete examples demonstrating the off-by-one bug.

Fixed in commit 1411727.

Copilot finished work on behalf of mergennachin November 17, 2025 16:56
@mergennachin
Contributor

@kimishpatel Looks like AI is correct :)
