Fix some minor details in comments for AVX2 decompose #762

jammychiou1 · 2025-12-01T03:31:41Z

The floor() in floor((f + 127) >> 7) was somewhat unecessary as the usual semantic for the right-shift operator (>>) has integer output anyway. Seeing as the right-shift operator is not used in other explanation comments, we decided to rewrite it as division by 2^7 for better consistency.
The bound of f1'' is correct but the proof was misleading. The new proof should be clearer.

jammychiou1 · 2025-12-01T03:40:26Z

I did consider addressing #654 also in this PR, but I haven't come up with a great idea for that yet...

- The floor() in floor((f + 127) >> 7) was somewhat unecessary as the usual semantic for the right-shift operator (>>) has integer output anyway. Seeing as the right-shift operator is not used in other explanation comments, we decided to rewrite it as division by 2^7 for better consistency. - The bound of f1'' is correct but the proof was misleading. The new proof should be clearer. Signed-off-by: jammychiou1 <jammy.chiou1@gmail.com>

Only the AVX2 one is updated currently. The NEON one will also be updated if this looks good. Signed-off-by: jammychiou1 <jammy.chiou1@gmail.com>

jammychiou1 · 2025-12-01T07:32:42Z

dev/x86_64/src/poly_decompose_32_avx2.c

     * Compute f1 = round-(f1' / B) ≈ round(f1' * 1025 / 2^22). This is exact
     * for 0 <= f1' < 2^16. Note that half is rounded down since 1025 / 2^22 ≲
-     * 1 / 4092.
+     * 1 / 4092, so (for example) f1' = B / 2 = 2046 is mapped to
+     *
+     *   round(2046 * 1025 / 2^22) = round(2046 * (1 / 4092 - epsilon))
+     *                             = round(1 / 2 - epsilon') = 0,
+     *
+     * where epsilon = 1 / 4092 - 1025 / 2^22 and epsilon' = 2046 * eps are both
+     * tiny but positive numbers.


@hanno-becker May I have your review on this explanation? If this looks good, I plan to also apply this to the aarch64 implementation and resolve #654. Thank you!

@jammychiou1 Apologies for the slow reply.

I think if we want to fix #654 we should provide a general explanation.

Here's an attempt:

The approximation error for `f1' / B ≈ f1' * 1025 / 2^22` is `f1' * (1025/2^22 - 1/B)`. For `eps := 1025/2^22 - 1/B` we have `eps = 1/4290772992 ~ 2^{-31.99} < 2^{-31}`. Hence `|f1'| * eps < 2^{-15}`. Given `B = 4092 ~ 2^12`, we thus have `|f1'| * eps < 1/B`. On the other hand, `1/B` is the spacing between the integral multiples of `1/B`, which includes all rounding boundaries `n + 1/2` (since `B` is even). Hence, if `f1' / B` is not of the form `n + 1/2`, then moving from `f1' / B` to `f1' * 1025 / 2^22` does not cross a rounding boundary, and hence `round(f1' / B) = round(f1' * 1025 / 2^22)`, and it doesn't matter on either side which version of rounding one uses. If `f1' / B` _is_ of the form `n + 1/2`, then `f1' * 1025 / 2^22` is slightly below it (and _not_ a multiple of `n + 1/2`), hence `round-(f1' / B) = round(f1' * 1025 / 2^22)`; where the round-down on the LHS is essential, and on the RHS the type of rounding again does not matter.

jammychiou1 · 2025-12-01T07:33:05Z

dev/x86_64/src/poly_decompose_88_avx2.c

-     * 1 / 1488.
+     * 1 / 1488, so (for example) f1' = B / 2 = 744 is mapped to
+     *
+     *   round(744 * 11275 / 2^24) = round(744 * (1 / 1488 - epsilon))
+     *                             = round(1 / 2 - epsilon') = 0,
+     *
+     * where epsilon = 1 / 1488 - 11275 / 2^24 and epsilon' = 744 * eps are both
+     * tiny but positive numbers.


jammychiou1 force-pushed the decompose-explanation-fix branch from 0b47c08 to 9931203 Compare December 1, 2025 03:34

jammychiou1 added 2 commits December 1, 2025 15:21

[DRAFT] Better explanation for round() vs round-() in decompose

326d731

Only the AVX2 one is updated currently. The NEON one will also be updated if this looks good. Signed-off-by: jammychiou1 <jammy.chiou1@gmail.com>

jammychiou1 force-pushed the decompose-explanation-fix branch from 0be3a84 to 326d731 Compare December 1, 2025 07:21

jammychiou1 commented Dec 1, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix some minor details in comments for AVX2 decompose #762

Fix some minor details in comments for AVX2 decompose #762

Uh oh!

jammychiou1 commented Dec 1, 2025

Uh oh!

jammychiou1 commented Dec 1, 2025

Uh oh!

jammychiou1 Dec 1, 2025

Uh oh!

hanno-becker Dec 4, 2025

Uh oh!

jammychiou1 Dec 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fix some minor details in comments for AVX2 decompose #762

Are you sure you want to change the base?

Fix some minor details in comments for AVX2 decompose #762

Uh oh!

Conversation

jammychiou1 commented Dec 1, 2025

Uh oh!

jammychiou1 commented Dec 1, 2025

Uh oh!

jammychiou1 Dec 1, 2025

Choose a reason for hiding this comment

Uh oh!

hanno-becker Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

jammychiou1 Dec 1, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants