Skip to content

Conversation

@petrosagg
Copy link
Contributor

@petrosagg petrosagg commented Apr 23, 2025

This PR improves the routines that convert f64 values into Decimal<N> values by avoiding allocating large strings. The performance comes from:

  • Serializing the float using the scientific notation, which uses at most 24 characters as opposed to ~800 characters of the only decimal notation.
  • Avoiding all roundtrips to the allocator by writing into a small stack allocated buffer

I have included a benchmark for the from_f64 method which shows a 4x improvement:

parse_decimal           time:   [658.76 ns 661.73 ns 665.05 ns]                           
                        change: [-78.751% -78.645% -78.534%] (p = 0.00 < 0.05)
                        Performance has improved.

@petrosagg petrosagg requested a review from antiguru April 23, 2025 18:31
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
@petrosagg petrosagg force-pushed the non-allocating-parsing branch from e81561c to 0d76ca8 Compare April 29, 2025 13:23
Copy link

@ParkMyCar ParkMyCar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

// * 3 bytes for the largest possible exponent (308)
// An example of such maximal float value is f64::from_bits(0x8008000000000000) whose
// decimal representation is '-1.1125369292536007e-308'
const MAX_LEN: usize = 24;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me! Just double checked and ryu also sizes their buffer with 24 bytes so these seems quite safe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see it validated elsewhere!


let mut buf = [0u8; MAX_LEN + 1];
let mut unwritten = &mut buf[..MAX_LEN];
write!(unwritten, "{:e}", n).unwrap();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the write! macro has a decent amount of machinery internally that it might be nice to bypass, not a huge priority though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked on slack, doesn't seem to be much we do here so leaving it as-is

// decimal representation is '-1.1125369292536007e-308'
const MAX_LEN: usize = 24;

let mut buf = [0u8; MAX_LEN + 1];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this +1 is to ensure the last byte is a null character for the C String? mind documenting that?

@petrosagg petrosagg force-pushed the non-allocating-parsing branch from 0d76ca8 to 132d7ae Compare April 29, 2025 14:24
Copy link
Member

@antiguru antiguru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The f32 conversion is not correct.

fn main() {
    println!("{}", 4.8f32 as f64);
    println!("{}", 4.8f64 as f64);
}

prints

4.800000190734863
4.8

@petrosagg petrosagg force-pushed the non-allocating-parsing branch from 132d7ae to b906875 Compare April 29, 2025 15:44
@petrosagg
Copy link
Contributor Author

@antiguru good catch! Pushed a change to fix this. I added a generic method for both float sizes that calls into the appropriate stringify method

Copy link
Member

@antiguru antiguru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the f32 issue!

self.from_float(n)
}

/// Converts an `f64` to a `Decimal<N>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description is now incorrect. Also, we should ensure that we only pass f32|f64 here, and not accidentally f128 in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I'll fix the description. I'm not super concerned about ensuring statically that T is either f32 or f64 since you can't cause UB by doing that but I will document it for the future engineer

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
@petrosagg petrosagg force-pushed the non-allocating-parsing branch from b906875 to 2af10a8 Compare April 29, 2025 18:12
@petrosagg petrosagg enabled auto-merge (rebase) April 29, 2025 18:12
@petrosagg petrosagg disabled auto-merge April 29, 2025 18:29
@petrosagg petrosagg enabled auto-merge (rebase) April 29, 2025 18:31
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
@petrosagg petrosagg force-pushed the non-allocating-parsing branch from 4d012b1 to c7db9da Compare April 30, 2025 08:22
@petrosagg petrosagg merged commit 37c0632 into master May 1, 2025
6 checks passed
@petrosagg petrosagg deleted the non-allocating-parsing branch May 1, 2025 11:27
petrosagg added a commit that referenced this pull request May 7, 2025
This PR fixes a regression introduced in #83 which started serializing
floats using their scientific notation, which is much shorter than the
decimal notation and therefore faster.

It turns out that conversion from string to decimal is syntactic
instead of semantic. This means that the strings "12" and "1.2E2", which
are semantically the same number, parse into a different Decimal
instance.

This PR fixes the regression by serializing to the decimal
representation.

The overall improvement from 0.4.9 is much more mild as a result.

```
parse_decimal           time:   [2.4036 µs 2.4180 µs 2.4389 µs]
                        change: [-22.670% -21.979% -21.336%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
petrosagg added a commit that referenced this pull request May 7, 2025
This PR fixes a regression introduced in #83 which started serializing
floats using their scientific notation, which is much shorter than the
decimal notation and therefore faster.

It turns out that conversion from string to decimal is syntactic
instead of semantic. This means that the strings "12" and "1.2E2", which
are semantically the same number, parse into a different Decimal
instance.

This PR fixes the regression by serializing to the decimal
representation.

The overall improvement from 0.4.9 is much more mild as a result.

```
parse_decimal           time:   [2.4036 µs 2.4180 µs 2.4389 µs]
                        change: [-22.670% -21.979% -21.336%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
petrosagg added a commit that referenced this pull request May 7, 2025
This PR fixes a regression introduced in #83 which started serializing
floats using their scientific notation, which is much shorter than the
decimal notation and therefore faster.

It turns out that conversion from string to decimal is syntactic
instead of semantic. This means that the strings "12" and "1.2E2", which
are semantically the same number, parse into a different Decimal
instance.

This PR fixes the regression by serializing to the decimal
representation.

The overall improvement from 0.4.9 is much more mild as a result.

```
parse_decimal           time:   [2.4036 µs 2.4180 µs 2.4389 µs]
                        change: [-22.670% -21.979% -21.336%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants