@JaneIllario
Hi, thanks for the review! 😄

In #9948, the unexpand utility diverges from GNU behavior on multi-byte characters. This fix replaces UnicodeWidthChar with nbytes so that column counting stays compatible with GNU unexpand. I also added an integration test, based on the busybox test and the issue description, to make sure we don't regress to the old behavior.
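To illustrate why the two counting rules disagree, here is a minimal sketch using only the standard library. The helper names `column_by_bytes` and `column_by_chars` are hypothetical and are not taken from the patch. The sketch assumes that "nbytes" means the UTF-8 encoded length of each character, and it compares that against a plain character count rather than the actual display width, which would need the `unicode-width` crate.

```rust
/// Hypothetical helper: column reached after `s` when every character
/// advances the column by its UTF-8 byte length (the byte-based rule).
fn column_by_bytes(s: &str) -> usize {
    s.chars().map(char::len_utf8).sum()
}

/// Hypothetical helper: column reached after `s` when every character
/// advances the column by one (shown for comparison only).
fn column_by_chars(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // "\u{e9}" is U+00E9 (2 bytes in UTF-8); "\u{65e5}\u{672c}" is two
    // CJK characters (3 bytes each in UTF-8).
    for s in ["abc", "\u{e9}", "\u{65e5}\u{672c}"] {
        println!(
            "{s:?}: by bytes = {}, by chars = {}",
            column_by_bytes(s),
            column_by_chars(s)
        );
    }
}
```

For ASCII input the two rules agree, so the difference only shows up on lines containing multi-byte characters, where tab-stop positions shift depending on which rule is used.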

@codspeed-hq
codspeed-hq bot commented Dec 31, 2025

CodSpeed Performance Report

Merging #9949 will improve performance by 6.85%

Comparing JaneIllario:unexpand-unicode (2ded1fe) with main (fd68328) [1]

Summary

⚡ 2 improvements
✅ 134 untouched
⏩ 15 skipped [2]

Benchmarks breakdown

Benchmark                      BASE       HEAD       Efficiency
unexpand_many_lines[100000]    269.6 ms   252.3 ms   +6.85%
unexpand_large_file[10]        565.2 ms   529 ms     +6.85%

Footnotes

  1. No successful run was found on main (c8c412c) during the generation of this report, so fd68328 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 15 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Development

Successfully merging this pull request may close these issues.

unexpand: -a uses Unicode display width instead of byte count for multibyte characters
