Skip to content

Conversation

@naoNao89
Copy link
Contributor

Adds GNU-compatible thousands separator formatting to ls, du, and df. Use a leading quote in --block-size to get readable output: --block-size="'1" shows 1,024,000 instead of 1024000.

Respects LC_NUMERIC locale (comma for en_US, period for European locales, none for C/POSIX). Works with environment variables too (LS_BLOCK_SIZE, DU_BLOCK_SIZE, DF_BLOCK_SIZE, etc.).

Core changes in uucore (parse_size.rs, human.rs) handle the parsing and formatting. Each utility integrates it with minimal changes. 12 new integration tests, all existing tests pass.

Closes #9084

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch 3 times, most recently from 3ce60cf to 48f4bc1 Compare October 30, 2025 12:40
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch 2 times, most recently from a8966af to ae09fec Compare October 30, 2025 14:15
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 closed this by deleting the head repository Nov 6, 2025
@naoNao89 naoNao89 reopened this Nov 7, 2025
@github-actions
Copy link

github-actions bot commented Nov 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

@sylvestre
Copy link
Contributor

sylvestre commented Dec 26, 2025

could you please split this per program in different pr ? thanks
it is harder to review otherwise

/// - `','` for other locales (default, en_US style)
fn get_thousands_separator() -> char {
// Try to read LC_NUMERIC or LANG environment variable
if let Ok(locale) = std::env::var("LC_NUMERIC")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Environment variable reads on every call is inefficient - consider caching the locale information


// Simple heuristic: European locales use period, others use comma
// This covers common cases like de_DE, fr_FR, it_IT, es_ES, nl_NL, etc.
if locale.starts_with("de_")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it?
i don't like this hardcoded list. isn't something in icu to do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oki, replace with ICU's FixedDecimalFormatter

@github-actions
Copy link

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Dec 26, 2025
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Dec 26, 2025
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 8b92113 to fb15063 Compare December 26, 2025 02:35
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Dec 26, 2025
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from fb15063 to 13225ab Compare December 26, 2025 02:35
@github-actions
Copy link

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

@codspeed-hq
Copy link

codspeed-hq bot commented Dec 26, 2025

CodSpeed Performance Report

Merging #9090 will not alter performance

Comparing naoNao89:feature/thousands-separator-df (b854560) with main (2916d2b)

Summary

✅ 136 untouched
⏩ 6 skipped1

Footnotes

  1. 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions
Copy link

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 8aa4f8c to c6cb13e Compare December 26, 2025 10:39
@github-actions
Copy link

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

1 similar comment
@github-actions
Copy link

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 29d396b to e85ae66 Compare December 26, 2025 11:40
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from e85ae66 to 714a5a2 Compare December 26, 2025 15:25
@github-actions
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 714a5a2 to 3be82c4 Compare December 26, 2025 18:36
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from faf2e81 to eae9883 Compare December 27, 2025 11:18
@github-actions
Copy link

GNU testsuite comparison:

Note: The gnu test tests/csplit/csplit-heap is now being skipped but was previously passing.
Note: The gnu test tests/cut/cut-huge-range is now being skipped but was previously passing.
Note: The gnu test tests/printf/printf-surprise is now being skipped but was previously passing.

@naoNao89 naoNao89 changed the title feat(ls,du,df): add thousands separator support feat(df): add thousands separator support Dec 27, 2025
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from eae9883 to c46965b Compare December 27, 2025 19:57
@github-actions
Copy link

GNU testsuite comparison:

GNU test failed: tests/cp/cp-mv-enotsup-xattr. tests/cp/cp-mv-enotsup-xattr is passing on 'main'. Maybe you have to rebase?
Note: The gnu test tests/csplit/csplit-io-err was skipped on 'main' but is now failing.

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch 2 times, most recently from 6be679f to bb7d375 Compare December 27, 2025 20:33
@github-actions
Copy link

GNU testsuite comparison:

GNU test failed: tests/sort/sort-stale-thread-mem. tests/sort/sort-stale-thread-mem is passing on 'main'. Maybe you have to rebase?

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from bb7d375 to a9b8ee7 Compare December 28, 2025 13:09
@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/tail/assert is no longer failing!

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from a9b8ee7 to e7966cf Compare December 28, 2025 19:43
@github-actions
Copy link

GNU testsuite comparison:

GNU test failed: tests/sort/sort-stale-thread-mem. tests/sort/sort-stale-thread-mem is passing on 'main'. Maybe you have to rebase?

@naoNao89
Copy link
Contributor Author

i found that locale_decimal_separator is defined but NOT used anywhere in the current main branch

- Change get_decimal_separator() to accept &Locale instead of Locale
- Eliminates problematic .clone() that caused stale thread stack memory
- Fixes valgrind failures in tests/sort/sort-stale-thread-mem test
- More efficient (avoids unnecessary clone operation)
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from e7966cf to ecfd22f Compare December 28, 2025 23:54
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 marked this pull request as ready for review December 29, 2025 16:05
alexandrefresnais and others added 9 commits December 31, 2025 00:30
build-gnu.sh: Use MULTICALL=y and skip not used utils for faster build
* CICD.yml: Avoid no space left much more

* Remove both of android and dotnet

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>

---------

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
…s#9136)

* feat(dd): add comprehensive benchmark suite for O_DIRECT optimization

- Create dd's first benchmark suite using divan framework
- Benchmark various block sizes (4K, 8K, 64K, 1M) to measure performance
- Test different dd scenarios: default, partial copy, skip, seek operations
- Measure impact of separate input/output block sizes
- All benchmarks use status=none to avoid output noise
- Benchmarks verify the O_DIRECT buffer alignment optimization
- Follows existing uutils benchmark patterns and conventions

* bench(dd): increase dataset sizes for consistent timing

Increase benchmark dataset sizes to achieve consistent 100-300ms timing:
- dd_copy_default: 16 -> 32 MB
- dd_copy_4k_blocks: 16 -> 24 MB
- dd_copy_64k_blocks: 16 -> 64 MB
- dd_copy_1m_blocks: 16 -> 128 MB
- dd_copy_separate_blocks: 16 -> 48 MB
- dd_copy_partial: 16 -> 32 MB
- dd_copy_with_skip: 16 -> 48 MB
- dd_copy_with_seek: 16 -> 48 MB
- dd_copy_8k_blocks: 16 -> 32 MB

This ensures stable, repeatable benchmark measurements across different systems.

---------

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
No need to use the unsafe `libc::execvp()`, the standard rust library provides
the functionality via the safe function `Command::exec()`.

Signed-off-by: Etienne Cordonnier <ecordonnier@snap.com>
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
…tils#9884)

* du: fix -l/--count-links option not counting hardlinks separately

* du: add test for -l/--count-links counting hardlinks separately

---------

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
…nt dirs (uutils#9803)

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
…gile configurations

Addresses PR uutils#9653 review feedback about 'fragile commands' by implementing
fail-fast validation in Parser builder pattern.

Changes:
- Add ParserBuilderError enum with 8 validation error variants
- Refactor builder methods to return Result<&mut Self, ParserBuilderError>
- Implement comprehensive unit validation (57 valid units including k/m/g/t)
- Add cross-validation between builder settings (default_unit vs allow_list)
- Detect conflicts (b_byte_count with 'b' unit, '%' with size units)
- Update all call sites (sort, du, df, od) to handle new error types
- Fix invalid '%' unit in sort's parse_byte_count allow_list

Benefits:
- Configuration errors detected immediately (not during parsing)
- Clear error messages listing invalid/conflicting settings
- Maintains backward compatibility through explicit error reporting

Files modified:
- src/uucore/src/lib/features/parser/parse_size.rs (core implementation)
- src/uu/sort/src/sort.rs (error handling + fix invalid '%')
- src/uu/du/src/du.rs (error handling)
- src/uu/df/src/df.rs (error handling)
- src/uu/od/src/od.rs (error handling)
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 9216589 to b854560 Compare December 31, 2025 00:33
@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/tty/tty-eof is no longer failing!

@anastygnome
Copy link
Contributor

@naoNao89 there seems to be changes unrelated to the PR, (dd, install), is that a rebase issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ls (and du) block size specification with thousands separator fails

9 participants