-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
perf (tsort) : avoid reading the whole input into memory and intern strings #9872
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
CodSpeed Performance ReportMerging #9872 will improve performance by 10.61%Comparing Summary
Benchmarks breakdown
Footnotes
|
|
GNU testsuite comparison: |
81ece9b to
0a56fe6
Compare
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
ddb2361 to
b98950c
Compare
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
any idea why the perf regressed here? |
Cargo.toml
Outdated
| uu_base32 = { version = "0.5.0", path = "src/uu/base32" } | ||
| uutests = { version = "0.5.0", package = "uutests", path = "tests/uutests" } | ||
|
|
||
| string-interner = "0.19.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't the right place to declare it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, where is that?
Cargo.toml
Outdated
| whoami = { optional = true, version = "0.5.0", package = "uu_whoami", path = "src/uu/whoami" } | ||
| yes = { optional = true, version = "0.5.0", package = "uu_yes", path = "src/uu/yes" } | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove this change
|
Oh, @sylvestre , I just added a few changes :) |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
@sylvestre I believe the regression noticed are due to cache misses because the metadata that goes with the token is just slightly bigger than usize. Here's another benchmark that maybe of interest: graph.tsort, 1 Gb random graph
Massif output:
Gnu tsort:
MB (Memory)
248.6^ #
| ::::::::::@::#
| ::::::::::::::::::: ::: ::::@: #
| :::::@:::: ::::: : ::::: ::: ::: ::::@: #
| @@:: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| ::::@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| @@::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| ::::::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| :::::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| :::: :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| :::: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| :::: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| : :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| :: :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| :: :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| @@:: :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
| @ :: :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
0 +----------------------------------------------------------------------->Gi (Giga Instructions)
0 23.47
Number of snapshots: 56
Detailed snapshots: [1, 2, 16, 21, 26, 52, 55 (peak)]
uutils tsort
MB (Memory)
200.0^ #
| @::::@:::::::::::@::::@::::@::::@:::#:
| :::::::::::@::::::@::::@::: :::::: @::::@::::@::::@:::#:
| ::::::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:
| :::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#::
| ::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#::
| @:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| @::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| :@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| ::@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| ::@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
| ::@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
0 +----------------------------------------------------------------------->Gi (Giga Instructions)
0 35.83
Number of snapshots: 98
Detailed snapshots: [3, 6, 24, 31, 37, 50, 51, 52, 62, 72, 82, 91 (peak)]
Hyperfine :
Benchmark 1: target/release/coreutils tsort /home/bench_vm/graph.tsort
Time (mean ± σ): 6.044 s ± 0.071 s [User: 5.602 s, System: 0.394 s]
Range (min … max): 5.974 s … 6.168 s 10 runs
Benchmark 2: tsort /home/bench_vm/graph.tsort
Time (mean ± σ): 7.115 s ± 0.147 s [User: 6.738 s, System: 0.328 s]
Range (min … max): 6.910 s … 7.390 s 10 runs
`
Summary: 'target/release/coreutils tsort /home/bench_vm/graph.tsort'
ran 1.18 ± 0.03 times faster than 'tsort /home/bench_vm/graph.tsort'
` |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
306c01c to
8439c5a
Compare
|
GNU testsuite comparison: |
1 similar comment
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
da98aae to
380e2aa
Compare
|
GNU testsuite comparison: |
This PR is a WIP to get more performance out of tsort.
Move to usize is complete. Next step is to move to vec instead of hashmap