Initial benchmark command #207
base: main
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
updated output

uvx \
  --from git+https://github.com/huggingface/kernels.git@initial-benchmark-command \
  --with torch \
  --with numpy \
  kernels benchmark kernels-community/activation # <- the expected command once merged

output

Updated https://github.com/huggingface/kernels.git (a78fc45da9c22ac06411fb282570a8d7123fb412)
Built kernels @ git+https://github.com/huggingface/kernels.git@a78fc45da9c22ac06411fb282570a8d7123fb412
Installed 45 packages in 73ms
Downloading kernels-community/activation@main...
Running benchmark.py...
┌────────────────┬──────────┬───────┬─────────┬────────────┬────────────┬────────────┬────────────┬────────────┬──────────┬────────────┬───────┐
│ Benchmark      │ Workload │ N     │ Speedup │ Mean(ms)   │ Std(ms)    │ Min(ms)    │ Max(ms)    │ IQR(ms)    │ Outliers │ Ref(ms)    │ Match │
├────────────────┼──────────┼───────┼─────────┼────────────┼────────────┼────────────┼────────────┼────────────┼──────────┼────────────┼───────┤
│ SiluWorkloads  │ medium   │ 100   │ 6.93x   │ 0.0109     │ 0.0002     │ 0.0107     │ 0.0123     │ 0.0002     │ 4        │ 0.0755     │ ✓     │
│ SiluWorkloads  │ small    │ 100   │ 10.62x  │ 0.0053     │ 0.0011     │ 0.0050     │ 0.0159     │ 0.0002     │ 6        │ 0.0563     │ ✓     │
│ SiluWorkloads2 │ medium   │ 100   │         │ 0.0152     │ 0.0018     │ 0.0138     │ 0.0257     │ 0.0004     │ 14       │            │ ·     │
│ SiluWorkloads2 │ small    │ 100   │ 10.60x  │ 0.0103     │ 0.0006     │ 0.0095     │ 0.0136     │ 0.0004     │ 5        │ 0.1092     │ ✓     │
└────────────────┴──────────┴───────┴─────────┴────────────┴────────────┴────────────┴────────────┴────────────┴──────────┴────────────┴───────┘
medium: 6.93x faster (95% CI: 0.0109-0.0109ms vs ref 0.0755ms) ✓ significant
small: 10.62x faster (95% CI: 0.0051-0.0055ms vs ref 0.0563ms) ✓ significant
small: 10.60x faster (95% CI: 0.0102-0.0104ms vs ref 0.1092ms) ✓ significant
Kernel: 6ff8e1a Benchmark: 9b68fca
Dry run - use --upload to submit results
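
For readers who want to sanity-check the columns above, here is a minimal sketch of how such a summary could be computed from raw per-iteration timings. It is not taken from this branch: the function name `summarize` and the dictionary keys are hypothetical, the outlier rule (Tukey's 1.5 x IQR fences) is an assumption, and the 95% interval uses a normal approximation (mean ± 1.96·std/√N). The printed numbers are consistent with that interval and with Speedup = Ref(ms) / Mean(ms), for example 0.0755 / 0.0109 ≈ 6.93, but the PR does not state the exact method.

```python
import math
import statistics


def summarize(samples_ms, ref_ms=None):
    """Summarize per-iteration timings given in milliseconds.

    Hypothetical helper for illustration; not the kernels CLI implementation.
    """
    n = len(samples_ms)
    mean = statistics.fmean(samples_ms)
    std = statistics.stdev(samples_ms) if n > 1 else 0.0

    # Quartiles and interquartile range (needs at least two samples).
    q1, _, q3 = statistics.quantiles(samples_ms, n=4)
    iqr = q3 - q1

    # Assumption: count outliers with Tukey's 1.5 * IQR fences.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = sum(1 for x in samples_ms if x < low or x > high)

    # Assumption: normal-approximation 95% confidence interval of the mean.
    half_width = 1.96 * std / math.sqrt(n)

    summary = {
        "n": n,
        "mean_ms": mean,
        "std_ms": std,
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
        "iqr_ms": iqr,
        "outliers": outliers,
        "ci95_ms": (mean - half_width, mean + half_width),
    }
    if ref_ms is not None:
        # Speedup of the kernel relative to the reference implementation.
        summary["speedup"] = ref_ms / mean
    return summary
```

Calling `summarize(timings, ref_ms=reference_mean)` on a list of 100 timings would yield one row's worth of values; rows without a reference (such as the `SiluWorkloads2` medium row) would simply omit the speedup.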
The latest changes add a benchmark class for attention, which can be used with:

uv run kernels benchmark kernels-community/flash-attn2
file: https://huggingface.co/kernels-community/flash-attn2/blob/main/benchmark.py

uv run kernels benchmark kernels-community/flash-attn3
file: https://huggingface.co/kernels-community/flash-attn3/blob/main/benchmark.py

uv run kernels benchmark kernels-community/vllm-flash-attn3
file: https://huggingface.co/kernels-community/vllm-flash-attn3/blob/main/benchmark.py

activation also has a benchmark:

uv run kernels benchmark kernels-community/activation
file: https://huggingface.co/kernels-community/activation/blob/main/benchmark.py
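
To run all four benchmarks in one go, something like the following sketch could wrap the commands listed in this comment. The helper names `build_command` and `run_all` are hypothetical and only shell out to the `uv run kernels benchmark <repo>` invocation shown in this thread; nothing here is part of the kernels package.

```python
import subprocess

# Repositories named in this thread.
REPOS = [
    "kernels-community/activation",
    "kernels-community/flash-attn2",
    "kernels-community/flash-attn3",
    "kernels-community/vllm-flash-attn3",
]


def build_command(repo_id):
    """Return the argv list for benchmarking one kernel repo."""
    return ["uv", "run", "kernels", "benchmark", repo_id]


def run_all(repos=REPOS, runner=subprocess.run):
    """Run each benchmark in turn and map repo id -> process exit code.

    `runner` is injectable so the loop can be exercised without invoking uv.
    """
    results = {}
    for repo in repos:
        completed = runner(build_command(repo))
        results[repo] = completed.returncode
    return results
```

Calling `run_all()` from a checkout of this branch would execute the four commands sequentially and report which ones exited non-zero.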
The following benchmarks now run. They use predefined benchmarks whose logic lives in this branch, with pointer files in the kernel repos:

uv run kernels benchmark kernels-community/activation
uv run kernels benchmark kernels-community/flash-attn2
uv run kernels benchmark kernels-community/flash-attn3
uv run kernels benchmark kernels-community/vllm-flash-attn3

https://huggingface.co/kernels-community/activation/blob/main/benchmark.py

Future related work

This PR is the first step toward adding more benchmarking features to kernels. Following PRs will continue to add
This is a work-in-progress PR to add a new benchmark command to the kernels CLI tool. The idea is to enable a standard way to benchmark kernels.

uvx \
  --from git+https://github.com/huggingface/kernels.git@initial-benchmark-command \
  --with torch \
  --with numpy \
  kernels benchmark kernels-community/activation # <- the expected command once merged

output

Updated https://github.com/huggingface/kernels.git (daa75e4edfaccca487b9de9fb2b85b4cd052fd42)
Built kernels @ git+https://github.com/huggingface/kernels.git@daa75e4edfaccca487b9de9fb2b85b4cd052fd42
Installed 45 packages in 81ms
Downloading kernels-community/activation@main...
Running benchmark.py...
┌────────────────┬──────────┬────────────┬────────────┬────────────┬────────────┬────────────┬──────────┬───────────┐
│ Benchmark      │ Workload │ Mean(ms)   │ Std(ms)    │ Min(ms)    │ Max(ms)    │ IQR(ms)    │ Outliers │ Ref Match │
├────────────────┼──────────┼────────────┼────────────┼────────────┼────────────┼────────────┼──────────┼───────────┤
│ SiluWorkloads  │ medium   │ 0.0109     │ 0.0003     │ 0.0105     │ 0.0130     │ 0.0002     │ 3        │ ✓         │
│ SiluWorkloads  │ small    │ 0.0051     │ 0.0005     │ 0.0049     │ 0.0095     │ 0.0002     │ 3        │ ✓         │
│ SiluWorkloads2 │ medium   │ 0.0298     │ 0.0017     │ 0.0282     │ 0.0390     │ 0.0010     │ 5        │ ·         │
│ SiluWorkloads2 │ small    │ 0.0061     │ 0.0025     │ 0.0055     │ 0.0305     │ 0.0001     │ 7        │ ✓         │
└────────────────┴──────────┴────────────┴────────────┴────────────┴────────────┴────────────┴──────────┴───────────┘
Dry run - use --upload to submit results

benchmark file