Skip to content

Conversation

@ChSonnabend
Copy link
Collaborator

Improves GPU kernel speed by ~30%

@ChSonnabend ChSonnabend requested a review from davidrohr as a code owner July 13, 2025 23:18
@github-actions
Copy link
Contributor

REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This will add <label1> and <label2> and removes <label3>.

The following labels are available
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1
async-2024-PbPb-apass2
async-2023-PbPb-apass5

Please consider the following formatting changes to AliceO2Group#14510
@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI_slc9 for 560f77f at 2025-07-14 02:40:

## sw/BUILD/e634489fc902e8cb213d4d883e791c07a9571dc8-latest/log
/sw/SOURCES/O2/14510-slc9_x86-64/0/GPU/GPUTracking/DataTypes/GPUTPCGeometry.h:114:98: error: array subscript 152 is above array bounds of 'const uint8_t [152]' {aka 'const unsigned char [152]'} [-Werror=array-bounds=]
make[2]: *** [GPU/GPUTracking/CMakeFiles/GPUTracking.dir/build.make:1161: GPU/GPUTracking/CMakeFiles/GPUTracking.dir/TPCClusterFinder/GPUTPCNNClusterizerHost.cxx.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:606: GPU/GPUTracking/CMakeFiles/GPUTracking.dir/all] Error 2
make: *** [Makefile:136: all] Error 2


## sw/BUILD/O2-latest/log
/sw/SOURCES/O2/14510-slc9_x86-64/0/GPU/GPUTracking/DataTypes/GPUTPCGeometry.h:114:98: error: array subscript 152 is above array bounds of 'const uint8_t [152]' {aka 'const unsigned char [152]'} [-Werror=array-bounds=]
make[2]: *** [GPU/GPUTracking/CMakeFiles/GPUTracking.dir/build.make:1161: GPU/GPUTracking/CMakeFiles/GPUTracking.dir/TPCClusterFinder/GPUTPCNNClusterizerHost.cxx.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:606: GPU/GPUTracking/CMakeFiles/GPUTracking.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

Full log here.

@ChSonnabend
Copy link
Collaborator Author

@drohr, Build error seems unrelated and not sure what exactly it would fail at... The maximum value of uint8 would be 255 if I am not mistaking

@davidrohr
Copy link
Collaborator

I don't think the error is unrelated. It tells you that you are accessing gputpcgeometry_internal::mNPads[152] which is out of bounds.

@ChSonnabend
Copy link
Collaborator Author

I don't think the error is unrelated. It tells you that you are accessing gputpcgeometry_internal::mNPads[152] which is out of bounds.

Indeed, sorry for the noise. Not sure how I didn't spot that.

Please consider the following formatting changes to AliceO2Group#14510
@ChSonnabend ChSonnabend marked this pull request as draft July 14, 2025 07:56
@ChSonnabend ChSonnabend changed the title NN clusterizer: Change to Lookup table NN clusterizer: Improve filling kernel speed Jul 16, 2025
@ChSonnabend ChSonnabend marked this pull request as ready for review July 16, 2025 15:51
@ChSonnabend
Copy link
Collaborator Author

These are the best improvements I could find so far. @drohr can merge for now once the CI is green, please?

@alibuild
Copy link
Collaborator

alibuild commented Jul 16, 2025

Error while checking build/O2/fullCI_slc9 for 069a7e9 at 2025-07-17 16:09:

## sw/BUILD/O2-latest/log
/sw/SOURCES/O2/14510-slc9_x86-64/0/GPU/GPUTracking/TPCClusterFinder/GPUTPCNNClusterizerKernels.cxx:158:74: error: comparison of integer expressions of different signedness: 'int32_t' {aka 'int'} and 'uint32_t' {aka 'unsigned int'} [-Werror=sign-compare]
/sw/SOURCES/O2/14510-slc9_x86-64/0/GPU/GPUTracking/TPCClusterFinder/GPUTPCNNClusterizerKernels.cxx:188:32: error: comparison of integer expressions of different signedness: 'int32_t' {aka 'int'} and 'uint32_t' {aka 'unsigned int'} [-Werror=sign-compare]
make[2]: *** [GPU/GPUTracking/CMakeFiles/GPUTracking.dir/build.make:1133: GPU/GPUTracking/CMakeFiles/GPUTracking.dir/TPCClusterFinder/GPUTPCNNClusterizerKernels.cxx.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:606: GPU/GPUTracking/CMakeFiles/GPUTracking.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

Full log here.

@drohr
Copy link

drohr commented Jul 16, 2025

These are the best improvements I could find so far. @drohr can merge for now once the CI is green, please?

Please ping @davidrohr and not @drohr over and over. We might share the same nice name, but we are not the same person :-)

@ChSonnabend
Copy link
Collaborator Author

@davidrohr Ready to merge once the CI is green. Not sure what is currently going on with the build container, but it builds locally

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI_slc9 for 4949b55 at 2025-07-18 02:58:

## sw/BUILD/O2-latest/log
/sw/SOURCES/O2/14510-slc9_x86-64/0/GPU/GPUTracking/TPCClusterFinder/GPUTPCNNClusterizerKernels.cxx:68:67: error: no matching function for call to 'o2::gpu::GPUCommonMath::Min(uint32_t, o2::gpu::tpccf::SizeT)'
ninja: build stopped: subcommand failed.

Full log here.

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI_slc9 for f4729fd at 2025-07-18 10:26:

## sw/BUILD/O2-latest/log
/sw/SOURCES/O2/14510-slc9_x86-64/0/GPU/GPUTracking/TPCClusterFinder/GPUTPCNNClusterizerKernels.cxx:176:74: error: comparison of integer expressions of different signedness: 'int32_t' {aka 'int'} and 'uint32_t' {aka 'unsigned int'} [-Werror=sign-compare]
/sw/SOURCES/O2/14510-slc9_x86-64/0/GPU/GPUTracking/TPCClusterFinder/GPUTPCNNClusterizerKernels.cxx:206:32: error: comparison of integer expressions of different signedness: 'int32_t' {aka 'int'} and 'uint32_t' {aka 'unsigned int'} [-Werror=sign-compare]
make[2]: *** [GPU/GPUTracking/CMakeFiles/GPUTracking.dir/build.make:1133: GPU/GPUTracking/CMakeFiles/GPUTracking.dir/TPCClusterFinder/GPUTPCNNClusterizerKernels.cxx.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:606: GPU/GPUTracking/CMakeFiles/GPUTracking.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

Full log here.

@davidrohr davidrohr merged commit da3a178 into AliceO2Group:dev Jul 18, 2025
11 checks passed
@ChSonnabend ChSonnabend deleted the nn_clusterizer_lookup_boundary branch July 18, 2025 21:49
mhemmer-cern pushed a commit to mhemmer-cern/AliceO2 that referenced this pull request Sep 9, 2025
* First version of lookup tables

* Simplifying computations + bug-fixes

* Fixes for indexing and offsets

* Adjusting CPU kernel

* Please consider the following formatting changes

* Fix for row-number access

* Please consider the following formatting changes

* Improve kernel speed by ~15%. Next test: for-loop in pad direction for coallesced access

* IMproving kernel speed by 30% compared to original version. Next try: for-loop over row dimension as access is somewhat coalsced too

* Please consider the following formatting changes

* Minor improvements for MC handling

* Beautifications to trigger the CI

* Compile-fix

* Fix int32_t error in fullCI build

---------

Co-authored-by: ALICE Action Bot <alibuild@cern.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants