Skip to content

Conversation

@ariedel-cern
Copy link
Contributor

Following up on #13964

@github-actions
Copy link
Contributor

REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This will add <label1> and <label2> and removes <label3>.

The following labels are available
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1
async-2024-PbPb-apass2
async-2023-PbPb-apass5

Copy link
Collaborator

@wiechula wiechula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ariedel-cern . Please check for compiler warnings and fix them.

auto f = std::unique_ptr<TFile>(TFile::Open(filename.data(), "recreate"));
TObjArray arr;
arr.SetName("GPUErrorQA_Hists");
for (const auto& [name, hist] : mMapHist) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid warning.

Suggested change
for (const auto& [name, hist] : mMapHist) {
for ([[maybe_unused]] const auto& [name, hist] : mMapHist) {

Copy link
Contributor Author

@ariedel-cern ariedel-cern Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added. Looking at other PRs, the full CI seems to be broken at the moment. Testing locally, I see no compiler warnings.

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI_slc9 for 41445c2 at 2025-06-12 14:55:

input queue has size 75
output queue has size 1
input queue has size 67
output queue has size 1
input queue has size 54
output queue has size 1
input queue has size 47
output queue has size 1
input queue has size 39
output queue has size 1
input queue has size 24
output queue has size 1
input queue has size 18
output queue has size 1
input queue has size 7
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
++ cp thinned_compile_commands.json compile_commands.json
++ CHECKS='-*,modernize-avoid-bind,modernize-deprecated-headers,modernize-make-shared,modernize-raw-string-literal,modernize-redundant-void-arg,modernize-replace-auto-ptr,modernize-replace-random-shuffle,modernize-shrink-to-fit,modernize-unary-static-assert,modernize-use-equals-default,modernize-use-noexcept,modernize-use-nullptr,modernize-use-override,modernize-use-transparent-functors,modernize-use-uncaught-exceptions,readability-braces-around-statements'
++ tee error-log.txt
+++ which O2codecheck
+++ find /sw/slc9_x86-64/GCC-Toolchain/v14.2.0-alice2-1/lib -name crtbegin.o -exec dirname '{}' ';'
++ run_O2CodeChecker.py -j 32 -clang-tidy-binary /sw/slc9_x86-64/o2codechecker/v18.1.2.1-local1/bin/O2codecheck -clang-apply-replacements-binary /sw/slc9_x86-64/Clang/v18.1.8-21/bin-safe/clang-apply-replacements -extra-args=--extra-arg=--gcc-install-dir=/sw/slc9_x86-64/GCC-Toolchain/v14.2.0-alice2-1/lib/gcc/x86_64-unknown-linux-gnu/14.2.0 '-header-filter=.*SOURCES(?!.*/3rdparty/).*' '-checks=-*,modernize-avoid-bind,modernize-deprecated-headers,modernize-make-shared,modernize-raw-string-literal,modernize-redundant-void-arg,modernize-replace-auto-ptr,modernize-replace-random-shuffle,modernize-shrink-to-fit,modernize-unary-static-assert,modernize-use-equals-default,modernize-use-noexcept,modernize-use-nullptr,modernize-use-override,modernize-use-transparent-functors,modernize-use-uncaught-exceptions,readability-braces-around-statements'
No checks enabled.
Unable to run clang-tidy.

Full log here.

@alibuild
Copy link
Collaborator

alibuild commented Jun 12, 2025

Error while checking build/O2/fullCI_slc9 for 46787b5 at 2025-06-15 21:24:

input queue has size 76
output queue has size 1
input queue has size 68
output queue has size 1
input queue has size 56
output queue has size 1
input queue has size 48
output queue has size 1
input queue has size 40
output queue has size 1
input queue has size 25
output queue has size 1
input queue has size 21
output queue has size 1
input queue has size 7
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
input queue has size 0
output queue has size 1
++ cp thinned_compile_commands.json compile_commands.json
++ CHECKS='-*,modernize-avoid-bind,modernize-deprecated-headers,modernize-make-shared,modernize-raw-string-literal,modernize-redundant-void-arg,modernize-replace-auto-ptr,modernize-replace-random-shuffle,modernize-shrink-to-fit,modernize-unary-static-assert,modernize-use-equals-default,modernize-use-noexcept,modernize-use-nullptr,modernize-use-override,modernize-use-transparent-functors,modernize-use-uncaught-exceptions,readability-braces-around-statements'
++ tee error-log.txt
+++ which O2codecheck
+++ find /sw/slc9_x86-64/GCC-Toolchain/v14.2.0-alice2-1/lib -name crtbegin.o -exec dirname '{}' ';'
++ run_O2CodeChecker.py -j 32 -clang-tidy-binary /sw/slc9_x86-64/o2codechecker/v18.1.2.1-local1/bin/O2codecheck -clang-apply-replacements-binary /sw/slc9_x86-64/Clang/v18.1.8-21/bin-safe/clang-apply-replacements -extra-args=--extra-arg=--gcc-install-dir=/sw/slc9_x86-64/GCC-Toolchain/v14.2.0-alice2-1/lib/gcc/x86_64-unknown-linux-gnu/14.2.0 '-header-filter=.*SOURCES(?!.*/3rdparty/).*' '-checks=-*,modernize-avoid-bind,modernize-deprecated-headers,modernize-make-shared,modernize-raw-string-literal,modernize-redundant-void-arg,modernize-replace-auto-ptr,modernize-replace-random-shuffle,modernize-shrink-to-fit,modernize-unary-static-assert,modernize-use-equals-default,modernize-use-noexcept,modernize-use-nullptr,modernize-use-override,modernize-use-transparent-functors,modernize-use-uncaught-exceptions,readability-braces-around-statements'
No checks enabled.
Unable to run clang-tidy.

Full log here.

// get gpu error names
// copied from GPUErrors.h
static std::unordered_map<uint32_t, const char*> errorNames = {
#define GPUCA_ERROR_CODE(num, name, ...) {num, GPUCA_M_STR(name)},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you just include GPUErrors.h instead of copy & pasting the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi David, the errorNames were defined in GPUError.cxx, not in the header. I was not sure if this was done on purpose, so that's why I copied the code.
I moved the definition now to the header, so I can include it in the QA task.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was confused due to your comment // copied from GPUErrors.h. Indeed, the definition was in the CXX, and I'd prefer it to stay in the CXX. No need to full the header file with the strings.
Could you put a

#ifndef GPUCA_GPUCODE_DEVICE
static std::unordered_map<uint32_t, const char*> errorNames
#endif

in GPUErrors.h, but leave the actual definition in the cxx.
And please also protect the <unordered_map> include in the header with the same ifdef.

// for convienence, label each bin with the error name
for (size_t bin = 1; bin < mMapHist["ErrorCounter"]->GetNbinsX(); bin++) {
auto const& it = o2::gpu::errorNames.find(bin);
mMapHist["ErrorCounter"]->GetXaxis()->SetBinLabel(bin, it->second);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case we will have a gap in the error numbers in the future, I would propose to check if it is a valid error Name.

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI_slc9 for 7b5264a at 2025-06-17 10:44:

## sw/BUILD/O2-latest/log
/sw/SOURCES/O2/14397-slc9_x86-64/0/GPU/GPUTracking/Global/GPUErrors.h:20:10: fatal error: 'unordered_map' file not found
/sw/SOURCES/O2/14397-slc9_x86-64/0/GPU/GPUTracking/Global/GPUErrors.h:20:10: fatal error: unordered_map: No such file or directory
ninja: build stopped: subcommand failed.

Full log here.

Copy link
Collaborator

@davidrohr davidrohr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new approach looks good in general, but see the 2 comments below:

memset(mErrors, 0, GPUCA_MAX_ERRORS * sizeof(*mErrors));
}

static std::unordered_map<uint32_t, const char*> errorNames = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this as static variable inside getErrorNames, so that it is not a global symbol?
Alternatively, please put it in o2::gpu::internal namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

#include "GPUCommonDef.h"
#include "GPUDefMacros.h"
#include <unordered_map>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, you need to protect unordered_map with #ifndeg GPUCA_GPUCODE_DEVICE.
We do not include any std headers in GPU kernel code.
And I think you don't need GPUDefMacros.h, do you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I also fixed up the includes.

@alibuild
Copy link
Collaborator

Error while checking build/O2/fullCI_slc9 for 8f676bc at 2025-06-17 18:10:

## sw/BUILD/O2-latest/log
/sw/SOURCES/O2/14397-slc9_x86-64/0/GPU/GPUTracking/Global/GPUErrors.h:20:10: fatal error: unordered_map: No such file or directory
/sw/SOURCES/O2/14397-slc9_x86-64/0/GPU/GPUTracking/Global/GPUErrors.h:20:10: fatal error: 'unordered_map' file not found
ninja: build stopped: subcommand failed.

Full log here.

@alibuild
Copy link
Collaborator

alibuild commented Jun 18, 2025

Error while checking build/O2/fullCI_slc9 for 7fd0535 at 2025-07-01 01:46:

## sw/BUILD/O2-latest/log
/sw/SOURCES/O2/14397-slc9_x86-64/0/GPU/GPUTracking/Global/GPUErrors.h:41:21: error: no template named 'unordered_map' in namespace 'std'
ninja: build stopped: subcommand failed.

Full log here.

@ariedel-cern
Copy link
Contributor Author

Hi @davidrohr , thanks a lot for all the feedback. Is this PR now OK to be merged? I think the 2 failed checks are due to some issue with O2Physics.

@davidrohr
Copy link
Collaborator

Hi @ariedel-cern : The FullCI error is genuine, you also have to #ifdef-protect
static const std::unordered_map<uint32_t, const char*>& getErrorNames();

@davidrohr davidrohr merged commit c2f066a into AliceO2Group:dev Jul 3, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants