-
Notifications
You must be signed in to change notification settings - Fork 63
Feature: shmem init error checking #1997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…it_error_checking
…it_error_checking
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1997 +/- ##
=======================================
Coverage 76.60% 76.60%
=======================================
Files 109 109
Lines 18721 18724 +3
=======================================
+ Hits 14341 14344 +3
Misses 4380 4380 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…it_error_checking
…' into feature-shmem_init_error_checking
…g' into feature-shmem_init_error_checking
…it_error_checking
| #ifndef T8_ENABLE_MPI | ||
| // If we do not use MPI, there is nothing to do. | ||
| // We only have a single process. | ||
| return 1; | ||
| #endif | ||
| #ifndef SC_ENABLE_MPICOMMSHARED | ||
| SC_ABORT ("Trying to use shared memory but SC_ENABLE_MPICOMMSHARED is not set. This should not happen if you use MPI " | ||
| "v.3.0 or higher. Maybe related to https://github.com/DLR-AMR/t8code/pull/1996."); | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only the case when MPI is not linked or also when the number of ranks is 1?
| * that concentrates all trees at one process. */ | ||
| t8_shmem_init (sc_MPI_COMM_WORLD); | ||
| const int intranode_size = t8_shmem_init (sc_MPI_COMM_WORLD); | ||
| ASSERT_GT (intranode_size, 0) << "Could not initialize shared memory."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ASSERT_GT (intranode_size, 0) << "Could not initialize shared memory."; | |
| ASSERT_NE (intranode_size, 0) << "Could not initialize shared memory."; |
| * that concentrates all trees at one process. */ | ||
| t8_shmem_init (comm); | ||
| const int intranode_size = t8_shmem_init (comm); | ||
| ASSERT_GT (intranode_size, 0) << "Could not initialize shared memory."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ASSERT_GT (intranode_size, 0) << "Could not initialize shared memory."; | |
| ASSERT_NE (intranode_size, 0) << "Could not initialize shared memory."; |
| /* setup shared memory usage */ | ||
| t8_shmem_init (comm); | ||
| const int intrasize_from_init = t8_shmem_init (comm); | ||
| ASSERT_GT (intrasize_from_init, 0) << "Error in t8_shmem_init. No intranode communicator set."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ASSERT_GT (intrasize_from_init, 0) << "Error in t8_shmem_init. No intranode communicator set."; | |
| ASSERT_NE (intrasize_from_init, 0) << "Error in t8_shmem_init. No intranode communicator set."; |
| /* setup shared memory usage */ | ||
| t8_shmem_init (comm); | ||
| const int intranode_size = t8_shmem_init (comm); | ||
| ASSERT_GT (intranode_size, 0) << "Could not initialize shared memory."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ASSERT_GT (intranode_size, 0) << "Could not initialize shared memory."; | |
| ASSERT_NE (intranode_size, 0) << "Could not initialize shared memory."; |
| /* setup shared memory usage */ | ||
| t8_shmem_init (comm); | ||
| const int intranode_size = t8_shmem_init (comm); | ||
| ASSERT_GT (intranode_size, 0) << "Could not initialize shared memory."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ASSERT_GT (intranode_size, 0) << "Could not initialize shared memory."; | |
| ASSERT_NE (intranode_size, 0) << "Could not initialize shared memory."; |
| if (forest->global_first_desc == NULL) { | ||
| /* Set the shmem array type of comm */ | ||
| t8_shmem_init (comm); | ||
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup. Could not partition forest."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup. Could not partition forest."); | |
| SC_CHECK_ABORT (t8_shmem_init (comm) != 0, "Error in shared memory setup. Could not partition forest."); |
| if (forest->tree_offsets == NULL) { | ||
| /* Set the shmem array type of comm */ | ||
| t8_shmem_init (comm); | ||
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup. Could not partition forest."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup. Could not partition forest."); | |
| SC_CHECK_ABORT (t8_shmem_init (comm) != 0, "Error in shared memory setup. Could not partition forest."); |
| T8_ASSERT (forest->element_offsets == NULL); | ||
| /* Set the shmem array type to comm */ | ||
| t8_shmem_init (comm); | ||
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup. Could not partition forest."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup. Could not partition forest."); | |
| SC_CHECK_ABORT (t8_shmem_init (comm) != 0, "Error in shared memory setup. Could not partition forest."); |
|
|
||
| /* Try to set the comm type */ | ||
| t8_shmem_init (comm); | ||
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup. Could not load cmesh."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup. Could not load cmesh."); | |
| SC_CHECK_ABORT (t8_shmem_init (comm) != 0, "Error in shared memory setup. Could not load cmesh."); |
| tree_offset = cmesh->first_tree_shared ? -cmesh->first_tree - 1 : cmesh->first_tree; | ||
| if (cmesh->tree_offsets == NULL) { | ||
| t8_shmem_init (comm); | ||
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| SC_CHECK_ABORT (t8_shmem_init (comm) > 0, "Error in shared memory setup."); | |
| SC_CHECK_ABORT (t8_shmem_init (comm) != 0, "Error in shared memory setup."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the != 0 fits better the true/false version of C, e.g. true being not zero.
Describe your changes here:
Related to #1985
The shared memory was often not initialized properly but we did not detect it.
I added an error code to
t8_shmem_initfollowing the convention ofsc_shmem_init.This actually caused many of our tests to fail giving more reason to this feature.
The errors are fixed with #1996 which should be merged first.
When the error code is met, we currently abort the program.
The reasoning is that shared memory should always be available to us, since it is part of MPI >=3.0
(This creates the new issue of requiring MPI 3.0 in our build system).
All these boxes must be checked by the AUTHOR before requesting review:
Documentation:,Bugfix:,Feature:,Improvement:orOther:.All these boxes must be checked by the REVIEWERS before merging the pull request:
As a reviewer please read through all the code lines and make sure that the code is fully understood, bug free, well-documented and well-structured.
General
Tests
If the Pull request introduces code that is not covered by the github action (for example coupling with a new library):
Scripts and Wiki
script/find_all_source_files.scpto check the indentation of these files.License
doc/(or already has one).