Closed
Labels
kind: question (The issue asks a question)
Description
If I want to run a code (in my case POP from OMUSE) on N cores (number_of_workers), say 128, it seems I need to request N+1 tasks in Slurm, i.e. 129: AMUSE spawns the 128 workers through MPI.Spawn, and the original process already occupies 1 task, which brings the total to 129.
- I cannot use 127 workers for the code I run through AMUSE, because that would break the domain partitioning.
- If I use 129 tasks, I have to use two supercomputer nodes (each node has 128 cores), so that is not an option either.
- If I oversubscribe, I cannot pin each process to its own physical core, which is probably bad for performance (it could double the computation time, because two processes would share a core).
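To make the counts above concrete, here is a minimal sketch of the arithmetic. It assumes one driver process plus `number_of_workers` spawned workers, each pinned to its own core; the function name `slurm_footprint` and its parameters are hypothetical and not part of AMUSE or Slurm.

```python
import math


def slurm_footprint(number_of_workers, cores_per_node, driver_tasks=1):
    """Return (total_tasks, nodes_needed) with one task per physical core.

    Assumption: the driver (the original Python process) occupies its own
    task in addition to the spawned workers.
    """
    total_tasks = number_of_workers + driver_tasks
    # Without oversubscription, every task needs its own core.
    nodes_needed = math.ceil(total_tasks / cores_per_node)
    return total_tasks, nodes_needed


# 128 workers on 128-core nodes: the extra driver task spills onto a second node.
print(slurm_footprint(128, 128))  # (129, 2)

# 127 workers would fit on one node, but breaks the domain partitioning.
print(slurm_footprint(127, 128))  # (128, 1)
```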
What's the proper way to solve this? Is it really not possible to reuse the original process for a worker?