Do parallel runs always need N+1 tasks? #945

If I want to run a code (in my case POP from OMUSE) on N cores (number_of_workers), say 128, it seems that I need to request N+1 tasks in Slurm, so 129. AMUSE spawns the 128 worker processes through MPI.Spawn, while the original process already occupies 1 task, which brings the total to 129.

- I cannot use 127 workers for the code I run through AMUSE, because that would mess up the domain partitioning.
- If I request 129 tasks, I have to use two supercomputer nodes (each node has 128 cores), so that is not an option either.
- If I use oversubscribe, I cannot pin each process to a physical core, which is probably bad for performance: two processes would share a core, and that could double the computational time.
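
To make the arithmetic behind these options concrete, here is a minimal sketch in Python. The function `task_budget` and its parameters are hypothetical helpers for illustration only; they are not part of AMUSE, OMUSE, or Slurm. The sketch assumes exactly one driver process running next to the workers.

```python
import math


def task_budget(number_of_workers, cores_per_node, driver_tasks=1):
    """Hypothetical helper: count the tasks and nodes a run would need.

    Assumes one driver (the original Python process) plus the workers.
    Returns (total_tasks, nodes_needed, shared_cores_on_one_node).
    """
    # Workers plus the original process that launches them.
    total_tasks = number_of_workers + driver_tasks
    # Whole nodes required if every task gets its own physical core.
    nodes_needed = math.ceil(total_tasks / cores_per_node)
    # Tasks left without a core of their own if everything is squeezed
    # onto a single node (the oversubscribe case).
    shared_cores_on_one_node = max(0, total_tasks - cores_per_node)
    return total_tasks, nodes_needed, shared_cores_on_one_node


print(task_budget(128, 128))  # → (129, 2, 1)
print(task_budget(127, 128))  # → (128, 1, 0)
```

With 128 workers on 128-core nodes, the count is 129 tasks, which means either two nodes or one core shared by two processes. With 127 workers everything fits on one node, but that is the case that breaks the domain partitioning.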

What's the proper way to solve this? Is it really not possible to reuse the original process for a worker?
