Scale cores by memory #60

@natefoo

Description

For some tools like Kraken2 I'd like to define the core count largely in terms of memory, but I also need to use different core counts depending on the destination. Most of the HPC systems I run on allocate memory strictly by core, so you get either 2 GB/core or 4 GB/core. If I set a static value for cores, I'm wasting cores.

And then occasionally I get allocated full nodes (either when the required memory is > 128 GB, or on one particular HPC system that only allocates full nodes). In that case I may have more cores available than I actually want to use, due to the slowdowns or increased memory consumption you get with excessively high core counts.

This results in things like my kraken2 rule and the corresponding native specification for the 2 GB/core system:

        submit_native_specification: "--partition=RM-shared --time={time} --nodes=1 --ntasks={int(mem/2)} --mem={int(mem/2)*2000}"

where I also override $GALAXY_SLOTS:

    env:
    - name: GALAXY_SLOTS
      value: "{min(int(mem/2), 64)}"
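
Put together, the destination entry ends up looking roughly like this (the destination id and the exact params/env nesting here are illustrative, not copied verbatim from my config):

    destinations:
      # illustrative id; this just shows how the two snippets above
      # fit together in a single destination entry
      slurm_2gb_per_core:
        params:
          submit_native_specification: "--partition=RM-shared --time={time} --nodes=1 --ntasks={int(mem/2)} --mem={int(mem/2)*2000}"
        env:
          - name: GALAXY_SLOTS
            value: "{min(int(mem/2), 64)}"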

In general, for any large-memory tool, static core counts are fairly useless, and I may just prefer to always scale cores from memory in the destination, along the lines of the sketch below. I'm curious what others think.
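
Something like this, purely as a sketch of syntax that doesn't exist today:

    destinations:
      slurm_2gb_per_core:
        # hypothetical: derive each job's core count from its mem
        # request at 2 GB/core, capped at 64 cores
        cores: "{min(int(mem/2), 64)}"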
