
Memory for PepQuery2 #30

@mira-miracoli


Not 100% sure where to place this issue, but I thought it might be interesting for all users of the shared database. Otherwise I can of course move it to EU.

I am currently debugging an error in the pepquery2 tool. The job errored because the JVM ran out of memory.
When I tried to run the job locally, I had to stop at 14 GB because my laptop (16 GB) started to lag.
I noticed that:

- it uses 8 cores (from the logs) even though it was allocated only 1 core
- the index creation step is much slower on the server than on my laptop (might be storage related)
- there is no rule for TPV (and there was no rule for sortinghat)

I would like to change that, but I am not sure which values I should consider. In their documentation I found a recommendation of 8 GB of memory and 4 CPUs, which is too little for at least the job I am looking at. When I used `gxadmin query tool-memory-per-inputs`, I found:

Field | Value
--- | ---
id | `#########`
tool_id | `toolshed.g2.bx.psu.edu/repos/galaxyp/pepquery2/pepquery2/2.0.2+galaxy0`
input_count | `1`
total_input_size_mb | `36`
mean_input_size_mb | `36`
median_input_size_mb | `36`
memory_used_mb | `283829`
memory_used_per_input_mb | `7948`
memory_mean_input_ratio | `7948`
memory_median_input_ratio | `7948`

While `gxadmin report job-info` returned the following:

## Destination Parameters

Key | Value
--- | ---
+Group | `""`
accounting_group_user | `#####`
description | `pepquery2`
docker_memory | `3.8G`
metadata_strategy | `extended`
request_cpus | `1`
request_memory | `3.8G`
requirements | `(GalaxyGroup  ==  "compute")`
submit_request_gpus | `0`

I am now trying to figure out how to implement a rule here, and whether we have to change something in the wrapper because of the CPU usage. Since I have never used the tool myself, I would be happy about any hints from people who have experience with it.
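
For the TPV part, here is a rough sketch of what an override could look like. This is only an illustration, not a tested entry: the resource numbers are placeholders that would need to be tuned against real job metrics, and it assumes the conventions used elsewhere in the shared database (`mem` in GB, `input_size` in GB being available in rule expressions, and the `_JAVA_OPTIONS` pattern used for other Java tools).

```yaml
tools:
  toolshed.g2.bx.psu.edu/repos/galaxyp/pepquery2/pepquery2/.*:
    # Upstream documentation suggests 4 CPUs / 8 GB, but the job above needed
    # far more, so these baseline values are placeholders to be tuned.
    cores: 4
    mem: 16
    env:
      # Cap the JVM heap at the requested memory so the tool fails inside its
      # allocation instead of being killed by the scheduler.
      _JAVA_OPTIONS: -Xmx{int(mem)}g -Xms1g
    rules:
      - id: pepquery2_large_input
        # Assumes input_size (in GB) is usable in rule expressions, as in
        # other shared-database entries; the threshold and values are guesses.
        if: input_size >= 1
        cores: 8
        mem: 64
```

This would only address the memory side, though. If PepQuery2 picks its thread count from the cores it can see rather than from the allocation, the wrapper probably also needs to pass the allocated core count (`\${GALAXY_SLOTS}`) to the tool's CPU/thread option so it stays within `request_cpus`.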
