LUCA.hi5 file not detected, downloads taxdump? #46

@desmodus1984

Hi,
I extracted full-length protein data from some genomes, and I want to check their gene repertoire before doing gene prediction.
All the proteins are in a proteins.fasta file inside a folder named after the respective species, e.g. ZalCal.
I wrote a loop that goes into each folder and runs the gene repertoire check; for the LUCA.h5 file I used the full path.
Here is the code:

main_subdirectories=$(find . -type f -name "proteins.fasta" -print0 | xargs -0 -n1 dirname | sort -u)

# Check if any subdirectories were found
if [ -z "$main_subdirectories" ]; then
  echo "No 'proteins.fasta' files found in any subdirectories."
  exit 0
fi

echo "Main subdirectories containing 'proteins.fasta' files:"

# Loop through each unique main subdirectory
for dir in $main_subdirectories; do
        base=$(basename $dir)

        omamer search --db  /data/common/juanpablo.aguilar/data/LUCA.h5 --query ${base}/proteins.fasta --out Omamer/${base}.omamer
        omark -f Omamer/${base}.omamer -d /data/common/juanpablo.aguilar/data/LUCA.h5 -o results/${base}

done

I ran the code, and the output says it is downloading the taxdump file:

usage: omamer search [-h] -d DB -q QUERY [--threshold THRESHOLD]
[--family_alpha FAMILY_ALPHA] [-fo] [-n TOP_N_FAMS]
[--reference_taxon REFERENCE_TAXON] [-o OUT]
[--include_extant_genes] [-c CHUNKSIZE]
[-t {0,1,2,3,4,5,6,7,8}]
[--log_level {debug,info,warning}] [--silent]
omamer search: error: argument -o/--out: can't open 'Omamer/CalUrs.omamer': [Errno 2] No such file or directory: 'Omamer/CalUrs.omamer'
WARNING: Matplotlib is building the font cache; this may take a moment.
ERROR: The path to the OMAMer file is not valid.
WARNING: Database version mismatch: DB 2.0.3 / OMAmer 2.1.0
NCBI database not present yet (first time used?)
Downloading taxdump.tar.gz from NCBI FTP site (via HTTP)...

Done. Parsing...

I downloaded LUCA.h5 from https://omabrowser.org/oma/current/

Is there any reason why it is not using that file and is downloading the taxdump instead?
I downloaded the LUCA.h5 file and then followed the example to set up my code, changing only the paths to my files.

According to the website, the OMArk arguments section says: "-o --outputFolder ./omark_output/ Path to the folder into which OMArk results will be output. OMArk will create it if it does not exist."

When I checked the error file of the job, I got this:

usage: omamer search [-h] -d DB -q QUERY [--threshold THRESHOLD]
[--family_alpha FAMILY_ALPHA] [-fo] [-n TOP_N_FAMS]
[--reference_taxon REFERENCE_TAXON] [-o OUT]
[--include_extant_genes] [-c CHUNKSIZE]
[-t {0,1,2,3,4,5,6,7,8}]
[--log_level {debug,info,warning}] [--silent]
omamer search: error: argument -o/--out: can't open 'Omamer/CalUrs.omamer': [Errno 2] No such file or directory: 'Omamer/CalUrs.omamer'
WARNING: Matplotlib is building the font cache; this may take a moment.
ERROR: The path to the OMAMer file is not valid.
WARNING: Database version mismatch: DB 2.0.3 / OMAmer 2.1.0
NCBI database not present yet (first time used?)
Downloading taxdump.tar.gz from NCBI FTP site (via HTTP)...
Done. Parsing...

Inserting synonyms: 0
Inserting synonyms: 5000
Inserting synonyms: 10000
Inserting synonyms: 15000
Inserting synonyms: 20000
Inserting synonyms: 25000
Inserting synonyms: 30000
Inserting synonyms: 35000
Inserting synonyms: 40000
Inserting synonyms: 45000
Inserting synonyms: 50000
Inserting synonyms: 55000
Inserting synonyms: 60000
Inserting synonyms: 65000
Inserting synonyms: 70000
Inserting synonyms: 75000
Inserting synonyms: 80000
Inserting synonyms: 85000
Inserting synonyms: 90000
Inserting synonyms: 95000
Inserting synonyms: 100000
Inserting synonyms: 105000
Inserting synonyms: 110000
Inserting synonyms: 115000
Inserting synonyms: 120000
Inserting synonyms: 125000
Inserting synonyms: 130000
Inserting synonyms: 135000
Inserting synonyms: 140000
Inserting synonyms: 145000
Inserting synonyms: 150000
Inserting synonyms: 155000
Inserting synonyms: 160000
Inserting synonyms: 165000
Inserting synonyms: 170000
Inserting synonyms: 175000
Inserting synonyms: 180000
Inserting synonyms: 185000
Inserting synonyms: 190000
Inserting synonyms: 195000
Inserting synonyms: 200000
Inserting synonyms: 205000
Inserting synonyms: 210000
Inserting synonyms: 215000
Inserting synonyms: 220000
Inserting synonyms: 225000
Inserting synonyms: 230000
Inserting synonyms: 235000
Inserting synonyms: 240000
Inserting synonyms: 245000
Inserting synonyms: 250000
Inserting synonyms: 255000
Inserting synonyms: 260000
Inserting synonyms: 265000
Inserting synonyms: 270000
Inserting synonyms: 275000
Inserting synonyms: 280000
Inserting synonyms: 285000
Inserting synonyms: 290000
Inserting synonyms: 295000
Inserting synonyms: 300000
Inserting synonyms: 305000
Inserting synonyms: 310000
Inserting synonyms: 315000
Inserting synonyms: 320000
Inserting synonyms: 325000
Inserting synonyms: 330000
Inserting synonyms: 335000
Inserting synonyms: 340000
Inserting synonyms: 345000
Inserting synonyms: 350000
Inserting synonyms: 355000
Inserting synonyms: 360000
Inserting synonyms: 365000
Inserting synonyms: 370000
Inserting synonyms: 375000
Inserting synonyms: 380000
Inserting synonyms: 385000
Inserting synonyms: 390000
Inserting synonyms: 395000
Inserting synonyms: 400000
Inserting synonyms: 405000
Inserting synonyms: 410000
Inserting synonyms: 415000
Inserting taxid merges: 0
Inserting taxid merges: 5000
Inserting taxid merges: 10000
Inserting taxid merges: 15000
Inserting taxid merges: 20000
Inserting taxid merges: 25000
Inserting taxid merges: 30000
Inserting taxid merges: 35000
Inserting taxid merges: 40000
Inserting taxid merges: 45000
Inserting taxid merges: 50000
Inserting taxid merges: 55000
Inserting taxid merges: 60000
Inserting taxid merges: 65000
Inserting taxid merges: 70000
Inserting taxid merges: 75000
Inserting taxid merges: 80000
Inserting taxid merges: 85000
Inserting taxid merges: 90000
Inserting taxids: 0
Inserting taxids: 5000
Inserting taxids: 10000
Inserting taxids: 15000
Inserting taxids: 20000
Inserting taxids: 25000
Inserting taxids: 30000
Inserting taxids: 35000
Inserting taxids: 40000
Inserting taxids: 45000
Inserting taxids: 50000
...
...
...
...
Inserting taxids: 2650000
Inserting taxids: 2655000
Inserting taxids: 2660000
Inserting taxids: 2665000
Inserting taxids: 2670000
Inserting taxids: 2675000
Inserting taxids: 2680000
Inserting taxids: 2685000
ERROR: The path to the output directory is not valid (Its parent directory does not exist).
ERROR: Exiting because one or more parameters are incorrect
usage: omamer search [-h] -d DB -q QUERY [--threshold THRESHOLD]
[--family_alpha FAMILY_ALPHA] [-fo] [-n TOP_N_FAMS]
[--reference_taxon REFERENCE_TAXON] [-o OUT]
[--include_extant_genes] [-c CHUNKSIZE]
[-t {0,1,2,3,4,5,6,7,8}]
[--log_level {debug,info,warning}] [--silent]
omamer search: error: argument -o/--out: can't open 'Omamer/EumJub.omamer': [Errno 2] No such file or directory: 'Omamer/EumJub.omamer'
ERROR: The path to the OMAMer file is not valid.
WARNING: Database version mismatch: DB 2.0.3 / OMAmer 2.1.0
ERROR: The path to the output directory is not valid (Its parent directory does not exist).
ERROR: Exiting because one or more parameters are incorrect
usage: omamer search [-h] -d DB -q QUERY [--threshold THRESHOLD]
[--family_alpha FAMILY_ALPHA] [-fo] [-n TOP_N_FAMS]
[--reference_taxon REFERENCE_TAXON] [-o OUT]
[--include_extant_genes] [-c CHUNKSIZE]
[-t {0,1,2,3,4,5,6,7,8}]
[--log_level {debug,info,warning}] [--silent]
omamer search: error: argument -o/--out: can't open 'Omamer/Hgry.omamer': [Errno 2] No such file or directory: 'Omamer/Hgry.omamer'
ERROR: The path to the OMAMer file is not valid.
WARNING: Database version mismatch: DB 2.0.3 / OMAmer 2.1.0
ERROR: The path to the output directory is not valid (Its parent directory does not exist).
ERROR: Exiting because one or more parameters are incorrect
usage: omamer search [-h] -d DB -q QUERY [--threshold THRESHOLD]
[--family_alpha FAMILY_ALPHA] [-fo] [-n TOP_N_FAMS]
[--reference_taxon REFERENCE_TAXON] [-o OUT]
[--include_extant_genes] [-c CHUNKSIZE]
[-t {0,1,2,3,4,5,6,7,8}]
[--log_level {debug,info,warning}] [--silent]
omamer search: error: argument -o/--out: can't open 'Omamer/LepWed.omamer': [Errno 2] No such file or directory: 'Omamer/LepWed.omamer'
ERROR: The path to the OMAMer file is not valid.
WARNING: Database version mismatch: DB 2.0.3 / OMAmer 2.1.0
ERROR: The path to the output directory is not valid (Its parent directory does not exist).
ERROR: Exiting because one or more parameters are incorrect
usage: omamer search [-h] -d DB -q QUERY [--threshold THRESHOLD]
[--family_alpha FAMILY_ALPHA] [-fo] [-n TOP_N_FAMS]
[--reference_taxon REFERENCE_TAXON] [-o OUT]
[--include_extant_genes] [-c CHUNKSIZE]
[-t {0,1,2,3,4,5,6,7,8}]
[--log_level {debug,info,warning}] [--silent]
omamer search: error: argument -o/--out: can't open 'Omamer/Mleo.omamer': [Errno 2] No such file or directory: 'Omamer/Mleo.omamer'
ERROR: The path to the OMAMer file is not valid.
WARNING: Database version mismatch: DB 2.0.3 / OMAmer 2.1.0
ERROR: The path to the output directory is not valid (Its parent directory does not exist).
ERROR: Exiting because one or more parameters are incorrect

Isn't omamer search supposed to create the output directory?
I followed the code of the example.
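
Going only by the messages in the log above, `omamer search` could not open `Omamer/CalUrs.omamer` because the `Omamer` folder did not exist, and OMArk rejected `results/<species>` because its parent folder `results` did not exist. A minimal sketch of the same loop with both folders created first is shown here. It assumes each species folder sits directly under the working directory (as `--query ${base}/proteins.fasta` implies), and the `command -v` guard is an addition for illustration, not part of either tool.

```shell
#!/bin/sh
# Sketch only. Assumes each species folder sits directly under the current
# directory, as the --query ${base}/proteins.fasta argument in the post implies.
db=/data/common/juanpablo.aguilar/data/LUCA.h5

# Create both output locations first; mkdir -p does nothing if they exist.
mkdir -p Omamer results

for fasta in ./*/proteins.fasta; do
    [ -e "$fasta" ] || continue          # the glob matched nothing
    base=$(basename "$(dirname "$fasta")")
    if command -v omamer >/dev/null 2>&1 && command -v omark >/dev/null 2>&1; then
        # Run omark only if omamer search succeeded and wrote its file.
        omamer search --db "$db" --query "$fasta" --out "Omamer/${base}.omamer" &&
            omark -f "Omamer/${base}.omamer" -d "$db" -o "results/${base}"
    else
        echo "omamer/omark not on PATH; skipping ${base}" >&2
    fi
done
```

Chaining the two commands with `&&` keeps OMArk from running on a missing `.omamer` file, which is what produced the "path to the OMAMer file is not valid" line in the log.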

I have access to an HPC. Can I multithread it?
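
One way to use several cores without relying on any tool-specific option is to process several species at the same time from the shell. The sketch below runs the species in batches of background jobs. `JOBS` and `run_species` are illustrative names, not part of omamer or OMArk, and the layout assumption (species folders directly under the working directory) is the same as in the loop above. The usage text in the log also lists a `-t` option, but what it controls is not confirmed in this thread, so the sketch does not use it.

```shell
#!/bin/sh
# Sketch only: JOBS and run_species are illustrative names,
# not part of omamer or OMArk.
JOBS=${JOBS:-4}                      # how many species to process at once
db=/data/common/juanpablo.aguilar/data/LUCA.h5

mkdir -p Omamer results

run_species() {
    # $1 is the path to one proteins.fasta file
    base=$(basename "$(dirname "$1")")
    echo "processing ${base}"
    if command -v omamer >/dev/null 2>&1 && command -v omark >/dev/null 2>&1; then
        omamer search --db "$db" --query "$1" --out "Omamer/${base}.omamer" &&
            omark -f "Omamer/${base}.omamer" -d "$db" -o "results/${base}"
    fi
}

n=0
for fasta in ./*/proteins.fasta; do
    [ -e "$fasta" ] || continue      # the glob matched nothing
    run_species "$fasta" &           # each species runs as a background job
    n=$((n + 1))
    if [ "$n" -ge "$JOBS" ]; then    # batch is full: wait for it to finish
        wait
        n=0
    fi
done
wait                                 # wait for the last, partial batch
```

Because the log shows the NCBI taxdump being downloaded on first use, it may be safer to let one species finish on its own before starting parallel jobs, so that several jobs do not try to build that database at once.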
