Unexpected Low % Associated Query Proteins within the Main Species/Clade #44

@glmaala

I'm using the software to assess protein sets of multiple molluscan species.

Correct me if I'm wrong, but from the values I obtained, I inferred that the % of associated query proteins within the main species/clade (under "Species Composition") should equal the sum of the % Total Consistent and % Total Inconsistent lineage placements (under "Consistency Assessment").

I have one protein set that gets the following results:

CONSISTENCY ASSESSMENT

Number of proteins in the whole proteome: 46469

#Consistent lineage placements
Total Consistent: 21176 (45.57%)
Consistent, partial hits: 9008 (19.38%)
Consistent, fragmented: 2953 (6.35%)

#Inconsistent lineage placements
Total Inconsistent: 4310 (9.28%)
Inconsistent, partial hits: 2461 (5.30%)
Inconsistent, fragmented: 645 (1.39%)

#Contaminants
Total Contaminants: 3561 (7.66%)
Contaminants, partial hits: 2629 (5.66%)
Contaminants, fragmented: 269 (0.58%)

#Unknown
Total Unknown: 17422 (37.49%)

SPECIES COMPOSITION

##Detected species

#Main species
Clade: Mollusca
Number of associated query proteins: 3989 (8.58%)

#Potential Contaminants

#Potential contaminant Nº1
Clade: Cercopithecinae
Number of associated query proteins: 3652 (7.86%)

I was expecting the associated query proteins to be at roughly 55% (45.57% consistent + 9.28% inconsistent), and I don't understand where the 8.58% comes from, given that the Mollusca.fasta output file contains only 3,989 protein sequences.
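
The arithmetic behind this expectation can be checked with a short sketch. It uses only the counts copied from the report above; the variable names and the `pct` helper are illustrative and are not part of the software's output or API. The key assumption being tested is the one stated earlier, that main-clade proteins equal consistent plus inconsistent placements.

```python
# Counts copied from the report above; all names here are illustrative only.
total_proteins = 46469
consistent = 21176
inconsistent = 4310
contaminants = 3561
unknown = 17422
main_clade_associated = 3989  # "Number of associated query proteins" for Mollusca


def pct(count, total=total_proteins):
    """Return count as a percentage of the whole proteome."""
    return 100.0 * count / total


# The four top-level categories should account for every protein in the set.
categories_sum = consistent + inconsistent + contaminants + unknown

# ASSUMPTION under test: main-clade proteins = consistent + inconsistent.
expected_main = consistent + inconsistent

print("categories cover the proteome:", categories_sum == total_proteins)
print(f"expected main clade: {expected_main} ({pct(expected_main):.2f}%)")
print(f"reported main clade: {main_clade_associated} ({pct(main_clade_associated):.2f}%)")
print("difference in proteins:", expected_main - main_clade_associated)
```

The four categories do add up to the full proteome, so the mismatch is not a rounding issue: the reported main-clade count is far smaller than the sum of the consistent and inconsistent placements.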
