Missing the CCL3L1 gene from TCGA RNAseq raw counts #2

@dtiezzi

I'm working with TCGA RNA-seq data (aligned to hg38) and noticed that the raw counts from the GDC repository contain no entry for the CCL3L1 gene.
The Ensembl ID for CCL3L1 is ENSG00000277796, which is not present in the TCGA data. However, I can find ENSG00000276085, which corresponds to the CCL3L3 gene.
On the Ensembl website, CCL3L1 and CCL3L3 map to nearly the same coordinates, with CCL3L1 placed on the contig CHR_HSCHR17_10_CTG4 and CCL3L3 on chromosome 17 (http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000277796;r=CHR_HSCHR17_10_CTG4:36194906-36196795;t=ENST00000612067 and http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000276085;r=17:36194869-36196758).
My question is: may I use the CCL3L3 counts in place of the CCL3L1 counts when calculating the IPS?

Additionally, both the biomaRt package for R and the annotated GTF file from TCGA (https://api.gdc.cancer.gov/data/fe1750e4-fc2d-4a2c-ba21-5fc969a24f27) use the symbol CAVIN2 for the SDPR gene. This is a source of errors when running IPS on TCGA hg38 data.
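One possible workaround for the symbol mismatch is to rename aliased symbols before running IPS and to list any expected genes that are still absent. The snippet below is only a minimal Python sketch under stated assumptions: `harmonize_symbols`, the `ALIASES` table, and the toy counts are hypothetical and not part of IPS or the GDC tooling, and only the CAVIN2/SDPR pair is taken from this issue. It deliberately does not substitute CCL3L3 for CCL3L1, since whether that is valid is the open question above.

```python
# Hypothetical alias table: symbol found in the annotation -> symbol the
# downstream gene list expects. Only CAVIN2 -> SDPR comes from this issue.
ALIASES = {"CAVIN2": "SDPR"}


def harmonize_symbols(counts, required, aliases=ALIASES):
    """Rename aliased symbols and report required genes that are still missing.

    counts   -- dict mapping gene symbol to raw count
    required -- iterable of gene symbols the downstream tool expects
    """
    renamed = {}
    for symbol, value in counts.items():
        target = aliases.get(symbol, symbol)
        if target in renamed:
            # Both the alias and the expected symbol are present; do not guess.
            raise ValueError(f"duplicate symbol after renaming: {target}")
        renamed[target] = value
    missing = sorted(set(required) - set(renamed))
    return renamed, missing


# Toy values for illustration only, not real TCGA counts.
counts = {"CAVIN2": 120, "CCL3L3": 15, "GZMB": 300}
required = ["SDPR", "CCL3L1", "GZMB"]

renamed, missing = harmonize_symbols(counts, required)
print("still missing:", missing)
```

Reporting the missing genes rather than silently filling them in keeps the CCL3L1 decision explicit.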

Looking forward to hearing from you.

Cheers!

daniel
