brca_tcga_pan_can_atlas_2018 is failing with cBioDataPack #68

@mjsteinbaugh

Hi, I'm seeing a parsing error for brca_tcga_pan_can_atlas_2018:

utils::read.table chokes too easily on malformed files. Is it worth considering a switch to readr/vroom or data.table here, to harden the parsing of the files in the tarballs?
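One possible shape for that hardening, sketched in base R so it assumes no new dependency. The helper name `read_ragged_tsv` and the `extra_N` column names are hypothetical and not part of cBioPortalData. The idea is to count fields per line first, pad the header with placeholder names, and only then read the table:

```r
# Hypothetical helper (not part of cBioPortalData): read a tab-delimited file
# whose data rows may carry more fields than the header has names.
read_ragged_tsv <- function(path) {
  # Header names, taken from the first line only.
  header <- strsplit(readLines(path, n = 1L), "\t", fixed = TRUE)[[1L]]
  # Field count for every line, using the quote/comment settings of read.delim().
  n_fields <- count.fields(path, sep = "\t", quote = "\"", comment.char = "")
  # Number of placeholder names needed to cover the widest row.
  n_extra <- max(max(n_fields, na.rm = TRUE) - length(header), 0L)
  col_names <- c(header, sprintf("extra_%d", seq_len(n_extra)))
  # Skip the original header and supply the padded names instead;
  # fill = TRUE pads short rows with NA.
  utils::read.delim(path, header = FALSE, skip = 1L, col.names = col_names,
                    fill = TRUE, check.names = FALSE)
}

# Toy file (an assumption, not the real study file): three header names,
# and a second data row with five fields.
toy <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t5\t6\t7\t8"), toy)

ragged <- read_ragged_tsv(toy)
print(dim(ragged))
print(names(ragged))
```

If data.table is acceptable as a dependency, `data.table::fread(path, sep = "\t", fill = TRUE)` would be the analogous call to try. I have not checked how it behaves on the file from this study.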

> packageVersion("cBioPortalData")
[1] ‘2.12.0’
> brca <- cBioPortalData::cBioDataPack("brca_tcga_pan_can_atlas_2018", ask = FALSE)
Warning: replacing previous import ‘utils::findMatches’ by ‘S4Vectors::findMatches’ when loading ‘AnnotationDbi’
Warning in .service_validate_md5sum(api_reference_url, api_reference_md5sum,  :
  service version differs from validated version
    service url: https://www.cbioportal.org/api/v2/api-docs
    observed md5sum: 008be96361f24a5c8d1cfb7f10ae9c97
    expected md5sum: 07ceb76cc5afcf54a9cf2e1a689b18f7
Calls: <Anonymous> ... initialize -> initialize -> Service -> .service_validate_md5sum
Downloading study file: brca_tcga_pan_can_atlas_2018.tar.gz
  |======================================================================| 100%

Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_armlevel_cna.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_cna_hg19.seg
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_cna.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_gene_panel_matrix.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_log2_cna.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_methylation_hm27_hm450_merged.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_microbiome.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_mrna_seq_v2_rsem_zscores_ref_all_samples.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_mrna_seq_v2_rsem_zscores_ref_diploid_samples.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_mrna_seq_v2_rsem_zscores_ref_normal_samples.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_mrna_seq_v2_rsem.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_mutations.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_phosphoprotein_quantification.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_protein_quantification_zscores.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_protein_quantification.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_rppa_zscores.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_rppa.txt
Working on: /var/folders/l1/8y8sjzmn15v49jgrqglghcfr0000gn/T//RtmpYkgaMb/233d49947a0b_brca_tcga_pan_can_atlas_2018/brca_tcga_pan_can_atlas_2018/data_sv.txt
Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
  more columns than column names
Calls: <Anonymous> ... <Anonymous> -> .preprocess_data -> <Anonymous> -> read.table
Backtrace:
    ▆
 1. └─cBioPortalData::cBioDataPack(...)
 2.   └─cBioPortalData::loadStudy(exdir, names.field, cleanup)
 3.     └─cBioPortalData:::.loadExperimentsFromFiles(...)
 4.       └─base::Map(...)
 5.         └─base::mapply(FUN = f, ..., SIMPLIFY = FALSE)
 6.           └─cBioPortalData (local) `<fn>`(y = dots[[1L]][[18L]], x = dots[[2L]][[18L]])
 7.             └─cBioPortalData:::.preprocess_data(...)
 8.               └─utils::read.delim(...)
 9.                 └─utils::read.table(...)
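
Frame 6 of the backtrace indexes element 18, and data_sv.txt is the 18th and last "Working on" file in the log, so that is probably the file at fault. A minimal sketch for finding the offending lines with `count.fields()` follows. The toy file is an assumption standing in for the real one, which is not shown in this report:

```r
# Toy stand-in for the failing file (an assumption): three header names,
# and a second data row with five tab-separated fields.
toy <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t5\t6\t7\t8"), toy)

# read.delim() stops on this file; capture the message instead of aborting.
err <- tryCatch(
  utils::read.delim(toy),
  error = function(e) conditionMessage(e)
)
print(err)

# Count fields per line with the same quote/comment settings as read.delim(),
# then flag every line that is wider than the header line.
n_fields <- count.fields(toy, sep = "\t", quote = "\"", comment.char = "")
bad_lines <- which(n_fields > n_fields[1L])
print(bad_lines)
```

Pointing `count.fields()` at the extracted data_sv.txt in the same way should show which rows carry the extra fields. Note that read.table treats a file whose rows have exactly one more field than the header as having row names, so the error only fires when the surplus is two or more.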
