limix_converter problem1 & dataset.getPhenotypes problem2 #22

@JonLJ

Hi

PROBLEM1
I am having a problem following the 'loading files into LIMIX' tutorial, specifically with using limix_converter to convert my phenotype file 'phenotypes.csv' into HDF5 format.

If I use:

limix_converter -O ./my_file.hdf5 -C ./phenotypes.csv

I get a warning like this:

/home/jon/anaconda2/lib/python2.7/site-packages/limix/io/conversion.py:78: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support sep=None with delim_whitespace=False; you can avoid this warning by specifying engine='python'.
  C = pandas.io.parsers.read_csv(csv_file,sep=sep,header=None,index_col=False,*args,**kw_args)
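
The traceback shows `read_csv` being called with `sep=sep`, so the warning presumably appears whenever no delimiter is passed and pandas has to detect it. A minimal sketch of that behaviour in plain pandas, outside LIMIX; the helper `read_and_count_parser_warnings` and the sample CSV text are made up for illustration:

```python
import io
import warnings

import pandas as pd
from pandas.errors import ParserWarning

# Hypothetical stand-in for a small phenotype file (not real data).
CSV_TEXT = "s1,1.0,2.0\ns2,3.0,4.0\n"


def read_and_count_parser_warnings(**read_csv_kwargs):
    """Read CSV_TEXT; return (DataFrame, number of ParserWarnings emitted)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        df = pd.read_csv(io.StringIO(CSV_TEXT), header=None, **read_csv_kwargs)
    n_warnings = sum(issubclass(w.category, ParserWarning) for w in caught)
    return df, n_warnings


# Explicit delimiter: the default C parser handles this, so no fallback occurs.
df_explicit, n_explicit = read_and_count_parser_warnings(sep=",")

# Delimiter detection with the Python engine requested up front,
# which is what the warning text itself suggests.
df_sniffed, n_sniffed = read_and_count_parser_warnings(sep=None, engine="python")

print(df_explicit.shape, n_explicit)
print(df_sniffed.shape, n_sniffed)
```

Either variant should avoid the fallback message, which suggests the ParserWarning is informational rather than a sign of a failed conversion.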

On the other hand, if I use:

limix_converter -O ./my_file.hdf5 -C ./phenotypes.csv -D ,

to avoid this warning, I get:

......,21478,21479,21480,21481,21482,21483,21484,21485,21486,21487,21488,21489,21490,21491,21492,21493,21494,21495,21496,21497,21498,21499,21500,21501,21502,21503,21504,21505,21506,21507,21508,21509,21510,21511,21512,21513,21514,21515,21516,21517,21518,21519,21520,21521,21522,21523,21524,21525,21526,21527,21528,21529,21530,21531,21532,21533,21534,21535,21536,21537,21538,21539,21540,21541,21542,21543,21544,21545,21546,21547,21548,21549,21550,21551,21552,21553,21554,21555,21556,21557,21558,21559,21560,21561,21562,21563,21564,21565,21566,21567,21568,21569,21570,21571,21572,21573,21574,21575,21576,21577,21578,21579,21580,21581,21582,21583,21584,21585,21586,21587,21588,21589,21590,21591,21592,21593,21594,21595,21596,21597,21598,21599,21600,21601,21602,21603,21604,21605,21606,21607,21608,21609,21610,21611,21612,21613,21614,21615,21616,21617,21618,21619,21620,21621,21622,21623,21624,21625,21626,21627,21628,21629,21630,21631,21632,21633,21634,21635,21636,21637,21638,21639,21640,21641,21642,21643,21644,21645,21646,21647,21648,21649,21650,21651,21652,21653,21654,21655,21656,21657,21658,21659,21660,21661,21662,21663,21664,21665,21666,21667,21668,21669,21670,21671,21672,21673,21674,21675,21676,21677,21678,21679,21680,21681,21682,21683,21684,21685,21686,21687,21688,21689,21690,21691,21692,21693,21694) have mixed types. Specify dtype option on import or set low_memory=False.

In both cases, the conversion to hdf5 seems to succeed. However, I do not know whether this last warning is due to the size of the phenotype table I am using (around 21000 genes and 184 samples).
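
One way to tell whether the mixed-types message comes from the table size or from the file layout is to read the CSV once with `low_memory=False` and list the columns that did not parse as numeric. A minimal sketch in plain pandas, assuming (a guess, not something confirmed for LIMIX) that a header row or an ID column read as data is what mixes strings with numbers; `non_numeric_columns` and the sample text are hypothetical:

```python
import io

import pandas as pd

# Hypothetical miniature of a phenotype CSV with a header row and an ID column.
CSV_TEXT = "sample,gene1,gene2\ns1,1.0,2.0\ns2,3.0,4.0\n"


def non_numeric_columns(csv_text, **read_csv_kwargs):
    """Return the labels of columns that pandas did not parse as numeric."""
    df = pd.read_csv(
        io.StringIO(csv_text), sep=",", low_memory=False, **read_csv_kwargs
    )
    return [col for col in df.columns if not pd.api.types.is_numeric_dtype(df[col])]


# Header row and ID column treated as data (header=None, as in the traceback).
as_data = non_numeric_columns(CSV_TEXT, header=None)

# Header row and ID column treated as labels instead.
as_labels = non_numeric_columns(CSV_TEXT, header=0, index_col=0)

print(as_data)    # columns that contain strings when labels are read as data
print(as_labels)  # columns that still contain strings once labels are split off
```

If the second list is empty for the real file, the warning would point to labels being read as values rather than to the number of genes.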

PROBLEM2
After converting the file, I run:

geno_reader  = gr.genotype_reader_tables('my_file.hdf5')
pheno_reader = phr.pheno_reader_tables('my_file.hdf5')
dataset = data.QTLData(geno_reader=geno_reader,pheno_reader=pheno_reader)

If I look at:

pheno_reader.pheno_matrix
pheno_reader.sample_ID
pheno_reader.phenotype_ID

Everything seems to be OK. However, when I do:

phenotypes,sample_idx=pheno_reader.getPhenotypes()

I get the following warnings:

/home/jon/anaconda2/lib/python2.7/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
/home/jon/anaconda2/lib/python2.7/site-packages/numpy/core/_methods.py:82: RuntimeWarning: Degrees of freedom <= 0 for slice
  warnings.warn("Degrees of freedom <= 0 for slice", RuntimeWarning)

'phenotypes' is an empty dataframe in which only the column names are defined ([0 rows x 21694 columns]). I do not understand why, and the same thing happens when I use your sample phenotype file.
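
Both NumPy messages are what NumPy emits when a mean or a standard deviation is taken over an axis of length zero, which is consistent with the zero-row result. A minimal sketch in plain NumPy (not LIMIX code; the 0 x 3 shape is an arbitrary stand-in for the empty phenotype matrix):

```python
import warnings

import numpy as np

# Stand-in for a phenotype matrix with no samples left (0 rows, 3 columns).
empty = np.empty((0, 3))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    col_mean = empty.mean(axis=0)  # mean over zero rows
    col_std = empty.std(axis=0)    # standard deviation over zero rows

messages = [str(w.message) for w in caught]
for message in messages:
    print(message)

# Every column statistic is NaN because there is nothing to average.
print(col_mean, col_std)
```

If that is what happens here, comparing `pheno_reader.pheno_matrix.shape` with the shape of the returned `phenotypes` would show whether the rows are dropped inside `getPhenotypes()`.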

Many thanks,
Jon
