Dear bin2cell team,
thank you very much for creating such a useful tool. I have a simple question after running a simplified version of the demo notebook.
After running b2c.bin_to_cell() on adata and creating cdata, I have noticed that all the bin information is lost in the process and the object_id is the new (and only) cell identifier. Below is a snippet from the console output:
>>> adata
AnnData object with n_obs × n_vars = 6132629 × 18823
obs: 'in_tissue', 'array_row', 'array_col', 'n_counts', 'destripe_factor', 'n_counts_adjusted', 'labels_he', 'labels_expanded'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells'
uns: 'spatial', 'bin2cell'
obsm: 'spatial', 'spatial_cropped_150_buffer'
>>> cdata = b2c.bin_to_cell(adata, labels_key="labels_he", spatial_keys=["spatial", "spatial_cropped_150_buffer"])
>>> cdata
AnnData object with n_obs × n_vars = 61842 × 18823
obs: 'object_id', 'bin_count', 'array_row', 'array_col'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells'
uns: 'spatial'
obsm: 'spatial', 'spatial_cropped_150_buffer'
I have several questions about how b2c.bin_to_cell() is intended to work; the answers may help with troubleshooting and with interpreting results in downstream analyses:
- Is the aggregation of counts based on 'n_counts' or on 'n_counts_adjusted'? Wouldn't it be good to keep both?
- Maybe I'm missing something here, but is the information about the number of bins (and counts) aggregated into a single cell stored anywhere on cdata? I understand that this is lost in the aggregation. I would suggest including, at least, the number of bins that were aggregated into each cell and the mean number of transcripts they contained.
- In a similar direction, what happens to the bins that are not assigned to any cell? Is that information removed from cdata? I understand it may not be something to store on cdata (and that it may even be outside the scope of this function), but it would be very useful to be able to profile how noisy the assignment is with respect to the gene expression of those unassigned bins. I am not a developer, but maybe the expression of the extracellular matrix could be used to compute some sort of probability (or probabilities) for expression purity in the bin_to_cell call.
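To make the last two points concrete, here is a rough sketch of the kind of summary I have in mind, computed with pandas on a toy table standing in for adata.obs. The column names are taken from the output above, but the values are made up, and I am assuming (I have not checked the source) that a label of 0 marks bins that were not assigned to any object.

```python
import pandas as pd

# Toy stand-in for adata.obs; column names match the output above,
# the values are invented purely for illustration.
obs = pd.DataFrame(
    {
        "labels_he": [0, 1, 1, 2, 2, 2, 0],
        "n_counts": [3.0, 10.0, 14.0, 5.0, 7.0, 9.0, 1.0],
        "n_counts_adjusted": [4.0, 11.0, 13.0, 6.0, 6.0, 9.0, 2.0],
    }
)

# Assumption: label 0 means "bin not assigned to any object".
assigned = obs[obs["labels_he"] > 0]
unassigned = obs[obs["labels_he"] == 0]

# Per-object summary: number of bins and mean counts per bin,
# for both the raw and the adjusted counts.
per_cell = assigned.groupby("labels_he").agg(
    bin_count=("n_counts", "size"),
    mean_counts_per_bin=("n_counts", "mean"),
    mean_adjusted_counts_per_bin=("n_counts_adjusted", "mean"),
)

# Simple profile of the bins left out of every object.
unassigned_fraction = len(unassigned) / len(obs)
unassigned_mean_counts = unassigned["n_counts"].mean()

print(per_cell)
print(unassigned_fraction, unassigned_mean_counts)
```

On the real object, the same groupby could be run on adata.obs directly, and the unassigned bins could be subset with a boolean mask on adata to look at their gene expression.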
Thank you in advance, looking forward to using your tool on my own samples!
Sergio