b2c.destripe() worsens row/column-related technical effect #45

@Nick-Eagles

Hello,

Thanks for developing this extraordinarily useful software. In some Visium HD data I'm working with, I've also observed the technical effect where certain array rows or columns have lower or higher total raw counts than others. However, I've found that b2c.destripe() actually exaggerates the problem. The data is private, so I can't provide a reproducible example, but below is the relevant code, followed by screenshots of the resulting plots; the only difference between the two is whether b2c.destripe(adata_filtered) was run. I read in the raw data from Spaceranger, apply the recommended filtering steps from the tutorial, apply b2c.destripe in one case, then sum the UMI counts across all genes for each bin. For display, I cap totals at 5 UMI, which keeps outliers from dominating the color range in the plots.

import numpy as np
import scanpy as sc
import bin2cell as b2c

adata_filtered = b2c.read_visium(
    sr_dir,
    count_file = 'filtered_feature_bc_matrix.h5',
    source_image_path = raw_image_path,
    spaceranger_image_path = sr_spatial_dir
)

b2c.scaled_he_image(
    adata_filtered,
    mpp = mpp,
    save_path = 'temp.tiff'
)

#   Apply recommended filtering steps following tutorial
sc.pp.filter_cells(adata_filtered, min_counts=1)
sc.pp.filter_genes(adata_filtered, min_cells=3)

#   In one of the two cases, this is run
b2c.destripe(adata_filtered)

#   Add up counts for all genes in each bin. Set a hard maximum threshold to
#   eliminate outliers that throw off the color scale in plots
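#   With a sparse .X, .sum(axis = 1) may return an (n, 1) matrix, so flatten
#   it to 1-D before storing it as an obs column
adata_filtered.obs['sum_umi'] = np.asarray(adata_filtered.X.sum(axis = 1)).ravel()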
adata_filtered.obs['capped_umi'] = np.clip(adata_filtered.obs['sum_umi'], 0, 5)

#   Take a small subset in the top-right corner of the data
scale_factor = 0.3
mask = ( 
    (adata_filtered.obs['array_row'] >= (1 - scale_factor) * adata_filtered.obs['array_row'].max()) & 
    (adata_filtered.obs['array_col'] >= (1 - scale_factor) * adata_filtered.obs['array_col'].max())
)
adata_small = adata_filtered[mask, :]

sc.pl.spatial(
    adata_small, color="capped_umi", img_key=f"{mpp}_mpp_150_buffer",
    basis="spatial_cropped_150_buffer"
)

This particular sample is quite low in total UMI overall (with a median around 2 counts per 2um bin after destriping). Could the high sparsity be throwing off the destripe algorithm?
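The sparsity question can be probed without sharing data by using a toy simulation. The sketch below assumes, purely for illustration, that a destriping step rescales each row by a high quantile of its per-bin totals; that is an assumption, not something confirmed here about bin2cell's implementation, and `row_quantile_spread` is a hypothetical helper. It simulates rows with no true stripe effect and measures how much the per-row quantile varies by chance at low versus high depth.

```python
import numpy as np

def row_quantile_spread(mean_umi, n_rows=200, n_cols=200, quantile=0.99, seed=0):
    """Relative spread (std / mean) of a per-row quantile of bin totals.

    Every row is drawn from the same Poisson distribution, so any spread
    in the row quantiles is sampling noise, not a real stripe effect.
    ASSUMPTION: quantile-based row scaling is used only as an illustration.
    """
    rng = np.random.default_rng(seed)
    totals = rng.poisson(mean_umi, size=(n_rows, n_cols))
    row_q = np.quantile(totals, quantile, axis=1)
    return float(row_q.std() / row_q.mean())

# Roughly 2 UMI per bin (as in this sample) versus a much deeper sample
sparse_spread = row_quantile_spread(mean_umi=2)
dense_spread = row_quantile_spread(mean_umi=200)
print(f"relative spread of row quantiles, sparse: {sparse_spread:.3f}")
print(f"relative spread of row quantiles, dense:  {dense_spread:.3f}")
```

If the sparse spread comes out much larger than the dense one, then any per-row factor derived from such a quantile would be noisier on low-count data, which would be consistent with (but would not prove) the sparsity hypothesis.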

Without destripe:
Image

With destripe:
Image
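A numeric summary may make the two screenshots easier to compare. The sketch below is one possible diagnostic, not part of bin2cell: `stripe_scores` is a hypothetical helper that takes the coefficient of variation of per-row and per-column mean totals, so a larger value suggests stronger striping. It is demonstrated on synthetic data in which every tenth row is artificially inflated.

```python
import numpy as np
import pandas as pd

def stripe_scores(obs, value_col, row_col="array_row", col_col="array_col"):
    """Coefficient of variation of per-row and per-column mean values."""
    def cv(series):
        return float(series.std(ddof=0) / series.mean())
    row_means = obs.groupby(row_col)[value_col].mean()
    col_means = obs.groupby(col_col)[value_col].mean()
    return {"row_cv": cv(row_means), "col_cv": cv(col_means)}

# Synthetic 50 x 50 grid of bins with about 2 UMI per bin
rng = np.random.default_rng(0)
rows, cols = np.meshgrid(np.arange(50), np.arange(50), indexing="ij")
rows, cols = rows.ravel(), cols.ravel()
flat = rng.poisson(2.0, size=rows.size).astype(float)

# Inflate every 10th row to mimic a row-wise stripe
striped = flat.copy()
striped[rows % 10 == 0] *= 3

demo = pd.DataFrame(
    {"array_row": rows, "array_col": cols, "flat": flat, "striped": striped}
)
flat_scores = stripe_scores(demo, "flat")
striped_scores = stripe_scores(demo, "striped")
print("no stripes:  ", flat_scores)
print("row stripes: ", striped_scores)
```

On the real object, the same function could be called as `stripe_scores(adata_filtered.obs, 'sum_umi')` once without and once with `b2c.destripe(adata_filtered)`, assuming `sum_umi` is stored as a flat numeric column.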

Best,
-Nick
