feat: Add HTML representation #2236

katosh · 2025-11-29T20:01:57Z

Rich HTML representation for AnnData

Closes HTML Repr #675
Tests added
Release note added

Summary

Implements rich HTML representation (_repr_html_) for AnnData objects in Jupyter notebooks. Builds on previous draft PRs (#784, #694, #521, #346) with a complete, production-ready implementation.
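For readers unfamiliar with the notebook display protocol: Jupyter/IPython look for a `_repr_html_` method on an object and, if present, display its return value as HTML, falling back to `__repr__` otherwise. The sketch below only illustrates that hook; `TinyTable` and the `tiny-repr` class name are hypothetical and unrelated to the code in this PR.

```python
import html


class TinyTable:
    """Hypothetical stand-in for an object with a rich notebook repr."""

    def __init__(self, n_obs: int, n_vars: int) -> None:
        self.n_obs = n_obs
        self.n_vars = n_vars

    def __repr__(self) -> str:
        # Plain-text fallback used by terminals and non-HTML frontends.
        return f"TinyTable with n_obs x n_vars = {self.n_obs} x {self.n_vars}"

    def _repr_html_(self) -> str:
        # Notebook frontends call this method, if present, and render the string as HTML.
        # Escape anything derived from data so it cannot inject markup.
        return f'<div class="tiny-repr"><b>{html.escape(repr(self))}</b></div>'
```

In a notebook, evaluating `TinyTable(3, 2)` as the last expression of a cell would show the HTML version instead of the plain-text one.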

Live Demo | Reviewer's Guide (technical details, design decisions, extensibility examples)

Screenshot

Features

Interactive Display

Foldable sections with auto-collapse for large datasets
Search/filter with regex and case-sensitive toggles
Copy-to-clipboard for field names
Nested AnnData expansion with configurable depth
.raw section showing unprocessed data (Report n_vars of .raw in __repr__ #349)
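The search/filter toggles can be read as the following matching rule. This is a sketch in Python of the intended semantics only (the actual filter runs in the browser); the function name and the choice to treat an invalid pattern as "no match" are assumptions made for illustration.

```python
import re


def field_matches(name: str, query: str, *, use_regex: bool = False, case_sensitive: bool = False) -> bool:
    """Return True if a field name should stay visible for the given query."""
    if not query:
        return True  # an empty query shows everything
    if use_regex:
        flags = 0 if case_sensitive else re.IGNORECASE
        try:
            return re.search(query, name, flags) is not None
        except re.error:
            # Assumption: a half-typed, invalid pattern matches nothing instead of raising.
            return False
    if not case_sensitive:
        name, query = name.lower(), query.lower()
    return query in name  # plain substring match
```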

Visual Indicators

Category colors from uns palettes (e.g., cell_type_colors)
Type badges for views, backed mode, sparse matrices, Dask arrays
Serialization warnings for data that won't write to H5AD/Zarr
Value previews for simple uns values
README support via modal (renders markdown from uns["README"])
Memory info in footer
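The category colors rely on the convention that a palette for a categorical column `<key>` is stored in `uns["<key>_colors"]`, in category order. A minimal sketch of that lookup, using plain lists and dicts instead of AnnData objects; the helper name and the fallback for a missing or too-short palette are assumptions, not the PR's implementation.

```python
from typing import Dict, Mapping, Optional, Sequence


def category_colors(categories: Sequence[str], uns: Mapping[str, object], key: str) -> Dict[str, Optional[str]]:
    """Pair each category with its palette color, or None when no usable palette exists."""
    palette = uns.get(f"{key}_colors")
    if palette is None or len(palette) < len(categories):
        # Assumption: a missing or too-short palette means "show no swatches".
        return {cat: None for cat in categories}
    return {cat: str(color) for cat, color in zip(categories, palette)}
```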

Serialization Warnings

Proactively warns about data that won't serialize:

Level	Issue	Related
🔴 Error	datetime64/timedelta64	#455, #2238
🔴 Error	Non-string keys	#321
🟡 Warning	Keys with `/`	#1447, #2099
🟡 Warning	Object columns with dicts/lists	#1923, #567, #636
🟡 Warning	String columns auto-converted to categorical	#534, #926
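To make the two key-related rows of the table concrete, here is a minimal sketch of such a check. The function name and message wording are hypothetical; only the two rules (non-string keys are errors, keys containing `/` are warnings) come from the table above.

```python
from typing import Optional, Tuple


def key_issue(key: object) -> Optional[Tuple[str, str]]:
    """Return (level, message) for a problematic key, or None if the key looks fine."""
    if not isinstance(key, str):
        return ("error", f"non-string key {key!r}")
    if "/" in key:
        # "/" acts as a path separator in hierarchical stores such as HDF5 and Zarr.
        return ("warning", f"key {key!r} contains '/'")
    return None
```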

Compatibility

Dark mode auto-detection (Jupyter Lab/VS Code)
No-JS fallback with graceful degradation
JupyterLab safe - CSS scoped to .anndata-repr prevents style conflicts
Lazy-loading safe - configurable partial loading for read_lazy() (categories, colors)
Zero dependencies added

Extensibility

Three extension mechanisms for ecosystem packages (MuData, SpatialData, TreeData):

TypeFormatter - Custom visualization for value types
SectionFormatter - Add new sections (e.g., obst/vart, mod)
Building blocks - CSS/JS/helpers for packages needing full control

See the Reviewer's Guide for examples and API documentation.
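As a rough mental model of the formatter mechanism, the sketch below shows a generic priority-ordered registry. It is deliberately not the API added by this PR: every name in it (`register_formatter`, `format_value`, the predicate/formatter pair) is hypothetical, and the real `TypeFormatter`/`SectionFormatter` signatures are documented in the Reviewer's Guide.

```python
import html
from typing import Any, Callable, List, Tuple

# (priority, predicate, formatter) triples; all names here are hypothetical.
_FORMATTERS: List[Tuple[int, Callable[[Any], bool], Callable[[Any], str]]] = []


def register_formatter(predicate: Callable[[Any], bool], formatter: Callable[[Any], str], *, priority: int = 0) -> None:
    """Let an external package claim the values it knows how to display."""
    _FORMATTERS.append((priority, predicate, formatter))
    # Highest priority first; the sort is stable, so ties keep registration order.
    _FORMATTERS.sort(key=lambda entry: -entry[0])


def format_value(value: Any) -> str:
    """Use the first formatter whose predicate accepts the value, else a generic fallback."""
    for _, predicate, formatter in _FORMATTERS:
        if predicate(value):
            return formatter(value)
    return html.escape(type(value).__name__)


# Example registration, as an ecosystem package might do at import time.
register_formatter(
    lambda v: isinstance(v, (list, tuple)),
    lambda v: f"{type(v).__name__} of {len(v)} items",
    priority=10,
)
```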

Testing

303 unit tests with ~92% coverage
Visual test cases: python tests/visual_inspect_repr_html.py

Related

Supersedes previous drafts: Draft for AnnData html repr #784, Initial draft of AnnData HTML repr #694, WIP: add _repr_html_() method to AnnData for nicer rendering in Jupyter #521, [draft] html repr #346
Compatible with feat: remove sparse data scipy inheritance #1927 (sparse scipy changes) and feat: array-api compatibility #2063 (Array-API)
Fully backward compatible

Acknowledgments

Thanks to @selmanozleyen (#784), @gtca (#694), @VolkerH (#521), @ivirshup (#346, #675), and @Zethson (#675) for prior work and discussions.

Technical Notes and Edits

Lazy Loading

Constants are in _repr_constants.py (outside _repr/) to prevent loading ~6K lines on import anndata. The full module loads only when _repr_html_() is called.
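The deferred-loading idea can be sketched as follows. This is an illustration of the pattern only: `_load_renderer`, `LOADS`, and `Container` are hypothetical names, and the stdlib `html` module stands in for the large `_repr` package.

```python
import functools

# Mutable counter so the sketch can show that loading happens only once.
LOADS = {"count": 0}


@functools.lru_cache(maxsize=None)
def _load_renderer():
    """Stand-in for importing the heavy rendering package on first use."""
    LOADS["count"] += 1
    import html  # in the real layout this would be the large `_repr` package

    return html.escape


class Container:
    def _repr_html_(self) -> str:
        # Nothing heavy is imported until a notebook actually asks for the HTML repr.
        escape = _load_renderer()
        return f"<pre>{escape('<Container>')}</pre>"
```

Creating a `Container` costs nothing extra; the first `_repr_html_()` call pays for the import, and later calls reuse the cached result.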

Config Changes

pyproject.toml: Added vart to codespell ignore list (TreeData section name).

Edit (Dec 27, 2025)

To simplify review and reduce the diff, I've merged settylab/anndata#3 into this PR. That PR was originally created as a follow-up to explore additional features based on the discussion with @Zethson about SpatialData/MuData extensibility.

What changed:

Exported building blocks - CSS, JavaScript, and rendering helpers for external packages to build custom reprs while reusing anndata's styling
.raw section - Expandable row showing unprocessed data (Report n_vars of .raw in __repr__ #349)
Enhanced serialization warnings - Extended to cover datetime64, non-string keys, slashes in keys, and all sections
Regex search - Case-sensitive and regex toggles for filtering
Robust error handling - Failed sections show visible error indicators instead of being silently hidden
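The error-handling behavior in the last item amounts to wrapping each section renderer so that a failure becomes visible markup instead of an exception or a silently missing section. A minimal sketch, assuming a hypothetical helper name and CSS class:

```python
import html
from typing import Callable


def render_section_safely(name: str, render: Callable[[], str]) -> str:
    """Render one section; on failure, return a visible error row instead of raising."""
    try:
        return render()
    except Exception as exc:  # deliberately broad: a repr should never raise
        message = html.escape(f"{type(exc).__name__}: {exc}")
        return f'<div class="section-error">{html.escape(name)} failed to render ({message})</div>'
```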

Edit (Jan 4, 2026)

Moved detailed implementation documentation (architecture, design decisions, extensibility examples, configuration reference) to the Reviewer's Guide to keep this PR description focused on features.

Code refactoring:

Split html.py into focused modules for maintainability
UI components extracted to components.py (badges, buttons, icons)
Section renderers moved to sections.py (obs/var, mapping, uns, raw)
Shared rendering primitives extracted to core.py (avoids circular imports)
Preview utilities moved to utils.py
FormatterContext consolidates all 6 rendering settings (read once at entry, propagated via context)
Result: html.py reduced from ~2100 to ~740 lines, clean import hierarchy
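The "read once at entry, propagated via context" pattern can be sketched with a frozen dataclass. The class below is a hypothetical stand-in for `FormatterContext`: its field names and the dict-based settings source are assumptions, and it shows only three fields, not the six real ones.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchContext:
    """Hypothetical stand-in for FormatterContext; field names are assumptions."""

    max_depth: int = 3        # how deep nested containers are expanded
    max_categories: int = 20  # how many category labels a preview may show
    depth: int = 0            # current nesting level, not a user setting

    @classmethod
    def from_settings(cls, settings: dict) -> "SketchContext":
        # Read global settings exactly once, at the entry point of the repr.
        return cls(
            max_depth=settings.get("max_depth", 3),
            max_categories=settings.get("max_categories", 20),
        )

    def child(self) -> "SketchContext":
        # Frozen dataclass: nesting produces a new context instead of mutating this one.
        return replace(self, depth=self.depth + 1)


def render(node: object, ctx: SketchContext) -> str:
    """Render nested dicts as text, passing the context down instead of re-reading settings."""
    if not isinstance(node, dict):
        return repr(node)
    if ctx.depth >= ctx.max_depth:
        return "{...}"  # depth limit reached, stop expanding
    inner = ", ".join(f"{key}: {render(value, ctx.child())}" for key, value in node.items())
    return "{" + inner + "}"
```

Because the context is immutable, a setting changed mid-render cannot produce a half-old, half-new display.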

New features:

"Lazy" badge for read_lazy() AnnData objects (experimental) - indicates when obs/var are xarray-backed
Visual test for lazy AnnData (9b) - demonstrates lazy loading with (lazy) indicator on columns

Bug fixes:

Consistent meta column styling - all meta column text now uses adata-text-muted class for uniform appearance
Bytes index decoding - properly decode bytes values in index previews

Related issue discovered:

read_lazy() returns index values as byte-representation strings (e.g., "b'cell_0'" instead of "cell_0") - see ISSUE_READ_LAZY_INDEX.md
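A sketch of what decoding index values for a preview could look like. The `bytes` branch reflects the bug fix above; the second branch, which also unwraps strings such as `"b'cell_0'"`, is an assumed heuristic for illustration (a label could legitimately look like that) and is not claimed to be what this PR does.

```python
import ast


def decode_index_value(value: object) -> str:
    """Return a display string for an index value, decoding bytes where possible."""
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    if (
        isinstance(value, str)
        and len(value) >= 3
        and value[0] == "b"
        and value[1] in "'\""
        and value[-1] == value[1]
    ):
        # Assumed heuristic: the string looks like the repr of a bytes object.
        try:
            parsed = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return value
        if isinstance(parsed, bytes):
            return parsed.decode("utf-8", errors="replace")
    return str(value)
```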

Edit (Jan 6, 2026)

Smart partial loading for read_lazy() AnnData:

Previously, lazy AnnData showed no category previews to avoid disk I/O. Now we do minimal, configurable loading to get richer visualization cheaply: only the first N category labels and their colors are read from storage (not the full column data). New setting repr_html_max_lazy_categories (default: 100, set to 0 for metadata-only mode).

Visual tests reorganized: 8 (Dask), 8b (lazy categories), 8c (metadata-only), 9 (backed).


katosh · 2025-12-27T14:10:00Z

Hi @flying-sheep, @Zethson, @ivirshup! Hope you're having a wonderful holiday season!

Just a quick update: I've merged settylab/anndata#3 into this PR to keep everything in one place. That brought in the exported building blocks for packages like SpatialData/MuData, the .raw section, enhanced serialization warnings, and a few other improvements. I've updated the PR description above to reflect all these changes.

No rush at all with the holidays! Whenever you have a moment, I'd appreciate any feedback on the direction. Happy New Year!

https://htmlpreview.github.io/?https://gist.githubusercontent.com/katosh/4a2399d1472c733b041ef8dfd5b489b9/raw/repr_html_visual_test.html

katosh · 2026-01-06T01:39:54Z

I expect lazy loading to become more common, especially as datasets and the number of modalities grow. So I decided to take a pragmatic approach for the HTML repr rather than showing no category information at all.

The trade-off: For read_lazy() AnnData, we now do minimal, configurable partial loading to get richer category previews:

Only the first N category labels are read from storage (not the full column data or codes)
Only the corresponding N colors from .uns are loaded
Controlled via ad.settings.repr_html_max_lazy_categories (default: 100, set to 0 for zero disk I/O)

Why not avoid all loading? Showing just "(50 categories)" is much less useful than seeing the actual category names with color swatches. The cost of reading a few category strings is small compared to the value of the preview.

Implementation: We access CategoricalArray._categories directly and use read_elem_partial() to read only what we need. This bypasses the @cached_property that would load all categories. See the design decision in the reviewer's guide for details.
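The idea of reading only the first N labels while leaving the cached full load untouched can be sketched like this. `FakeStore`, `LazyCategorical`, and `preview_categories` are hypothetical stand-ins; the real code works on `CategoricalArray._categories` with `read_elem_partial()`, whose exact call is not reproduced here.

```python
from functools import cached_property
from typing import List, Optional


class FakeStore:
    """Counts how many labels are read, standing in for on-disk storage."""

    def __init__(self, labels: List[str]) -> None:
        self._labels = list(labels)
        self.items_read = 0

    def read(self, start: int = 0, stop: Optional[int] = None) -> List[str]:
        chunk = self._labels[start:stop]
        self.items_read += len(chunk)
        return chunk


class LazyCategorical:
    """Hypothetical lazy column: `categories` loads everything and caches it."""

    def __init__(self, store: FakeStore) -> None:
        self._categories = store  # raw storage handle, nothing read yet

    @cached_property
    def categories(self) -> List[str]:
        return self._categories.read()  # full load, which the preview wants to avoid


def preview_categories(column: LazyCategorical, max_lazy_categories: int = 100) -> List[str]:
    """Read at most N labels straight from storage, bypassing the cached full load."""
    if max_lazy_categories <= 0:
        return []  # metadata-only mode: zero reads
    return column._categories.read(0, max_lazy_categories)
```

With a 1000-category column and a limit of 3, only three labels are read and `categories` stays unpopulated, which is the trade-off described above.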

Visual examples: See tests 8b (partial loading) and 8c (metadata-only mode) in the live demo.

katosh and others added 30 commits November 28, 2025 11:55

implement html representation

30a1e71

visual inspection testing

5ce0afb

fix dark mode and nesting of html rep

9da45fe

handle disabled script in html rep

28292b9

more compact html rep

774c942

show categories in html rep

42ec6e6

dark mode and stability

73f0c5d

make max_cats configurable in html rep

292b4fc

test many cat and no JS for html rep

181b4d4

center folding icon in html rep

1dd4f18

max rows for counting n-unique in html rep

3db23cd

header coloring in html rep

d5974f6

max 20 categories in html rep

5cd1dd5

update many cats viz test of html rep

ef178c5

robust html rep for ad blocker

11949af

more testing of html rep

139f94d

future proof html rep

8a14312

html rep documentation

a64de45

show backed path inline in html rep

e7461f8

add custom uns rendering for html rep

966bb54

customizable section html rep

27b83f6

fix some html rep previews

bfe8221

better multi line categories in html rep

065eb2c

increase html rep testing

9505b63

format and style of html rep

8983df3

reduce complexity of html rep

bfe31eb

add "vart" to codespell's ignore-words-list

4b37eca

failed formatter warnings in html

07632cf

explicit cleanup in html rep test

4c7ab7c

html rep aesthetics and formatting

c65952c

katosh added 12 commits December 15, 2025 21:35

multi section SectionFormatter & do not list formatted sections in "o…

2fbd1db

…ther"

document "Building Custom _repr_html_"

55c6f50

test html rep public API

2fc88e8

test serializability for columns of .obs and .var

9bd4ae0

test serializability of column names and keys

d47fd5b

also warn datetime non-serializability

181430c

maintainer info in case of test failure

61e49ee

futureproof more serialization warnings

29b5caa

coding style: relative imports and no section headers

2597d99

move a comment

74bd1a5

Merge remote-tracking branch 'upstream/main' into expose_html_rep

2281875

correctly use new keyword-only signature

c46ebd6

katosh added 12 commits December 27, 2025 23:05

fix docstring errors

6f008f7

avoid loading _rep on anndata import

64eb50e

update non-loading of lazy data implementation and example

f6d7e92

remove some dead code from html rep

b6f1fa5

reduce branches in render_formatted_entry

3a5f06a

reduce size of html.py

7017fec

core.py avoids necessity for most circular imports

fc7a66b

consolidate html rep settings in FormatterContext

43a34e1

call nunique only once in html rep

aa4ff38

"is lazy" badge for html rep and example

2ed985e

see "9b. Lazy AnnData (Experimental)" in

73bb56c

https://htmlpreview.github.io/?https://gist.githubusercontent.com/katosh/4a2399d1472c733b041ef8dfd5b489b9/raw/repr_html_visual_test.html

toc in html rep visual test

2ee5f12

flying-sheep changed the title ~~Add HTML representation~~ feat: Add HTML representation Jan 5, 2026

katosh added 2 commits January 5, 2026 17:46

issue wrong decoding in lazy example by using newer numpy

5465ee8

richer html rep for fully lazy data

a9164e2

cleanup html rep for lazy

8fe997d
