
Commit 493f390

feat: Add filesystem-only caching with global refresh support

- Implement CacheManager class for code execution result caching
- Support hash-based caching (automatic invalidation on code changes)
- Support custom cache IDs for cross-build persistence
- Add MARKDOWN_EXEC_CACHE_REFRESH environment variable for global refresh
- Add refresh option to force cache updates per code block
- Store cache files in .markdown-exec-cache/ directory
- Add comprehensive test suite with 14 cache-specific tests
- Add caching documentation with usage examples
- Update README with caching quickstart
- Isolate test cache directories to prevent pollution

This improves documentation build performance by caching expensive operations like plot generation, API calls, and computations.
1 parent 5606eda commit 493f390

10 files changed: +1005 -14 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -23,3 +23,4 @@ uv.lock
 .mypy_cache/
 .ruff_cache/
 __pycache__/
+.markdown-exec-cache/

README.md

Lines changed: 35 additions & 0 deletions
@@ -111,6 +111,41 @@ grep extra_css README.md && exit 2
 ```
 ````

+### Caching
+
+Speed up your builds by caching execution results:
+
+````md
+```python exec="yes" cache="yes"
+# Expensive computation
+import time
+time.sleep(5)
+print("Done!")
+```
+````
+
+Use custom cache IDs for persistence across builds:
+
+````md
+```python exec="yes" cache="my-plot"
+# Generate plot - will be cached
+import matplotlib.pyplot as plt
+# ...
+```
+````
+
+Force cache refresh with `refresh="yes"`:
+
+````md
+```python exec="yes" cache="my-plot" refresh="yes"
+# This will always re-execute
+```
+````
+
+See [caching documentation](https://pawamoy.github.io/markdown-exec/usage/caching/) for more details.
+
+---
+
 See [usage](https://pawamoy.github.io/markdown-exec/usage/) for more details,
 and the [gallery](https://pawamoy.github.io/markdown-exec/gallery/) for more examples!

docs/usage/caching.md

Lines changed: 245 additions & 0 deletions
@@ -0,0 +1,245 @@
# Caching

Markdown Exec supports filesystem-based caching of code execution results to speed up documentation builds and development workflows.

## Overview

When your documentation generates images or charts, or runs expensive computations, re-executing the same code on every build can significantly slow down rendering. The caching feature lets you:

- **Speed up builds**: Reuse previously computed results instead of re-executing code
- **Persist across builds**: All cached results are stored on the filesystem, so they survive between builds
- **Refresh globally**: Force a refresh of all cached results with a single environment variable

## Cache Storage

All cached results are stored in `.markdown-exec-cache/` in your project root directory:

```
your-project/
├── docs/
├── mkdocs.yml
└── .markdown-exec-cache/
    ├── my-plot.cache          # Custom ID cache
    └── abc123def456.cache     # Hash-based cache files
```

Add this directory to your `.gitignore`:

```gitignore
.markdown-exec-cache/
```

## Usage

### Hash-Based Caching

Enable caching by adding `cache="yes"` to your code block. A hash is computed from the code content and execution options:

````md exec="1" source="tabbed-left" tabs="Markdown|Rendered"
```python exec="yes" cache="yes"
import time
print(f"Executed at: {time.time()}")
```
````

The cache is automatically invalidated when the code or execution options change.

### Custom Cache IDs

For more control, use a custom cache ID (string value). This is useful for expensive operations where you want explicit control over cache invalidation:

````md exec="1" source="tabbed-left" tabs="Markdown|Rendered"
```python exec="yes" cache="my-plot"
import matplotlib.pyplot as plt
# Expensive plot generation...
print("Generated plot")
```
````

The cache file will be stored as `.markdown-exec-cache/my-plot.cache`. Unlike a hash-based entry, it is not invalidated automatically when the code changes.

### Cache Invalidation

To force re-execution and update the cache for a specific code block, use `refresh="yes"`:

````markdown
```python exec="yes" cache="my-plot" refresh="yes"
# This will always re-execute and update the cache
print("Fresh execution!")
```
````

!!! note "refresh vs removing cache"
    **`refresh="yes"`** forces re-execution but **keeps the cache enabled** - it updates the cached result for future builds.

    **Removing the `cache` option** completely disables caching - the code executes every time with no caching at all.

    Use `refresh="yes"` when you want to update a stale cache entry but keep the caching benefits for subsequent builds.
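
The interaction between `cache` and `refresh` described in the note can be summarized as a small decision function. This is a minimal sketch of those rules, not the actual markdown-exec implementation; `plan_execution` and its parameters are hypothetical names chosen for illustration.

```python
from typing import Optional, Tuple


def plan_execution(cache_enabled: bool, refresh: bool, cached: Optional[str]) -> Tuple[bool, bool]:
    """Return (should_execute, should_store) for one code block.

    Hypothetical helper: it only mirrors the rules stated in the note.
    """
    if not cache_enabled:
        # No `cache` option: run every time and never touch the cache.
        return True, False
    if refresh or cached is None:
        # Forced refresh or cache miss: run and (re)write the cache entry.
        return True, True
    # Cache hit without refresh: reuse the stored output.
    return False, False
```

With caching enabled and `refresh` set, the block is both executed and stored, which is why later builds still benefit from the cache once `refresh="yes"` is removed.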

### Global Cache Refresh

To refresh **all** cached results at once, set the `MARKDOWN_EXEC_CACHE_REFRESH` environment variable:

```bash
# Force refresh all caches during build
MARKDOWN_EXEC_CACHE_REFRESH=1 mkdocs build

# Or with other truthy values
MARKDOWN_EXEC_CACHE_REFRESH=yes mkdocs build
MARKDOWN_EXEC_CACHE_REFRESH=true mkdocs build
MARKDOWN_EXEC_CACHE_REFRESH=on mkdocs build
```

This is useful for:

- CI/CD pipelines where you want fresh builds
- Ensuring all documentation is up-to-date
- Debugging cache-related issues
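
For readers who want to mirror this switch in their own build scripts, the check might look like the sketch below. It accepts the four spellings shown above; ignoring letter case and surrounding whitespace is an assumption, and `global_refresh_requested` is a hypothetical name, not part of the markdown-exec API.

```python
import os
from typing import Mapping, Optional

# Truthy spellings shown in the shell examples above.
TRUTHY_VALUES = {"1", "yes", "true", "on"}


def global_refresh_requested(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when MARKDOWN_EXEC_CACHE_REFRESH holds a truthy value."""
    env = os.environ if environ is None else environ
    value = env.get("MARKDOWN_EXEC_CACHE_REFRESH", "")
    # Assumption: letter case and surrounding whitespace are ignored.
    return value.strip().lower() in TRUTHY_VALUES
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.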

## Clearing Cache

### Delete Specific Cache Entry

Remove the cache file for a specific custom ID:

```bash
rm .markdown-exec-cache/my-custom-id.cache
```

### Clear All Cache

Remove the entire cache directory:

```bash
rm -rf .markdown-exec-cache/
```

## How It Works

1. **Hash Computation**: For `cache="yes"`, a SHA-256 hash is computed from:
    - The code content
    - Execution options (language, HTML mode, working directory, etc.)

    For a custom ID, the ID itself names the cache file.

2. **Cache Lookup**: Before execution, the filesystem cache is checked for a matching entry

3. **Execution & Storage**: If no cached result is found:
    - Code is executed
    - Output is stored in the filesystem cache

4. **Cache Retrieval**: If a cached result is found, it is used instead of re-executing the code
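
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code of `CacheManager`: the JSON layout of the hashed payload, the function names `cache_key` and `run_cached`, and the `execute` callback are all hypothetical. Only the SHA-256 hash, the `.markdown-exec-cache/` directory, and the `.cache` suffix come from this page.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Optional

CACHE_DIR = Path(".markdown-exec-cache")


def cache_key(code: str, options: dict) -> str:
    """Hash the code and its options (assumed to be JSON-serializable)."""
    payload = json.dumps({"code": code, "options": options}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(
    code: str,
    options: dict,
    execute: Callable[[str], str],
    cache_id: Optional[str] = None,
    refresh: bool = False,
    cache_dir: Path = CACHE_DIR,
) -> str:
    """Return cached output when available, otherwise execute and store it."""
    # A custom ID names the file directly; otherwise the hash does.
    name = cache_id or cache_key(code, options)
    path = cache_dir / f"{name}.cache"
    if not refresh and path.exists():
        return path.read_text(encoding="utf-8")  # cache hit
    output = execute(code)  # cache miss or forced refresh
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(output, encoding="utf-8")
    return output
```

Because the hash covers both code and options, editing either one yields a new file name, which is what makes hash-based entries self-invalidating. A custom ID bypasses the hash, so its entry is reused until it is refreshed or deleted.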

## Best Practices

### When to Use Caching

**Good use cases:**

- Generating plots, diagrams, or images
- Running expensive computations
- Calling external APIs or services
- Processing large datasets

**Avoid caching for:**

- Simple print statements
- Code demonstrating output variations
- Time-sensitive or non-deterministic code

### Choosing Cache Type

- **`cache="yes"`** (hash-based):
    - Automatically invalidated when code changes
    - Great for development and production
    - No manual cache management needed

- **`cache="custom-id"`** (custom ID):
    - Use for expensive operations where you want explicit control
    - Easier to identify and manage specific cache files
    - Requires manual invalidation or `refresh="yes"` when code changes

### Cache Invalidation Strategy

**For hash-based caching (`cache="yes"`):**

- Cache is automatically invalidated when code or options change
- No manual intervention needed

**For custom ID caching (`cache="custom-id"`):**

1. **Change the ID** when you want to force re-execution:
    ```markdown
    cache="my-plot-v2"  # Changed from my-plot
    ```

2. **Use refresh temporarily**:
    ```markdown
    cache="my-plot" refresh="yes"  # Remove refresh="yes" after update
    ```

3. **Use global refresh** for all caches:
    ```bash
    MARKDOWN_EXEC_CACHE_REFRESH=1 mkdocs build
    ```

4. **Clear cache directory** before important builds:
    ```bash
    rm -rf .markdown-exec-cache/
    ```

## Examples

### Caching a Matplotlib Plot

````markdown
```python exec="yes" html="yes" cache="population-chart"
import matplotlib.pyplot as plt
import io
import base64

# Expensive plot generation
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [1, 4, 9])
ax.set_title("Population Growth")

# Save to base64
buffer = io.BytesIO()
plt.savefig(buffer, format='png')
buffer.seek(0)
img_str = base64.b64encode(buffer.read()).decode()
print(f'<img src="data:image/png;base64,{img_str}"/>')
plt.close()
```
````

### Caching API Calls

````markdown
```python exec="yes" cache="github-stars" refresh="no"
import requests

# Time out and raise on HTTP errors rather than continuing with a bad response
response = requests.get("https://api.github.com/repos/pawamoy/markdown-exec", timeout=10)
response.raise_for_status()
stars = response.json()["stargazers_count"]
print(f"⭐ **{stars}** stars on GitHub!")
```
````

## Troubleshooting

### Cache Not Working

1. Ensure the cache directory is writable
2. Check that you're using `cache="yes"` or a custom ID
3. Verify the cache directory exists: `ls -la .markdown-exec-cache/`
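
The first and third checks can also be scripted. The helper below is a hypothetical convenience, not something shipped with markdown-exec; it only inspects the directory and assumes cache files carry the `.cache` suffix shown earlier on this page.

```python
import os
from pathlib import Path
from typing import List


def diagnose_cache_dir(cache_dir: str = ".markdown-exec-cache") -> List[str]:
    """Return human-readable problems found with the cache directory."""
    root = Path(cache_dir)
    if not root.exists():
        return [f"{root} does not exist"]
    if not root.is_dir():
        return [f"{root} exists but is not a directory"]
    problems = []
    if not os.access(root, os.W_OK):
        problems.append(f"{root} is not writable")
    if not any(root.glob("*.cache")):
        problems.append(f"{root} contains no .cache files")
    return problems
```

An empty list means the directory exists, is writable, and already holds at least one cache file.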

### Stale Cache Results

1. Use `refresh="yes"` to force re-execution
2. Delete the specific cache file
3. Clear the entire cache directory

### Large Cache Directory

Cache files accumulate over time. Periodically clean up:

```bash
# See cache directory size
du -sh .markdown-exec-cache/

# Remove all cache files
rm -rf .markdown-exec-cache/
```
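
If deleting everything is too blunt, older entries can be pruned selectively. The sketch below removes `.cache` files whose modification time is older than a chosen age. `prune_cache` and its parameters are hypothetical, and the 30-day default is an arbitrary illustration, not a recommendation from the project.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_cache(
    cache_dir: str = ".markdown-exec-cache",
    max_age_days: float = 30.0,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete .cache files older than max_age_days and return their paths."""
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400  # seconds per day
    removed = []
    for path in sorted(root.glob("*.cache")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Any pruned entry is simply recomputed on the next build that needs it, so the only cost of pruning too aggressively is a slower build.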

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ nav:
   - Pyodide: usage/pyodide.md
   - Shell: usage/shell.md
   - Tree: usage/tree.md
+  - Caching: usage/caching.md
 - Gallery: gallery.md
 - API reference: reference/api.md
 - Development:

src/markdown_exec/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -3,6 +3,7 @@
 Utilities to execute code blocks in Markdown files.
 """

+from markdown_exec._internal.cache import CacheManager, get_cache_manager
 from markdown_exec._internal.formatters.base import (
     ExecutionError,
     base_format,
@@ -29,6 +30,7 @@

 __all__ = [
     "MARKDOWN_EXEC_AUTO",
+    "CacheManager",
     "ExecutionError",
     "HeadingReportingTreeprocessor",
     "IdPrependingTreeprocessor",
@@ -43,6 +45,7 @@
     "default_tabs",
     "formatter",
     "formatters",
+    "get_cache_manager",
     "get_logger",
     "markdown_config",
     "patch_loggers",
