Bedrock supports [prompt caching](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) on Anthropic models so you can reuse expensive context across requests. Pydantic AI provides four ways to use prompt caching:
1. **Cache User Messages with [`CachePoint`][pydantic_ai.messages.CachePoint]**: Insert a `CachePoint` marker to cache everything before it in the current user message.
2. **Cache System Instructions**: Enable [`BedrockModelSettings.bedrock_cache_instructions`][pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_instructions] to append a cache point after the system prompt.
3. **Cache Tool Definitions**: Enable [`BedrockModelSettings.bedrock_cache_tool_definitions`][pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_tool_definitions] to cache your tool schemas.
4. **Cache All Messages**: Set [`BedrockModelSettings.bedrock_cache_messages`][pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_messages] to `True` to automatically cache the last user message.

!!! note "No TTL Support"

    Unlike the direct Anthropic API, Bedrock manages cache TTL automatically. All cache settings are boolean only; there are no `'5m'` or `'1h'` options.

!!! note "Minimum Token Threshold"

    AWS only serves cached content once a segment crosses the provider-specific minimum token thresholds (see the [Bedrock prompt caching docs](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html)). Short prompts or tool definitions below those limits will bypass the cache, so don't expect savings for tiny payloads.

### Example 1: Automatic Message Caching
Use `bedrock_cache_messages` to automatically cache the last user message:
```python {test="skip"}
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockModelSettings

# The model ID below is illustrative; use any Bedrock Anthropic model
# available in your account.
agent = Agent(
    'bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0',
    model_settings=BedrockModelSettings(bedrock_cache_messages=True),
)

# With bedrock_cache_messages=True, the last user message is cached
# automatically on every run.
result = agent.run_sync('Long shared context that benefits from caching...')
print(result.output)
```

Bedrock enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors.
#### How Cache Points Are Allocated
Cache points can be placed in three locations:
1. **System Prompt**: Via the `bedrock_cache_instructions` setting (adds a cache point to the last system prompt block)
2. **Tool Definitions**: Via the `bedrock_cache_tool_definitions` setting (adds a cache point to the last tool definition)
3. **Messages**: Via `CachePoint` markers or the `bedrock_cache_messages` setting (adds cache points to message content)

Each setting uses **at most 1 cache point**, but you can combine them.
#### Automatic Cache Point Limiting
When cache points from all sources (settings + `CachePoint` markers) exceed 4, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones).
```python {test="skip"}
from pydantic_ai import Agent, CachePoint
from pydantic_ai.models.bedrock import BedrockModelSettings

# Illustrative model ID. The two settings below use one cache point each;
# the three CachePoint markers would bring the total to five, so the oldest
# message cache point is dropped automatically to stay within Bedrock's
# limit of four.
agent = Agent(
    'bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0',
    model_settings=BedrockModelSettings(
        bedrock_cache_instructions=True,
        bedrock_cache_tool_definitions=True,
    ),
)

result = agent.run_sync([
    'First block of reusable context...',
    CachePoint(),
    'Second block...',
    CachePoint(),
    'Third block...',
    CachePoint(),
])
```

- System and tool cache points are **always preserved**
- The cache point created by `bedrock_cache_messages` is **always preserved** (as it's the newest message cache point)
- Additional `CachePoint` markers in messages are removed from oldest to newest when the limit is exceeded
- This ensures critical caching (instructions/tools) is maintained while still benefiting from message-level caching

## `provider` argument
You can provide a custom `BedrockProvider` via the `provider` argument. This is useful when you want to specify credentials directly or use a custom boto3 client:
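A sketch of what this can look like, assuming a pre-built `bedrock-runtime` boto3 client (the region and model ID are illustrative):

```python {test="skip"}
import boto3

from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel
from pydantic_ai.providers.bedrock import BedrockProvider

# Build a boto3 client yourself so you control credentials, region,
# retries, etc. (values here are illustrative).
bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1')

model = BedrockConverseModel(
    'anthropic.claude-3-5-sonnet-20240620-v1:0',
    provider=BedrockProvider(bedrock_client=bedrock_client),
)
agent = Agent(model)
```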