2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
## [Unreleased]

### Added
- Added `STAC_FASTAPI_ES_MAPPINGS_FILE` environment variable to support file-based custom mappings configuration.
- Added configuration-based support for extending Elasticsearch/OpenSearch index mappings, allowing users to customize field mappings without code changes through the `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` environment variable. Also added the `STAC_FASTAPI_ES_DYNAMIC_MAPPING` variable to control dynamic mapping behavior. [#546](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/546)

- Added catalogs route support to enable federated hierarchical catalog browsing and navigation in the STAC API. [#547](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/547)

240 changes: 240 additions & 0 deletions README.md
@@ -122,6 +122,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
- [Ingesting Sample Data CLI Tool](#ingesting-sample-data-cli-tool)
- [Redis for navigation](#redis-for-navigation)
- [Elasticsearch Mappings](#elasticsearch-mappings)
- [Custom Index Mappings](#custom-index-mappings)
- [Managing Elasticsearch Indices](#managing-elasticsearch-indices)
- [Snapshots](#snapshots)
- [Reindexing](#reindexing)
@@ -463,6 +464,9 @@ You can customize additional settings in your `.env` file:
| `USE_DATETIME_NANOS` | Enables nanosecond precision handling for `datetime` field searches as per the `date_nanos` type. When `False`, it uses millisecond precision (3 fractional digits) as per the `date` type. | `true` | Optional |
| `EXCLUDED_FROM_QUERYABLES` | Comma-separated list of fully qualified field names to exclude from the queryables endpoint and filtering. Use full paths like `properties.auth:schemes,properties.storage:schemes`. Excluded fields and their nested children will not be exposed in queryables. | None | Optional |
| `EXCLUDED_FROM_ITEMS` | Specifies fields to exclude from STAC item responses. Supports comma-separated field names and dot notation for nested fields (e.g., `private_data,properties.confidential,assets.internal`). | `None` | Optional |
| `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` | JSON string of custom Elasticsearch/OpenSearch property mappings to merge with defaults. See [Custom Index Mappings](#custom-index-mappings). | `None` | Optional |
| `STAC_FASTAPI_ES_MAPPINGS_FILE` | Path to a JSON file containing custom Elasticsearch/OpenSearch property mappings to merge with defaults. See [Custom Index Mappings](#custom-index-mappings). | `None` | Optional |
| `STAC_FASTAPI_ES_DYNAMIC_MAPPING` | Controls dynamic mapping behavior for item indices. Values: `true` (default), `false`, or `strict`. See [Custom Index Mappings](#custom-index-mappings). | `true` | Optional |


> [!NOTE]
@@ -787,6 +791,242 @@ pip install stac-fastapi-elasticsearch[redis]
- The `sfeos_helpers` package contains shared mapping definitions used by both Elasticsearch and OpenSearch backends
- **Customization**: Custom mappings can be defined by extending the base mapping templates.

## Custom Index Mappings

SFEOS provides environment variables to customize Elasticsearch/OpenSearch index mappings without modifying source code. This is useful for:

- Adding STAC extension fields (SAR, Cube, etc.) with proper types
- Optimizing performance by controlling which fields are indexed
- Ensuring correct field types instead of relying on dynamic mapping inference

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` | JSON string of property mappings to merge with defaults | None |
| `STAC_FASTAPI_ES_MAPPINGS_FILE` | Path to a JSON file containing property mappings to merge with defaults | None |
| `STAC_FASTAPI_ES_DYNAMIC_MAPPING` | Controls dynamic mapping: `true`, `false`, or `strict` | `true` |

### Custom Mappings

You can customize the Elasticsearch/OpenSearch mappings by providing a JSON configuration. This can be done via:

1. `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` environment variable (inline JSON string; takes precedence when both are set)
2. `STAC_FASTAPI_ES_MAPPINGS_FILE` environment variable (path to a JSON file)

The configuration should have the same structure as the default ES mappings. The custom mappings are **recursively merged** with the defaults, starting at the root of the mapping object.
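
The precedence between the two variables can be sketched in Python as follows. This is a minimal illustration, not the SFEOS implementation: the function name `load_custom_mappings` and the choice to return an empty dict when neither variable is set are assumptions made for the example.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional


def load_custom_mappings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Hypothetical loader: the inline JSON variable wins over the file path."""
    env = os.environ if env is None else env

    # 1. Inline JSON string takes precedence when it is set and non-empty.
    inline = env.get("STAC_FASTAPI_ES_CUSTOM_MAPPINGS")
    if inline:
        return json.loads(inline)

    # 2. Otherwise fall back to the JSON file, if a path is configured.
    path = env.get("STAC_FASTAPI_ES_MAPPINGS_FILE")
    if path:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)

    # 3. Nothing configured: no custom mappings (assumed behavior).
    return {}
```

Passing `env` explicitly keeps the sketch easy to test without touching the real process environment.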

#### Merge Behavior

The merge follows these rules:

| Scenario | Result |
|----------|--------|
| Key only in defaults | Preserved |
| Key only in custom | Added |
| Key in both, both are dicts | Recursively merged |
| Key in both, values are not both dicts | **Custom overwrites default** |

**Example - Adding new properties (merged):**

```jsonc
// Default has: {"geometry": {"type": "geo_shape"}}
// Custom has: {"geometry": {"ignore_malformed": true}}
// Result: {"geometry": {"type": "geo_shape", "ignore_malformed": true}}
```

**Example - Overriding a value (replaced):**

```jsonc
// Default has: {"properties": {"datetime": {"type": "date_nanos"}}}
// Custom has: {"properties": {"datetime": {"type": "date"}}}
// Result: {"properties": {"datetime": {"type": "date"}}}
```
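
The merge rules in the table above can be expressed as a short recursive function. The following is a sketch under the stated rules only; `merge_mappings` is a hypothetical name and the actual SFEOS helper may differ in name and detail.

```python
import copy
from typing import Any, Dict


def merge_mappings(defaults: Dict[str, Any], custom: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with `custom` recursively merged into `defaults`."""
    merged = copy.deepcopy(defaults)  # keys only in defaults are preserved
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Key in both and both values are dicts: merge recursively.
            merged[key] = merge_mappings(merged[key], value)
        else:
            # Key only in custom, or values are not both dicts: custom wins.
            merged[key] = copy.deepcopy(value)
    return merged


defaults = {"geometry": {"type": "geo_shape"}}
custom = {"geometry": {"ignore_malformed": True}}
print(merge_mappings(defaults, custom)["geometry"])
# {'type': 'geo_shape', 'ignore_malformed': True}
```

The function copies its inputs rather than mutating them, so the default mappings stay unchanged between calls.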

#### JSON Structure

The custom JSON should mirror the structure of the default mappings. For STAC item properties, the path is `properties.properties.properties`:

```
{
  "numeric_detection": false,
  "dynamic_templates": [...],
  "properties": {                    # Top-level ES mapping properties
    "id": {...},
    "geometry": {...},
    "properties": {                  # STAC item "properties" field
      "type": "object",
      "properties": {                # Nested properties within STAC properties
        "datetime": {...},
        "sar:frequency_band": {...}  # <-- Custom extension fields go here
      }
    }
  }
}
```
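
Because the triple nesting is easy to get wrong by hand, the JSON can also be generated. The sketch below wraps a flat dictionary of STAC property mappings in the `properties.properties.properties` path and serializes it; `wrap_item_properties` is a hypothetical helper written for this example, not part of SFEOS.

```python
import json
from typing import Any, Dict


def wrap_item_properties(fields: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Nest flat field mappings under properties.properties.properties."""
    return {"properties": {"properties": {"properties": dict(fields)}}}


payload = json.dumps(
    wrap_item_properties({"sar:frequency_band": {"type": "keyword"}})
)
print(payload)
# {"properties": {"properties": {"properties": {"sar:frequency_band": {"type": "keyword"}}}}}
```

The printed string can be used as the value of `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` or written to the file referenced by `STAC_FASTAPI_ES_MAPPINGS_FILE`.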

**Example - Adding SAR Extension Fields:**

```bash
export STAC_FASTAPI_ES_CUSTOM_MAPPINGS='{
  "properties": {
    "properties": {
      "properties": {
        "sar:frequency_band": {"type": "keyword"},
        "sar:center_frequency": {"type": "float"},
        "sar:polarizations": {"type": "keyword"},
        "sar:product_type": {"type": "keyword"}
      }
    }
  }
}'
```

**Example - Adding Cube Extension Fields:**

```bash
export STAC_FASTAPI_ES_CUSTOM_MAPPINGS='{
  "properties": {
    "properties": {
      "properties": {
        "cube:dimensions": {"type": "object", "enabled": false},
        "cube:variables": {"type": "object", "enabled": false}
      }
    }
  }
}'
```

**Example - Adding geometry options:**

```bash
export STAC_FASTAPI_ES_CUSTOM_MAPPINGS='{
  "properties": {
    "geometry": {"ignore_malformed": true}
  }
}'
```

**Example - Using a mappings file (recommended for complex configurations):**

Instead of passing large JSON blobs via environment variables, you can use a file:

```bash
# Create a mappings file
cat > custom-mappings.json <<EOF
{
  "properties": {
    "properties": {
      "properties": {
        "sar:frequency_band": {"type": "keyword"},
        "sar:center_frequency": {"type": "float"},
        "sar:polarizations": {"type": "keyword"},
        "sar:product_type": {"type": "keyword"},
        "eo:cloud_cover": {"type": "float"},
        "platform": {"type": "keyword"}
      }
    }
  }
}
EOF

# Reference the file
export STAC_FASTAPI_ES_MAPPINGS_FILE=/path/to/custom-mappings.json
```

In Docker Compose, you can mount the file:

```yaml
services:
  app-elasticsearch:
    volumes:
      - ./custom-mappings.json:/app/mappings.json:ro
    environment:
      - STAC_FASTAPI_ES_MAPPINGS_FILE=/app/mappings.json
```

In Kubernetes, use a ConfigMap:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: stac-mappings
data:
  mappings.json: |
    {
      "properties": {
        "properties": {
          "properties": {
            "platform": {"type": "keyword"},
            "eo:cloud_cover": {"type": "float"}
          }
        }
      }
    }
---
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: stac-fastapi
          env:
            - name: STAC_FASTAPI_ES_MAPPINGS_FILE
              value: /etc/stac/mappings.json
          volumeMounts:
            - name: mappings
              mountPath: /etc/stac
      volumes:
        - name: mappings
          configMap:
            name: stac-mappings
```

> [!TIP]
> If both `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` and `STAC_FASTAPI_ES_MAPPINGS_FILE` are set, `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` takes precedence, allowing quick overrides during testing or troubleshooting.

### Dynamic Mapping Control (`STAC_FASTAPI_ES_DYNAMIC_MAPPING`)

This variable controls how Elasticsearch/OpenSearch handles fields not defined in the mapping:

| Value | Behavior |
|-------|----------|
| `true` (default) | New fields are automatically added to the mapping. Maintains backward compatibility. |
| `false` | New fields are ignored and not indexed. Documents can still contain these fields, but they won't be searchable. |
| `strict` | Documents with unmapped fields are rejected. |
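
For illustration, the three accepted values map onto the `dynamic` mapping parameter roughly as sketched below. The helper name `parse_dynamic_mapping` is hypothetical, and raising on an unrecognized value is an assumption of this example; how SFEOS itself treats invalid input is not specified here.

```python
from typing import Union


def parse_dynamic_mapping(raw: str = "true") -> Union[bool, str]:
    """Translate the environment value into an ES/OpenSearch `dynamic` setting."""
    value = raw.strip().lower()
    if value == "true":
        return True       # new fields are added to the mapping automatically
    if value == "false":
        return False      # new fields are kept in the document but not indexed
    if value == "strict":
        return "strict"   # documents with unmapped fields are rejected
    # Assumed behavior for this sketch: fail loudly on anything else.
    raise ValueError(f"Unsupported dynamic mapping value: {raw!r}")
```

The returned value would sit at the top level of the index mapping, for example `{"dynamic": "strict", "properties": {...}}`.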

### Combining Both Variables for Performance Optimization

For large datasets with extensive metadata that isn't queried, you can disable dynamic mapping and define only the fields you need:

```bash
# Disable dynamic mapping
export STAC_FASTAPI_ES_DYNAMIC_MAPPING=false

# Define only queryable fields
export STAC_FASTAPI_ES_CUSTOM_MAPPINGS='{
  "properties": {
    "properties": {
      "properties": {
        "platform": {"type": "keyword"},
        "eo:cloud_cover": {"type": "float"},
        "view:sun_elevation": {"type": "float"}
      }
    }
  }
}'
```

This prevents Elasticsearch/OpenSearch from creating mappings for unused metadata fields, reducing index size and improving ingestion performance.
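
Before disabling dynamic mapping, it can help to check which properties of a sample item would end up unsearchable. The sketch below is a hypothetical helper, not part of SFEOS. It compares an item's `properties` keys with the fields explicitly mapped under `properties.properties.properties`, and it should be run against the fully merged mapping, since the defaults already cover some fields.

```python
from typing import Any, Dict, List


def unmapped_properties(item: Dict[str, Any], mappings: Dict[str, Any]) -> List[str]:
    """List item property names that have no explicit mapping."""
    mapped = (
        mappings.get("properties", {})
        .get("properties", {})
        .get("properties", {})
    )
    return sorted(key for key in item.get("properties", {}) if key not in mapped)


mappings = {"properties": {"properties": {"properties": {
    "platform": {"type": "keyword"},
    "eo:cloud_cover": {"type": "float"},
}}}}
item = {"properties": {"platform": "sentinel-2a", "eo:cloud_cover": 12.5, "custom:note": "x"}}
print(unmapped_properties(item, mappings))
# ['custom:note']
```

With `STAC_FASTAPI_ES_DYNAMIC_MAPPING=false`, the listed fields would still be stored in the document but would not be searchable; with `strict`, the document would be rejected.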

> [!NOTE]
> These environment variables apply to both Elasticsearch and OpenSearch backends. Changes only affect newly created indices. For existing indices, you'll need to reindex using [SFEOS-tools](https://github.com/Healy-Hyperspatial/sfeos-tools).

> [!WARNING]
> Use caution when overriding core fields like `geometry`, `datetime`, or `id`. Incorrect types may cause search failures or data loss.

## Managing Elasticsearch Indices

### Snapshots