265 changes: 181 additions & 84 deletions docs/features/rag/document-extraction/docling.md
@@ -19,43 +19,103 @@ This documentation provides a step-by-step guide to integrating Docling with Ope
- Docker installed on your system
- Docker network set up for Open WebUI

# Integration Steps
## Integration Steps

### Step 1: Run the Docker Command for Docling-Serve
### Step 1: Run Docling-Serve Container

**Basic CPU deployment:**

```bash
docker run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve
docker run -p 5001:5001 \
-e DOCLING_SERVE_ENABLE_UI=true \
quay.io/docling-project/docling-serve
```

*With GPU support:
**GPU deployment (NVIDIA CUDA):**

```bash
docker run --gpus all -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve-cu124
docker run --gpus all -p 5001:5001 \
-e DOCLING_SERVE_ENABLE_UI=true \
quay.io/docling-project/docling-serve-cu128
```

**Recommended production deployment with Docker Compose:**

```yaml
version: "3.8"
services:
  docling-serve:
    image: quay.io/docling-project/docling-serve-cu128:latest
    container_name: docling-serve
    ports:
      - "5001:5001"
    environment:
      # Enable the web UI for testing
      - DOCLING_SERVE_ENABLE_UI=true
      # CRITICAL: Required for picture description with external LLM APIs
      - DOCLING_SERVE_ENABLE_REMOTE_SERVICES=true
      # Maximum wait time for sync requests (seconds) - increase for large documents
      - DOCLING_SERVE_MAX_SYNC_WAIT=600
      # Number of local engine workers
      - DOCLING_SERVE_ENG_LOC_NUM_WORKERS=2
      # CPU thread configuration
      - OMP_NUM_THREADS=4
      - MKL_NUM_THREADS=4
      # IMPORTANT: Keep at 1 to avoid "Task Not Found" errors
      - UVICORN_WORKERS=1
    restart: unless-stopped
    # For GPU support with NVIDIA:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

### Step 2: Configure Open WebUI to use Docling
:::danger Important: UVICORN_WORKERS Setting

Setting `UVICORN_WORKERS` above 1 with the default `LocalOrchestrator` causes **"Task Not Found (404)"** errors. Each worker keeps its own in-memory task store, so a task created by one worker is invisible to the others.

**Always use `UVICORN_WORKERS=1`** unless you have configured a shared state mechanism like Redis.

:::
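
The failure mode described in the warning can be illustrated with a small simulation. This is only a sketch, not docling-serve code: the `Worker` class, the `simulate` function, and the round-robin routing are hypothetical stand-ins for separate Uvicorn worker processes that each hold their own task dictionary.

```python
import itertools
import uuid


class Worker:
    """Hypothetical stand-in for one worker process with a private task store."""

    def __init__(self, name):
        self.name = name
        self.tasks = {}  # in-memory only, never shared with other workers

    def submit(self, document):
        # Store the (pretend) conversion result under a fresh task ID.
        task_id = str(uuid.uuid4())
        self.tasks[task_id] = f"converted:{document}"
        return task_id

    def result(self, task_id):
        # A worker can only answer for tasks it created itself.
        if task_id not in self.tasks:
            return 404, "Task Not Found"
        return 200, self.tasks[task_id]


def simulate(num_workers):
    """Submit a task, then poll for it on whichever worker gets the next request."""
    workers = [Worker(f"worker-{i}") for i in range(num_workers)]
    # Assumption: requests are spread round-robin; real routing may differ.
    routing = itertools.cycle(workers)
    task_id = next(routing).submit("report.pdf")
    return next(routing).result(task_id)


if __name__ == "__main__":
    print(simulate(1))
    print(simulate(2))
```

With one worker the poll reaches the same store and returns status 200. With two workers the poll lands on a worker that never saw the task and returns 404, which is the symptom the warning describes.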

### Step 2: Configure Open WebUI

1. Log in to your Open WebUI instance
2. Navigate to **Admin Panel** → **Settings** → **Documents**
3. Change the **Default** content extraction engine dropdown to **Docling**
4. Set the extraction engine URL to `http://host.docker.internal:5001` (when Open WebUI runs in Docker) or `http://localhost:5001` (when it runs natively on the host)
5. Save the changes

### Step 3: Configure Picture Description (Optional)

To enable AI-powered image description within documents:

1. In the **Documents** tab, activate **Describe Pictures in Documents**
2. Choose a description mode: `local` or `API`
   - **local**: Vision model runs within the Docling container itself
   - **API**: Docling calls an external service (e.g., Ollama, OpenAI-compatible endpoint)

- Log in to your Open WebUI instance.
- Navigate to the `Admin Panel` settings menu.
- Click on `Settings`.
- Click on the `Documents` tab.
- Change the `Default` content extraction engine dropdown to `Docling`.
- Update the context extraction engine URL to `http://host.docker.internal:5001`.
- Save the changes.
:::danger Required for API Mode

### (optional) Step 3: Configure Docling's picture description features
When using `API` mode (calling external services like Ollama), you **MUST** set the following environment variable on docling-serve:

- on the `Documents` tab:
- Activate `Describe Pictures in Documents` button.
- Below, choose a description mode: `local` or `API`
- `local`: vision model will run in the same context as Docling itself
- `API`: Docling will make a call to an external service/container (i.e. Ollama)
- fill in an **object value** as described at https://github.com/docling-project/docling-serve/blob/main/docs/usage.md#picture-description-options
- Save the changes.
```bash
DOCLING_SERVE_ENABLE_REMOTE_SERVICES=true
```

Without this, Docling will reject requests to external services with an `OperationNotAllowed` error.

:::

#### JSON Configuration Examples

#### Make sure the object value is a valid JSON! Working examples below
Make sure your configuration is **valid JSON**!

![image](https://github.com/user-attachments/assets/f6524949-fb47-4686-9c81-6ab8fdda6db1)
**Local Model Configuration:**

```json
{
@@ -68,80 +128,117 @@ docker run --gpus all -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docli
}
```

![image](https://github.com/user-attachments/assets/982e0081-8c11-457c-b886-af91569e7fef)
**API Configuration (Ollama):**

```json
{
"url": "http://localhost:11434/v1/chat/completions",
"url": "http://host.docker.internal:11434/v1/chat/completions",
"params": {
"model": "qwen2.5vl:7b-q4_K_M"
"model": "llava:7b"
},
"timeout": 60,
"prompt": "Describe this image in great details. "
"prompt": "Describe this image in great detail."
}
```
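
Because an invalid object value is easy to produce (a missing comma or a stray trailing comma is enough), it can help to check the text before pasting it into the settings field. The following is a minimal sketch using only Python's standard `json` module; the helper name `check_json` is hypothetical and not part of Open WebUI or Docling.

```python
import json


def check_json(text):
    """Return (True, parsed_value) for valid JSON, else (False, error_location)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno/colno point at the place where parsing failed.
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


if __name__ == "__main__":
    candidate = """
    {
      "url": "http://host.docker.internal:11434/v1/chat/completions",
      "params": {
        "model": "llava:7b"
      },
      "timeout": 60,
      "prompt": "Describe this image in great detail."
    }
    """
    ok, detail = check_json(candidate)
    print("valid" if ok else f"invalid: {detail}")
```

The check only confirms that the text parses as JSON. It does not confirm that the keys are ones Docling accepts.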

## Verifying Docling in Docker
## Docling-Serve Environment Variables Reference

| Variable | Default | Description |
|----------|---------|-------------|
| `DOCLING_SERVE_ENABLE_UI` | `false` | Enable the web UI at `/ui` endpoint |
| `DOCLING_SERVE_ENABLE_REMOTE_SERVICES` | `false` | **Required** for API-based picture description |
| `DOCLING_SERVE_MAX_SYNC_WAIT` | `120` | Max seconds to wait for synchronous requests |
| `DOCLING_SERVE_ENG_LOC_NUM_WORKERS` | `1` | Number of local engine workers |
| `DOCLING_SERVE_ARTIFACTS_PATH` | `/app/data` | Path to store model artifacts |
| `UVICORN_WORKERS` | `1` | Number of Uvicorn workers (**keep at 1!**) |
| `OMP_NUM_THREADS` | `4` | OpenMP thread count for CPU processing |
| `MKL_NUM_THREADS` | `4` | Intel MKL thread count |

## Docling Parameters Reference (Open WebUI)

Configure via `DOCLING_PARAMS` JSON in **Admin Settings > Documents** or via environment variable.

| Parameter | Type | Description | Allowed Values |
|-----------|------|-------------|----------------|
| `pdf_backend` | `string` | PDF parsing engine | `dlparse_v1`, `dlparse_v2`, `dlparse_v4`, `pypdfium2` |
| `table_mode` | `string` | Table extraction quality | `fast`, `accurate` |
| `ocr_engine` | `string` | OCR library | `tesseract`, `easyocr`, `ocrmac`, `rapidocr` |
| `do_ocr` | `bool` | Enable OCR | `true`, `false` |
| `force_ocr` | `bool` | Force OCR on digital PDFs | `true`, `false` |
| `pipeline` | `string` | Processing complexity | `standard`, `fast` |
| `ocr_lang` | `list[string]` | OCR languages | See note below |

:::tip Language Codes
- **Tesseract**: 3-letter ISO 639-2 (e.g., `eng`, `deu`, `fra`)
- **EasyOCR**: 2-letter ISO 639-1 (e.g., `en`, `de`, `fr`)
:::

To verify that Docling is working correctly in a Docker environment, you can follow these steps:
**Example Configuration:**

### 1. Start the Docling Docker Container
```json
{
  "do_ocr": true,
  "pdf_backend": "dlparse_v4",
  "table_mode": "accurate",
  "ocr_engine": "tesseract",
  "ocr_lang": ["eng"]
}
```
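
A configuration like this can be sanity-checked against the parameter table before saving it. The sketch below is an assumption-laden helper, not part of Open WebUI: the function name `validate_docling_params` is hypothetical, the allowed values are copied from the table in this document, and the language-code rule only covers the two engines the tip mentions (Tesseract and EasyOCR).

```python
import json

# Allowed values as listed in the parameter table of this document.
ALLOWED = {
    "pdf_backend": {"dlparse_v1", "dlparse_v2", "dlparse_v4", "pypdfium2"},
    "table_mode": {"fast", "accurate"},
    "ocr_engine": {"tesseract", "easyocr", "ocrmac", "rapidocr"},
    "pipeline": {"standard", "fast"},
}
BOOLEAN_KEYS = {"do_ocr", "force_ocr"}
# Expected language-code length per engine, taken from the language tip.
LANG_CODE_LENGTH = {"tesseract": 3, "easyocr": 2}


def validate_docling_params(raw):
    """Return a list of human-readable problems; an empty list means no issue found."""
    params = json.loads(raw)
    if not isinstance(params, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for key, value in params.items():
        if key in ALLOWED:
            if not isinstance(value, str) or value not in ALLOWED[key]:
                problems.append(f"{key}: {value!r} is not one of {sorted(ALLOWED[key])}")
        elif key in BOOLEAN_KEYS:
            if not isinstance(value, bool):
                problems.append(f"{key}: expected true or false, got {value!r}")
        elif key == "ocr_lang":
            if not isinstance(value, list) or not all(isinstance(c, str) for c in value):
                problems.append("ocr_lang: expected a list of strings")
        # Keys not in the table are left unchecked; the table may not be exhaustive.

    engine = params.get("ocr_engine")
    expected = LANG_CODE_LENGTH.get(engine) if isinstance(engine, str) else None
    langs = params.get("ocr_lang")
    if expected is not None and isinstance(langs, list):
        for code in langs:
            if isinstance(code, str) and len(code) != expected:
                problems.append(f"ocr_lang: {code!r} should be a {expected}-letter code for {engine}")
    return problems


if __name__ == "__main__":
    print(validate_docling_params('{"ocr_engine": "easyocr", "ocr_lang": ["eng"]}'))
```

The helper reports mismatches such as a 3-letter code paired with EasyOCR. Passing it says nothing about whether the installed Docling version supports a given backend or engine.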

First, ensure that the Docling Docker container is running. You can start it using the following command:
## Verifying the Integration

```bash
docker run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve
```
1. Access the Docling UI at `http://127.0.0.1:5001/ui`
2. Upload a test document and verify it returns markdown output
3. In Open WebUI, upload a file to a knowledge base and confirm processing completes

This command starts the Docling container and maps port 5001 from the container to port 5001 on your local machine.

### 2. Verify the Server is Running

- Go to `http://127.0.0.1:5001/ui/`
- The URL should lead to a UI to use Docling

### 3. Verify the Integration

- You can try uploading some files via the UI and it should return output in MD format or your desired format

## Docling Parameters Reference

When using Open WebUI with Docling version 2.0.0+, you can configure advanced processing options via the `DOCLING_PARAMS` JSON object in the **Admin Settings > Documents** or via the environment variable.

### Common Parameters

| Parameter | Type | Description | Allowed Values / Examples |
| :--- | :--- | :--- | :--- |
| `pdf_backend` | `string` | The PDF parsing engine to use. | `dlparse_v1`, `dlparse_v2`, `dlparse_v4`, `pypdfium2` |
| `table_mode` | `string` | Quality/speed tradeoff for tables. | `fast`, `accurate` |
| `ocr_engine` | `string` | The OCR library for scanned documents. | `tesseract`, `easyocr`, `ocrmac`, `rapidocr`, `tesserocr` |
| `do_ocr` | `bool` | Whether to perform OCR. | `true`, `false` |
| `force_ocr` | `bool` | Force OCR even on digital PDFs. | `true`, `false` |
| `pipeline` | `string` | Processing pipeline complexity. | `standard`, `fast` |
| `ocr_lang` | `list[string]` | Languages for OCR. | `["eng"]` (3-letter) for Tesseract; `["en"]` (2-letter) for EasyOCR |

:::tip
**Language Codes**: The format of `ocr_lang` depends on the engine:
- **Tesseract / Tesserocr**: Uses 3-letter ISO 639-2 codes (e.g., `eng`, `deu`, `fra`, `spa`).
- **EasyOCR / RapidOCR**: Typically use 2-letter ISO 639-1 codes (e.g., `en`, `de`, `fr`, `es`).
:::
## Troubleshooting

:::tip
**dlparse** vs **dbparse**: Note that the backend names use **`dlparse`** (Deep Learning Parse), not `dbparse`. For modern Docling (v2+), `dlparse_v4` is generally recommended for the best balance of features.
:::

### Example Configuration

```json
{
  "do_ocr": true,
  "pdf_backend": "dlparse_v4",
  "table_mode": "accurate",
  "ocr_engine": "tesseract",
  "ocr_lang": ["eng"]
}
```

### Conclusion
### "Task result not found. Please wait for a completion status."

**Cause**: Multiple Uvicorn workers with in-memory task storage.

**Solution**: Set `UVICORN_WORKERS=1` in your docling-serve configuration.

### "Connections to remote services is only allowed when set explicitly"

**Cause**: Picture description API mode requires explicit opt-in.

**Solution**: Add `DOCLING_SERVE_ENABLE_REMOTE_SERVICES=true` to your docling-serve environment.

### 404 Not Found on `/v1alpha/convert/file`

**Cause**: An outdated version of docling-serve or Open WebUI.

**Solution**:
- Update Open WebUI to the latest version (uses `/v1/convert/file`)
- Update docling-serve to v1.0+ (uses `/v1` API)

### Timeout errors on large documents

**Cause**: `DOCLING_SERVE_MAX_SYNC_WAIT` is too low for document processing time.

**Solution**: Increase `DOCLING_SERVE_MAX_SYNC_WAIT` (e.g., `600` for 10 minutes).

### OCR not working or incorrect language detection

**Cause**: Wrong `ocr_lang` format for the selected OCR engine.

**Solution**:
- Tesseract uses 3-letter codes: `["eng", "deu"]`
- EasyOCR uses 2-letter codes: `["en", "de"]`

### "Error calling Docling" with no specific details

**Steps to diagnose:**
1. Check docling-serve logs: `docker logs docling-serve`
2. Test Docling directly via the UI at `http://localhost:5001/ui`
3. Verify network connectivity between Open WebUI and docling-serve containers
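
For the connectivity step, a plain TCP check is often enough to tell "nothing is listening" apart from an application-level error. The sketch below uses only Python's standard `socket` module; the helper name `port_open` is hypothetical, and the host and port to test (for example `localhost` and `5001` from the host machine) depend on your setup.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host name and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `port_open("localhost", 5001)` on the host, or running the same check from inside the Open WebUI container against the host name configured as the extraction engine URL, shows whether the port is reachable at all. A `True` result only means something accepted the connection, not that docling-serve is healthy.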

## Conclusion

Integrating Docling with Open WebUI adds format-agnostic document parsing and content extraction. Key points to remember:

Integrating Docling with Open WebUI is a simple and effective way to enhance document processing and content extraction capabilities. By following the steps in this guide, you can set up Docling as the default extraction engine and verify it’s working smoothly in a Docker environment. Once configured, Docling enables powerful, format-agnostic document parsing to support more advanced AI features in Open WebUI.
- **Always set `UVICORN_WORKERS=1`** to avoid task routing issues
- **Enable `DOCLING_SERVE_ENABLE_REMOTE_SERVICES=true`** when using API-based picture description
- **Increase `DOCLING_SERVE_MAX_SYNC_WAIT`** for large documents
- **Validate JSON syntax** in all configuration fields