Self-correcting AI agent for natural language to SQL, built with HuggingFace smolagents and the ReAct framework. It transforms natural language questions into accurate SQL queries with multi-step reasoning, self-correction, and transparent decision-making.
Traditional Text2SQL pipelines generate SQL in one shot with no validation. If the query is wrong, you get incorrect results silently.
Our agent approach uses the ReAct framework to:
- Reason about the query before generating SQL
- Validate results to ensure correctness
- Self-correct when queries are wrong
- Achieve 50-70% higher accuracy on complex queries
User: "Which waiter got the most tips?"
Agent Step 1: Inspect schema, find tips table
Agent Step 2: Generate SQL with GROUP BY and SUM
Agent Step 3: Validate results make sense
Agent Step 4: Return correct answer
Prerequisites:

- Python 3.10+
- HuggingFace account and API token

```bash
git clone https://github.com/Sakeeb91/text2sql-agent.git
cd text2sql-agent
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your settings
```

Minimal `.env`:

```
HUGGINGFACE_TOKEN=hf_xxxxx
DATABASE_URL=sqlite:///./data/app.db
TEXT2SQL_MODEL=Qwen/Qwen2.5-Coder-7B-Instruct
AGENT_MODEL_BACKEND=hf_inference
AGENT_INFERENCE_PROVIDER=
```

Run the API:

```bash
uvicorn app.main:app --reload
```

Visit http://localhost:8000/docs for interactive API documentation.
```python
import requests

response = requests.post(
    "http://localhost:8000/api/v1/query",
    json={
        "query": "Show me all active users with their order totals",
        "database_id": "default",
        "execute": True,
        "show_reasoning": True,
    },
)
result = response.json()
print(f"SQL: {result['sql']}")
print(f"Confidence: {result['confidence']}")
```

Or with curl:

```bash
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Show me top 10 customers by revenue",
    "database_id": "default",
    "execute": true
  }'
```

Example response:

```json
{
  "sql": "SELECT name, SUM(total) as revenue FROM customers JOIN orders ON customers.id = orders.customer_id GROUP BY customers.id ORDER BY revenue DESC LIMIT 10",
  "confidence": 0.92,
  "results": [...],
  "reasoning_trace": [
    {"step": 1, "thought": "Need to join customers and orders"},
    {"step": 2, "thought": "Aggregate by customer, sort by total"}
  ]
}
```

Enable multi-db routing and register additional databases per tenant:
- Set environment flags:

  ```
  MULTIDB_ENABLED=true
  MULTIDB_REQUIRE_CONNECTION_TEST=true
  ```

- Register a database (schema registration primes the cache):

  ```bash
  curl -X POST http://localhost:8000/api/v1/schema/register \
    -H "Content-Type: application/json" \
    -d '{
      "database_id": "analytics",
      "connection_string": "postgresql://user:pass@host:5432/analytics",
      "dialect": "postgresql"
    }'
  ```

- Use the `database_id` in queries:

  ```bash
  curl -X POST http://localhost:8000/api/v1/query \
    -H "Content-Type: application/json" \
    -d '{
      "query": "Show daily signups",
      "database_id": "analytics",
      "execute": false
    }'
  ```

Single-tenant fallback: set `MULTIDB_ENABLED=false` to route all requests to the configured `DATABASE_URL`.
```
+----------------------------------------------------------+
|                     FastAPI REST API                      |
+----------------------------+-----------------------------+
                             |
             +---------------v----------------+
             |       Agent Orchestrator       |
             |    (CodeAgent + ReAct Loop)    |
             +---------------+----------------+
                             |
             +---------------+----------------+
             |                                |
     +-------v--------+               +-------v--------+
     |   SQL Engine   |               |   Validator    |
     | - Execute SQL  |               | - Check results|
     | - Get schema   |               | - Suggest fix  |
     +-------+--------+               +-------+--------+
             |                                |
             +---------------+----------------+
                             |
             +---------------v----------------+
             |          Text2SQL LLM          |
             |  (Qwen, Llama, Mistral, etc.)  |
             +--------------------------------+
```
- Multi-step reasoning with transparent thought process
- Self-correction when queries are incorrect
- Schema-aware SQL generation
- Output validation before returning
- Semantic validation feedback for joins, aggregations, and rankings
- SQL injection prevention
- Rate limiting and authentication
- Prometheus metrics, OpenTelemetry tracing, and structured logging
- Multi-database support (PostgreSQL, MySQL, SQLite)
- Query caching for instant repeated responses
- Schema, prompt, and inference caching with TTL controls (Redis or in-memory)
- Model quantization for memory efficiency
- Async processing for high throughput
- Streaming endpoint for long-running queries with batched results
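As a sketch of the TTL-based caching idea behind the caching bullets above, a minimal in-memory variant could look like this. It is illustrative only: the project's cache layer also supports Redis, and the class and method names here are not its actual API.

```python
# Toy in-memory cache with per-entry time-to-live.
import time


class TTLCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        item = self._store.get(key)
        if item is None:
            return None
        expires, value = item
        if time.monotonic() > expires:
            # Entry outlived its TTL: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)


cache = TTLCache(ttl_seconds=300)
cache.set("Show daily signups", "SELECT date(created_at), COUNT(*) ...")
print(cache.get("Show daily signups"))
```

Keying the cache on the natural-language query (as above) makes repeated questions instant; keying on the generated SQL instead would also dedupe differently-phrased questions that compile to the same query.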
| Endpoint | Description |
|---|---|
| `POST /api/v1/query` | Generate SQL from natural language |
| `POST /api/v1/query/batch` | Batch SQL generation |
| `GET /api/v1/schema/{db_id}` | Get database schema |
| `POST /api/v1/databases` | Register a database |
| `GET /api/v1/health` | Health check |
See docs/API.md for complete API reference.
- All `/api/v1` endpoints (except `/api/v1/health`) require authentication.
- Use `Authorization: Bearer <jwt>` or `X-API-Key: <key>` headers.
- Mutation/management endpoints require scopes in `MUTATION_SCOPES` (default: `write,admin`).
- Configure auth, scopes, and rate limiting in docs/CONFIGURATION.md.
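For example, an API-key request can be assembled with `requests` as follows; the key value is a placeholder, and a JWT would go in an `Authorization: Bearer <jwt>` header instead. The request is only prepared here, not sent.

```python
import requests

# Build (but do not send) an authenticated request;
# "YOUR_KEY" is a placeholder for a real API key.
req = requests.Request(
    "GET",
    "http://localhost:8000/api/v1/schema/default",
    headers={"X-API-Key": "YOUR_KEY"},
).prepare()

print(req.method, req.url)       # GET http://localhost:8000/api/v1/schema/default
print(req.headers["X-API-Key"])  # YOUR_KEY
```

Once the server is running, `requests.Session().send(req)` dispatches the prepared request.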
```bash
# Build
docker build -t text2sql-agent .

# Run
docker run -p 8000:8000 --env-file .env text2sql-agent

# Or use Docker Compose
docker-compose up -d
```

| Document | Description |
|---|---|
| User Guide | How to ask questions and get accurate results |
| Workflow Guide | Developer integration patterns and workflows |
| API Reference | Complete endpoint documentation |
| Configuration | Environment variables and options |
| Deployment | Docker, Kubernetes, production |
| Development | Testing, contributing, code quality |
| Troubleshooting | Common issues and solutions |
| Architecture | System design and patterns |
| Implementation Plan | Detailed technical design |
| Metric | Traditional Pipeline | Agent Approach |
|---|---|---|
| Simple Queries | 85% accuracy | 92% accuracy |
| Complex Queries | 60% accuracy | 90% accuracy |
| Silent Failures | 15% | <1% |
- Fork the repository
- Create a feature branch
- Run tests: `pytest`
- Run linting: `black . && ruff check .`
- Submit a Pull Request
See docs/DEVELOPMENT.md for details.
MIT License - see LICENSE for details.