☁️ Cross-Cloud Event-Driven Storage Replicator

This project is a Python service that operates as an event-driven replication worker. It exposes an HTTP endpoint to receive a notification about a new file in an AWS S3 bucket and replicates it to a Google Cloud Storage (GCS) bucket, ensuring the process is both robust and idempotent.

This project was developed as part of a take-home assignment and showcases best practices in API design, cloud integration, and local development workflows.

This README outlines the technical approach, design choices, and provides step-by-step instructions for running and testing the service.

Key Features and Design Approach

The service is built on modern Python best practices and designed with a focus on reliability, developer experience, and production readiness. The architecture is not just reactive: it is self-validating and built to fail fast, so it never runs in a misconfigured or broken state.

  • Event-Driven Endpoint: A simple HTTP POST endpoint at /v1/replicate to trigger the replication process based on external events.

  • Flexible Configuration: A dependency injection pattern allows the service to seamlessly switch between local emulators (MinIO for S3, fake-gcs-server for GCS) and real cloud environments. This is controlled by environment variables, requiring no code changes and enabling safe, isolated testing.
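
As an illustration, the emulator-versus-cloud switch can be wired roughly like this (names here are illustrative; the actual logic lives in app/dependencies.py and app/config.py):

import os

import boto3
from google.cloud import storage


def get_s3_client():
    # If S3_ENDPOINT_URL is set (e.g. http://localhost:9000 for MinIO), boto3 talks
    # to the emulator; if it is unset, boto3 falls back to the real AWS endpoints.
    return boto3.client(
        "s3",
        endpoint_url=os.getenv("S3_ENDPOINT_URL") or None,
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
        region_name=os.getenv("AWS_REGION", "us-east-1"),
    )


def get_gcs_client():
    # google-cloud-storage honours STORAGE_EMULATOR_HOST automatically, so the same
    # code path works against fake-gcs-server and against real GCS.
    if os.getenv("STORAGE_EMULATOR_HOST"):
        return storage.Client.create_anonymous_client()
    return storage.Client()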

Proactive Startup Validation

A key feature of this service is its proactive health check on startup. Before the API becomes available to accept requests, it performs a series of critical validations:

  1. Configuration Loading: It uses Pydantic to strictly load and validate all required environment variables from .env files. If a required variable is missing, the service provides a clean, human-readable error and exits, preventing runtime failures due to missing configuration.
  2. Endpoint Connectivity Test: It actively attempts to connect to the configured S3 and GCS endpoints (localhost emulators or real cloud services).

This "fail-fast" approach ensures that the service only runs when its core dependencies are available and correctly configured, which is a critical practice for building reliable distributed systems.

Advanced Configuration Management

The service utilizes a sophisticated configuration pattern in app/config.py:

  • Layered Environments: It correctly loads from .env.local first, allowing developers to easily override production settings for local testing without modifying shared files.
  • Conditional Validation: It contains business logic to enforce conditional rules, such as requiring GOOGLE_APPLICATION_CREDENTIALS only when not using the local GCS emulator.
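
A condensed sketch of this pattern, assuming Pydantic v2 and pydantic-settings (app/config.py is the source of truth and may differ in detail):

from pydantic import model_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # .env.local is listed last, so its values override .env during local development
    model_config = SettingsConfigDict(env_file=(".env", ".env.local"), extra="ignore")

    aws_access_key_id: str
    aws_secret_access_key: str
    aws_region: str = "us-east-1"
    gcs_bucket_name: str
    s3_endpoint_url: str | None = None
    storage_emulator_host: str | None = None
    google_application_credentials: str | None = None

    @model_validator(mode="after")
    def require_gcp_credentials_outside_emulator(self) -> "Settings":
        # Credentials are only mandatory when talking to real GCS
        if not self.storage_emulator_host and not self.google_application_credentials:
            raise ValueError(
                "GOOGLE_APPLICATION_CREDENTIALS is required when the GCS emulator is not used"
            )
        return self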

High-Throughput and Efficient API Design

Beyond the basic requirements, the API has been enhanced for real-world use cases:

  • Batch Replication Endpoint: A /v1/replicate/batch endpoint was added to allow clients to replicate multiple files in a single API call. This is far more efficient than sending one request per file, reducing network overhead and improving throughput.
  • Memory-Efficient Streaming: Files are streamed directly from S3 to GCS without being saved to the local disk. This ensures a minimal memory footprint, allowing the service to handle large files efficiently.
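
For illustration, a stripped-down version of the streaming path might look like this (the real logic lives in app/services/replicator.py and includes retries and richer error handling):

import boto3
from google.cloud import storage


def replicate_object(s3_bucket: str, s3_key: str, gcs_bucket_name: str) -> str:
    s3 = boto3.client("s3")
    blob = storage.Client().bucket(gcs_bucket_name).blob(s3_key)

    if blob.exists():  # idempotency: already replicated, nothing to do
        return "skipped"

    # get_object returns a streaming body; upload_from_file consumes it in chunks,
    # so the object is never buffered fully in memory or written to local disk.
    body = s3.get_object(Bucket=s3_bucket, Key=s3_key)["Body"]
    blob.upload_from_file(body)
    return "replicated"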

Strategy for Robust Error Handling

Transient network errors are inevitable. The service is designed to be resilient:

  • Automatic Retries: It uses the tenacity library to automatically retry failed network operations (both downloading from S3 and uploading to GCS).
  • Exponential Backoff: The waiting time between retries increases exponentially (e.g., 2s, 4s, 8s). This prevents overwhelming a temporarily struggling downstream service and increases the chance of a successful recovery.
  • Specific Error Handling: The application returns clear HTTP status codes (404 Not Found for missing files, 503 Service Unavailable for connection failures), providing meaningful feedback to the client.
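
The retry policy can be expressed declaratively with tenacity; a sketch with illustrative parameters (the service's actual values may differ):

from botocore.exceptions import EndpointConnectionError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential


@retry(
    retry=retry_if_exception_type(EndpointConnectionError),  # only retry transient connection errors
    wait=wait_exponential(multiplier=2, min=2, max=16),      # roughly 2s, 4s, 8s between attempts
    stop=stop_after_attempt(4),                              # give up after a handful of tries
    reraise=True,                                            # surface the original error to the caller
)
def download_object(s3_client, bucket: str, key: str):
    """Fetch an object from S3, backing off exponentially on connection failures."""
    return s3_client.get_object(Bucket=bucket, Key=key)["Body"]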

Strategy for Guaranteed Idempotency

An idempotent service guarantees that receiving the same request multiple times produces the same result as receiving it once. This is critical to prevent data duplication and wasted processing.

  • Core Implementation: The primary strategy is to check for the file's existence in the destination before uploading. Before any replication attempt, the service makes a blob.exists() API call to GCS. If the file is already there, the operation is considered a success, and the service gracefully skips the download and upload steps.

  • Scaling Considerations and Future Improvements: The current blob.exists() check is simple and effective for this assignment. However, in a high-throughput system processing hundreds of files per second, this approach would introduce a performance bottleneck, as it doubles the number of API calls to GCS for new files (one to check, one to upload).

A more scalable, production-grade solution would involve using an external, high-speed metadata store (like Redis or DynamoDB) to track processed files. The workflow would be:

  1. Receive a request for s3_bucket/s3_key.
  2. Generate a unique key for the file (e.g., s3:source-bucket:path/to/file).
  3. Check for the existence of this key in a Redis set—a millisecond-level operation.
  4. If the key exists, the file has been processed; skip.
  5. If not, perform the replication and add the key to the Redis set upon successful upload.

This improved design decouples the idempotency check from the storage provider, significantly reducing latency and API costs at scale.
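
A hypothetical sketch of that Redis-backed check (not part of the current implementation; it assumes a reachable Redis instance and the redis-py client):

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
PROCESSED_SET = "replicator:processed"


def already_replicated(s3_bucket: str, s3_key: str) -> bool:
    # Millisecond-level membership check against the set of processed files
    return bool(r.sismember(PROCESSED_SET, f"s3:{s3_bucket}:{s3_key}"))


def mark_replicated(s3_bucket: str, s3_key: str) -> None:
    # Called only after the GCS upload has succeeded
    r.sadd(PROCESSED_SET, f"s3:{s3_bucket}:{s3_key}")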


Technology Stack

The technologies were chosen to align with modern, high-performance backend development practices.

  • FastAPI (Web Framework): Chosen for its high performance, automatic data validation with Pydantic, and interactive API documentation.
  • uv (Package Manager): A next-generation, high-speed package manager that significantly accelerates dependency installation.
  • Docker (Emulation): Runs local, containerized emulators (MinIO & fake-gcs-server), enabling a complete and isolated local development loop.
  • Pydantic (Data Validation): Used for both request body validation and robust, type-safe settings management from environment variables.
  • Rich (Console Logging): Provides clean, readable, and beautifully formatted terminal output for a superior developer experience.
  • Tenacity (Retry Logic): A powerful library for adding robust, declarative retry mechanisms to network operations.
  • Pre-commit & Ruff (Code Quality): Enforce a consistent, high-quality codebase with automated linting and formatting on every commit.

API Documentation

The service exposes three primary endpoints. Full interactive documentation is also available at the /docs endpoint when the service is running.

- GET /

Confirms that the API is online and returns the currently active configuration, indicating whether the service is connected to local emulators or live cloud environments.

Example Response (Local Emulator Mode)

When the emulator URLs are set in the environment:

{
  "status": "ok",
  "message": "Welcome to the Cross-Cloud Replicator!",
  "current_config": {
    "s3_target": "http://localhost:9000",
    "gcs_target": "http://localhost:4443"
  }
}

Example Response (Production Mode)

When no emulator URLs are set:

{
  "status": "ok",
  "message": "Welcome to the Cross-Cloud Replicator!",
  "current_config": {
    "s3_target": "REAL AWS",
    "gcs_target": "REAL GCS"
  }
}

- POST /v1/replicate

Triggers the replication of a single file.

  • Request Body:
    {
      "s3_bucket": "source-bucket",
      "s3_key": "path/to/your/file"
    }
  • Success Responses:
    • 200 OK: If the file is successfully replicated or if it already exists in the destination (idempotency).
  • Error Responses:
    • 404 Not Found: If the specified s3_key does not exist in the s3_bucket.
    • 503 Service Unavailable: If the service cannot connect to S3 or GCS after multiple retries.

- POST /v1/replicate/batch

Triggers the replication of multiple files from the same bucket in one call.

  • Request Body:
    {
      "s3_bucket": "source-bucket",
      "s3_keys": [
        "path/to/file1.txt",
        "path/to/image.jpg",
        "data/report.csv"
      ]
    }
  • Success Response (200 OK): Returns a detailed breakdown of the status for each file.
    {
      "status": "completed",
      "results": [
        { "key": "path/to/file1.txt", "status": "success", "message": "Successfully replicated..." },
        { "key": "path/to/image.jpg", "status": "not_found", "error": "Object '...' not found..." }
      ]
    }
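
The batch endpoint can also be exercised from Python; a small example using the requests library (any HTTP client works the same way):

import requests

payload = {
    "s3_bucket": "source-bucket",
    "s3_keys": ["path/to/file1.txt", "path/to/image.jpg", "data/report.csv"],
}
response = requests.post("http://127.0.0.1:8000/v1/replicate/batch", json=payload, timeout=60)
response.raise_for_status()
for result in response.json()["results"]:
    print(result["key"], "->", result["status"])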

Automated Code Quality

To ensure code is clean, consistent, and maintainable, this project uses a two-layered approach to automated code quality checks with ruff.

  1. Pre-commit Hooks: The repository is configured with pre-commit hooks that run automatically on every git commit. These hooks format the code and check for linting errors before the code is even committed. This provides immediate feedback to the developer and maintains a high standard of quality on the local machine.

  2. Continuous Integration (CI): A GitHub Actions workflow is defined in .github/workflows/ci.yml. This workflow runs on every push or pull request to the main branch. It performs a fresh installation of dependencies and runs the linter and formatter checks on a clean runner. This serves as a final validation gate to ensure that all code integrated into the main branch adheres to the project's quality standards.


Project Structure

.
├── .github/                    # GitHub Actions CI/CD Workflows
│   └── workflows/
│       └── ci.yml
├── app/                        # Main application source code
│   ├── services/               # Core business logic
│   │   └── replicator.py
│   ├── config.py               # Pydantic settings management & validation
│   ├── dependencies.py         # Cloud client dependency injection
│   ├── logging_config.py       # Logging configuration
│   └── main.py                 # FastAPI application and endpoints
├── assets/                     # Asset files (e.g., Sequence Diagram)
├── .env.example                # Example environment file
├── .gitignore
├── .pre-commit-config.yaml     # Configuration for local pre-commit hooks
├── pyproject.toml              # Project definition and dependencies (for uv)
├── README.md
└── uv.lock                     # Lock file for reproducible dependencies

Getting Started: Running the Service Locally

This guide provides a complete, step-by-step walkthrough to get the application running on your local machine using Docker-based emulators.

Step 1: Prerequisites

Ensure you have the following tools installed on your system:

  • Python (3.11 or newer)
  • Git for version control
  • Docker Desktop for running the cloud emulators. Make sure Docker is running.

Step 2: Clone and Install Dependencies

First, clone the repository and set up the Python environment using uv.

  1. Clone the repository:

    git clone https://github.com/Anshulgada/cross-cloud-replicator.git
    cd cross-cloud-replicator
  2. Create and activate a virtual environment:

    uv venv .venv
    # On Windows:
    .venv\Scripts\activate
    # On Linux/macOS:
    # source .venv/bin/activate
  3. Install all dependencies (including dev tools):

    uv pip install -e ".[dev]"
  4. Set up the Git hooks (for developers): This installs the pre-commit hooks, which will run automatically to ensure code quality.

    pre-commit install

Step 3: Set Up the Emulated Cloud Environment

This service uses Docker to run local versions of S3 (MinIO) and GCS (fake-gcs-server).

  1. Start the S3 Emulator (MinIO): Open a new terminal and run:

    docker run -d --rm -p 9000:9000 -p 9001:9001 --name minio \
      -e "MINIO_ROOT_USER=minioadmin" \
      -e "MINIO_ROOT_PASSWORD=minioadmin" \
      quay.io/minio/minio server /data --console-address ":9001"
    • The S3 API will be available at http://localhost:9000.
    • You can access the MinIO web console at http://localhost:9001.
  2. Start the GCS Emulator (fake-gcs-server): In another terminal, run:

    docker run -d --rm -p 4443:4443 --name fake-gcs-server fsouza/fake-gcs-server
    • The GCS API will be available at http://localhost:4443.

Step 4: Configure Local Environment Variables

The application uses environment variables for configuration.

  1. Create a local environment file: Copy the .env.example file to a new file named .env.local. This file is ignored by Git and is safe for your local settings.

    # On Windows
    copy .env.example .env.local
    # On Linux/macOS
    cp .env.example .env.local
  2. Verify the content: The default values in .env.example are already configured for the local emulator setup. Your .env.local should look like this:

    AWS_ACCESS_KEY_ID="minioadmin"
    AWS_SECRET_ACCESS_KEY="minioadmin"
    AWS_REGION="us-east-1"
    GCS_BUCKET_NAME="destination-bucket"
    S3_ENDPOINT_URL="http://localhost:9000"
    STORAGE_EMULATOR_HOST="http://localhost:4443"

Step 5: Run the Application

Now, with the environment and dependencies ready, you can start the API service.

  • Start the FastAPI server:
    uvicorn app.main:app --reload
  • The API is now running at http://127.0.0.1:8000.
  • The interactive API documentation is available at http://127.0.0.1:8000/docs.

Step 6: Test the Service

Finally, let's send a request to confirm everything is working end-to-end.

  1. Create test data:

    • Navigate to the MinIO console at http://localhost:9001.
    • Log in with minioadmin / minioadmin.
    • Create a new bucket named source-bucket.
    • Inside source-bucket, upload a small test file (e.g., sample.txt).
  2. Send a replication request: Use curl or an API client like Postman to send a POST request to the service.

    curl -X POST "http://127.0.0.1:8000/v1/replicate" \
         -H "Content-Type: application/json" \
         -d '{"s3_bucket": "source-bucket", "s3_key": "sample.txt"}'
  3. Verify the result:

    • You should receive a 200 OK success response.
    • Idempotency Check: Send the exact same request again. You should receive another 200 OK response with a message indicating the file already exists and was skipped. This confirms the idempotency logic is working.

Sequence Diagram

This diagram illustrates the flow for a single replication request, including the idempotency check.

(Sequence diagram image: see the assets/ directory.)


Running in Production

To run the service against real AWS and GCP environments:

  1. Create a .env file from the .env.example template.
  2. Fill in your actual AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, and GCS_BUCKET_NAME.
  3. Ensure you have a gcp-credentials.json file for your service account and that the GOOGLE_APPLICATION_CREDENTIALS variable in the .env file points to it.
  4. Make sure the emulator endpoint URLs (S3_ENDPOINT_URL, STORAGE_EMULATOR_HOST) are not set in the .env file. The application will automatically detect their absence and connect to the real cloud services.
