Decide on the most appropriate data orchestration tool #114

@martyngigg

Our ingestion layer currently runs as a set of cron jobs on a single machine. This worked at the start but is becoming difficult to manage, and as the number of use cases grows this approach does not offer ease of use, scalability or a good user experience.

MVP Requirements:

  • Tasks are defined as Python code/packages
  • Supports cron-like timed schedules
  • Provides a REST API to trigger jobs
  • Allows one-off jobs to be triggered manually on request
  • Provides a web-based UI
  • Supports container-based deployment
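
To make the first few requirements concrete, here is a minimal, tool-agnostic sketch of what "a task as Python code, with an optional cron-like schedule and a manual trigger" could look like. It is not the API of any of the candidate tools: `Job`, `JOBS`, `trigger` and `ingest_example` are hypothetical names used only for illustration, and the cron expression is an example value.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Job:
    """A task plus an optional cron-like schedule (hypothetical structure)."""

    name: str
    task: Callable[[], str]
    # Five-field cron expression; None means the job is manual-only.
    schedule: Optional[str] = None


def ingest_example() -> str:
    # Placeholder task body; a real task would live in a Python package.
    return "ingested"


# Hypothetical registry of jobs, keyed by job name.
JOBS = {
    job.name: job
    for job in (
        Job("nightly_ingest", ingest_example, schedule="0 2 * * *"),
        Job("adhoc_ingest", ingest_example),
    )
}


def trigger(name: str) -> str:
    """Run a job once on request, as a REST endpoint or UI button might."""
    return JOBS[name].task()
```

A real orchestrator would add the scheduler loop, the REST layer and the UI on top of definitions like these; the sketch only shows the shape of the job definitions we would want to write.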

Nice to haves:

  • DAG display

Some options to consider:

  • Apache Airflow
  • Dagster
  • Prefect
  • Kestra
  • Luigi
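
One possible way to track the investigation is a simple requirements matrix. The sketch below is an assumption about how the comparison might be recorded, not an agreed process: the requirement and candidate names come from the lists above, every cell starts as `None` (not yet assessed), and the helper names `empty_matrix`, `unresolved` and `meets_mvp` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

MVP_REQUIREMENTS = [
    "Tasks defined as Python code/packages",
    "Cron-like timed schedule",
    "REST API to trigger jobs",
    "One-off jobs triggered manually",
    "Web-based UI",
    "Container-based deployment",
]

CANDIDATES = ["Apache Airflow", "Dagster", "Prefect", "Kestra", "Luigi"]

# candidate -> requirement -> True / False / None (None = not yet assessed)
Matrix = Dict[str, Dict[str, Optional[bool]]]


def empty_matrix(candidates: List[str] = CANDIDATES) -> Matrix:
    """Build a matrix with every cell unassessed."""
    return {c: {r: None for r in MVP_REQUIREMENTS} for c in candidates}


def unresolved(matrix: Matrix) -> List[Tuple[str, str]]:
    """List the (candidate, requirement) pairs that still need checking."""
    return [
        (candidate, requirement)
        for candidate, cells in matrix.items()
        for requirement, result in cells.items()
        if result is None
    ]


def meets_mvp(matrix: Matrix, candidate: str) -> bool:
    """True only if every MVP requirement is confirmed as met."""
    return all(result is True for result in matrix[candidate].values())
```

Cells would be filled in as each tool is evaluated, for example `matrix["Dagster"]["Web-based UI"] = True` once that has been verified against the tool's documentation.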

Metadata

Assignees

No one assigned

Labels

investigation: Captures an idea for an investigation into a topic, e.g. new technology, process.
