brownag/SSURGO


SSURGO

A targets pipeline for building SSURGO databases with DuckDB.

Overview

This R project provides a reproducible pipeline for processing and building SSURGO (Soil Survey Geographic Database) databases using the targets package and DuckDB.

Features

  • Reproducible Workflows: Built on the targets package for reliable, efficient, scalable data pipelines
  • DuckDB Integration: Leverages DuckDB for columnar data storage and querying of spatial and tabular data
  • R-based Pipeline: Written entirely in R, the project leverages the soilDB package for downloading data and creating the database

Installation

To get started, ensure you have R installed, then clone this repository to a folder of your choice.

Work from a local clone of the repository: by default, the downloaded data is stored in the "./data/" folder.

The repository is set up as an R package, mainly to manage dependencies. You can install dependencies using remotes::install_deps(). You do not actually need to install the 'SSURGO' R package to run the pipeline, but the dependencies must be present.

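# Install 'remotes' if it is not already available
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
# Run from the root of the cloned repository so the DESCRIPTION file is found
setwd("path/to/SSURGO")
remotes::install_deps()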

Once you have dependencies installed, run SSURGO.R to generate a fresh _targets.R file.

To change which soil survey areas are included in the database, edit the first four targets. The default setup builds a database covering all US states, but you can choose any subset of one or more states, or use any other method to create the ssas target (a character vector of area symbols).

source("SSURGO.R")
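
As one illustration of an "alternative method" for building the ssas vector, the sketch below filters a set of area symbols by their two-letter state prefix. The helper `filter_ssas()` is hypothetical and not part of this repository, and the area symbols are illustrative values only; the real pipeline may derive them differently.

```r
# Hypothetical helper (not part of this repository): keep only the area
# symbols whose two-letter prefix matches one of the requested states.
filter_ssas <- function(areasymbols, states) {
  areasymbols <- toupper(areasymbols)
  keep <- substr(areasymbols, 1, 2) %in% toupper(states)
  sort(unique(areasymbols[keep]))
}

# Illustrative area symbols; substitute the survey areas you actually need
all_ssas <- c("CA630", "CA649", "NV773", "OR628")

# A character vector of area symbols, the shape the ssas target expects
ssas <- filter_ssas(all_ssas, c("ca", "NV"))
print(ssas)
```

A vector built this way could be returned by the command of the ssas target in place of the default all-states setup.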

Usage

This project uses the targets package to manage the pipeline.

To run the workflow, be sure your working directory is the ./SSURGO/ folder containing _targets.R.

# Load the targets library
library(targets)

# View the pipeline
tar_visnetwork()  # Visualize the pipeline DAG

# Run the pipeline
tar_make()
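
If you are new to targets, the following self-contained sketch shows the same run-then-inspect cycle on a toy pipeline in a temporary directory, so the real project is left untouched. The target names `ssas` and `n_ssas` and their values are assumptions made for illustration; the actual targets are defined in the _targets.R file generated by SSURGO.R.

```r
library(targets)

# Work in a throwaway directory so the real pipeline is not affected
demo_dir <- file.path(tempdir(), "targets-demo")
dir.create(demo_dir, showWarnings = FALSE)
old_wd <- setwd(demo_dir)

# Write a toy _targets.R with two targets (illustrative only)
tar_script({
  list(
    tar_target(ssas, c("CA630", "CA649")),  # character vector of area symbols
    tar_target(n_ssas, length(ssas))        # depends on ssas
  )
}, ask = FALSE)

# Build all targets, then read a stored result back into the session
tar_make()
n <- tar_read(n_ssas)

# After a successful run, no targets should be reported as outdated
outdated <- tar_outdated()

# Restore the original working directory
setwd(old_wd)
```

Calling `tar_make()` a second time without changes skips targets that are already up to date, which is what makes reruns of a long pipeline inexpensive.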

Project Structure

SSURGO/
|-- _targets.R           # Main targets pipeline configuration (generated by SSURGO.R)
|-- SSURGO.R             # Entry point for `tar_script()` _targets.R generation
|-- R/                   # Core R functions and wrappers
|-- man/                 # Documentation files
|-- DESCRIPTION          # Package metadata
|-- NAMESPACE            # Package namespace
|-- README.md            # This file

Dependencies

  • targets: Workflow orchestration
  • duckdb: In-process SQL database engine
  • soilDB: Downloading SSURGO data and creating the database
  • R (>= 4.0.0 recommended)

See the DESCRIPTION file for complete dependency information.

How It Works

The pipeline follows a structured approach:

  1. Data Ingestion: Download and prepare SSURGO data sources
  2. Database Building: Construct optimized DuckDB databases
  3. Output: Generate final database artifacts
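
To make the database-building and output steps more concrete, here is a minimal sketch that writes two toy tables to a DuckDB file and queries them through DBI. The data frames are small stand-ins for SSURGO tabular data (real tables have many more columns), and this is not the pipeline's actual code, which relies on soilDB to download the data and create the database.

```r
library(DBI)

# Toy stand-ins for the SSURGO 'legend' and 'mapunit' tables
legend <- data.frame(
  lkey = c(1L, 2L),
  areasymbol = c("CA630", "CA649")
)
mapunit <- data.frame(
  mukey = c(101L, 102L, 201L),
  lkey = c(1L, 1L, 2L),
  musym = c("a", "b", "c")
)

# Create a file-backed DuckDB database (the "output artifact" in this sketch)
db_path <- tempfile(fileext = ".duckdb")
con <- dbConnect(duckdb::duckdb(), dbdir = db_path)

dbWriteTable(con, "legend", legend)
dbWriteTable(con, "mapunit", mapunit)

# Count map units per soil survey area with plain SQL
mu_counts <- dbGetQuery(con, "
  SELECT l.areasymbol, COUNT(*) AS n_mapunits
  FROM mapunit m
  JOIN legend l ON m.lkey = l.lkey
  GROUP BY l.areasymbol
  ORDER BY l.areasymbol
")

# Close the connection and shut down the embedded database instance
dbDisconnect(con, shutdown = TRUE)
```

Because DuckDB runs in-process, the resulting file can be reopened later with the same `dbConnect()` call, with no database server to manage.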

Contributing

Please raise any issues on the Issue Tracker.

License

This project is licensed under the terms specified in LICENSE.md.

Author

Andrew G. Brown (@brownag)
