Description
See required workflow changes in nextstrain/zika#85.
Considerations
The one big difference between the mpox build and the zika build is that mpox defines multiple builds via different config files in phylogenetic/defaults. This doesn't mesh well with nextstrain run, which expects any passed config files to live in the user's analysis directory rather than among the source config files. This was previously discussed on Slack and GitHub, where the big takeaway was to define named workflows, each with a separate Snakefile as its entrypoint.
This would look something like this:
phylogenetic/
├── clade-i
│ ├── Snakefile
│ └── config.yaml
├── clade-iib (currently called hmpxv1)
│ ├── Snakefile
│ └── config.yaml
├── lineage-b.1 (currently called hmpxv1_big)
│ ├── Snakefile
│ └── config.yaml
├── mpxv
│ ├── Snakefile
│ └── config.yaml
├── rules
│ ├── annotate_phylogeny.smk
│ ├── construct_phylogeny.smk
│ ├── export.smk
│ ├── main.smk
│ └── prepare_sequences.smk
└── ... (other files)
The phylogenetic/Snakefile file gets moved to rules/main.smk, and each phylogenetic/<workflow>/Snakefile would include main.smk, similar to avian-flu's Snakefile.
Then, to run the hmpxv1/clade-iib workflow:
with nextstrain build:
nextstrain build phylogenetic --snakefile clade-iib/Snakefile
with nextstrain run:
nextstrain run mpox phylogenetic/clade-iib <analysis-dir>
Config files
There are shared config files used for all builds (e.g. description.md), and there are also build-specific files (e.g. auspice_config.json). I think these can continue to live within phylogenetic/defaults/, but we'll have to see how resolve_config_path works with this structure.
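To make the question concrete, here is a rough Python sketch of one lookup order that would support both shared and build-specific files. This is not the real resolve_config_path: the function name resolve_path_sketch, its parameters, and the defaults/<workflow>/ subdirectory layout are all assumptions made for illustration only.

```python
from pathlib import Path


def resolve_path_sketch(name, analysis_dir, defaults_dir, workflow):
    """Hypothetical lookup order; NOT the real resolve_config_path.

    Assumed search order (an illustration, not confirmed behavior):
      1. the user's analysis directory (user overrides)
      2. defaults/<workflow>/ (build-specific files, assumed layout)
      3. defaults/ (files shared by all builds)
    """
    candidates = [
        Path(analysis_dir) / name,
        Path(defaults_dir) / workflow / name,
        Path(defaults_dir) / name,
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    # Report every location that was tried to make failures easy to debug.
    searched = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"{name!r} not found in any of: {searched}")
```

With a lookup like this, a shared file such as description.md would resolve from the top level of defaults/, a build-specific auspice_config.json would resolve from the workflow's subdirectory, and a copy placed in the analysis directory would take precedence over both.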