-
Notifications
You must be signed in to change notification settings - Fork 20
run: Support unmanaged pathogen repositories with local paths #476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
joverlee521
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This worked as expected for me when I tried it locally with zika.
It did flag an unexpected behavior for workflows that are not fully nextstrain run compatible, specifically ones that define workdir in the Snakefile (e.g. rabies/ingest). The outputs were written to the unmanaged pathogen repository instead of the analysis directory. As expected, I see the same behavior with "normally" setup pathogens
nextstrain setup rabies@main
nextstrain run rabies@main ingest /tmp/rabies-ingest
The results end up in ~/.nextstrain/pathogens/rabies/main\=NVQWS3Q\=/ingest/.
Should nextstrain run specifically guard against this by specifying --directory=build_volume?
| self.registration_path = self.path / "nextstrain-pathogen.yaml" | ||
|
|
||
| if self.registration_path.exists(): | ||
| self.registration = read_pathogen_registration(self.registration_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breifly wondered if this should also check for compatibility similar to test_compatibility. Probably not since, we already get a warning if the workflow is not registered
cli/nextstrain/cli/command/run.py
Lines 231 to 232 in 7fb86eb
| if opts.workflow not in pathogen.registered_workflows(): | |
| print(f"The {opts.workflow!r} workflow is not registered as a compatible workflow, but trying to run anyways.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we get #472 in then yeah, I'd add a check for nextstrain run compatibility too.
Well, that's an undesirable behaviour! Good catch. In general, there's only so much we can do if you're ignoring warnings and trying to run non-compatible workflows under That said, I think your suggestion for avoiding this behaviour is good and is a measure we can take.
Probably, yeah. (Explaining aloud, I know you know this…) This essentially tells Snakemake to ignore the workflow's own diff --git a/nextstrain/cli/command/run.py b/nextstrain/cli/command/run.py
index d278fb9..d307780 100644
--- a/nextstrain/cli/command/run.py
+++ b/nextstrain/cli/command/run.py
@@ -274,6 +274,14 @@ def run(opts):
*(["--forceall"]
if opts.force else []),
+ # Explicitly use Snakemake's current working directory as the
+ # workflow's workdir, overriding any "workdir:" directive the workflow
+ # may include. Snakemake uses the cwd by default in the absence of any
+ # "workdir:" directive, but we want to _always_ use it to avoid writing
+ # into the pathogen/workflow source directories if a non-compatible
+ # workflow is run.
+ "--directory=.",
+
# Workdir will be the analysis volume (/nextstrain/build in a
# containerized runtime), so explicitly point to the Snakefile.
"--snakefile=%s/Snakefile" % (the behaviour becomes: $ nextstrain run rabies@main ingest /tmp/rabies
The 'ingest' workflow is not registered as a compatible workflow, but trying to run anyways.
Running the 'ingest' workflow for pathogen rabies@main
WorkflowError in file "/nextstrain/pathogen/ingest/Snakefile", line 9:
Workflow defines configfile defaults/config.yaml but it is not present or accessible (full checked path: /nextstrain/build/defaults/config.yaml).which is probably better even though (because!) it doesn't "work"? |
…rectives in workflows Suggested by @joverlee521 in review¹ to avoid writing into the pathogen/workflow source directories when non-compatible (or broken) workflows are used with `nextstrain run`. ¹ <#476 (review)> Co-authored-by: Jover Lee <joverlee521@gmail.com>
Pushed as acd485d. |
Resolves: #420
Checklist