From 2fbf80618c080e010c7225127f20b314ffc3a613 Mon Sep 17 00:00:00 2001
From: Daniel Olson
Date: Thu, 23 Oct 2025 13:11:25 -0600
Subject: [PATCH] Updated README.md

Added faster run configurations to the basic usage section
---
 README.md | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index d06ad29..7f92b51 100644
--- a/README.md
+++ b/README.md
@@ -13,10 +13,10 @@ cmake .
 make
 ```
 ## Basic usage
-A list of all flags and options can be seen with `ultra -h`. To annotate tandem repeats with ULTRA use `ultra [options] `. The following examples demonstrate common use cases.
+A list of all flags and options can be seen with `ultra -h`. To annotate tandem repeats with ULTRA use `ultra [options] `. The following walkthroughs and examples demonstrate common use cases.
-Example 1 - Default settings
+Walkthrough 1 - Default settings
 `examples/example_1.fa` contains randomly generated sequence with three inserted tandem repeats. We can use ULTRA to annotate the file by running:
 ```
@@ -52,7 +52,7 @@ By default ULTRA will use lower-case masking, although ULTRA will use N-masking
-Example 2 - Large period repeats
+Walkthrough 2 - Large period repeats
 `examples/example_2.fa` contains a period 1000 repeat, which is larger than ULTRA's default maximum detectable repetitive period (100). To find the large period repeat we must adjust ULTRA's maximum detectable repetitive period using the `-p ` option.
@@ -77,7 +77,7 @@ period_1000_repeat 0 17999 1000 22938.433594 . 1 0 .
 ```
-Example 3 - Tuning and FDR
+Walkthrough 3 - Tuning and FDR
 `examples/example_3.fa` contains randomly generated 80% AT rich sequence along with two inserted tandem repeats (an "AAAGC" repeat and an "AAAATAC" repeat). The large AT bias is far outside ULTRA's default expectation, and as a result ULTRA will have a high false discovery rate, as seen by running:
 ```
@@ -97,6 +97,30 @@ SeqID Start End Period Score Consensus #Subrepeats SubrepeatStarts SubrepeatCon
 ```
+
+Faster run configurations
+All of these examples use 8 threads (-t 8), although increasing the number of threads will generally improve performance up to around 80 threads. These examples also use fewer indel states (-i 3 -d 3); this greatly reduces runtime, although it does (very slightly) reduce sensitivity.
+
+```
+# max period of 10, good if all you are interested in
+# is STRs; this is much faster than fasTAN
+ultra --read_all -p 10 -t 8 -i 3 -d 3 -o
+
+# max period of 100, this is slightly slower than tools like fasTAN
+ultra --read_all -p 100 -t 8 -i 3 -d 3 -o
+
+# max period of 500 (a common max period when using TRF)
+ultra --read_all -p 500 -t 8 -i 3 -d 3 -o
+
+# max period of 2000 -- this will be slow
+# You can speed things up with more threads
+# (80 threads will be able to annotate the human genome in a few hours)
+ultra --read_all -p 2000 -t 8 -i 2 -d 2 -o
+```
+
+
+
+
 ## Output formats and tuning guide
ULTRA TSV format