Process mutations
Max Leiserson edited this page Jun 16, 2016
We use the process_mutations.py script to construct mutation datasets from the SNVs in one or more MAF files and alteration files in "event" format. The output is a single file in JSON format. We detail the arguments and file formats below.
Note: You can combine multiple mutation datasets by passing more than one mutation file group (-m).
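As a rough illustration of combining two mutation file groups, the sketch below assembles an argument list for the script without running it. The flags are the ones listed in the argument table; the file names, the cancer type labels, and the `build_command` helper are hypothetical. It also assumes that all cancer type labels follow a single `-ct` flag in the same order as the groups, which this page does not state explicitly.

```python
def build_command(file_groups, cancer_types, output_file):
    """Build an argument list for process_mutations.py (nothing is executed)."""
    if len(file_groups) != len(cancer_types):
        raise ValueError("need exactly one cancer type label per mutation file group")
    command = ["python", "process_mutations.py"]
    for files in file_groups:
        # One -m flag per mutation file group, followed by that group's files.
        command.append("-m")
        command.extend(files)
    # ASSUMPTION: all labels follow a single -ct flag, in group order.
    command.append("-ct")
    command.extend(cancer_types)
    command.extend(["-o", output_file])
    return command


# Hypothetical file names and labels, for illustration only.
command = build_command(
    [["brca.maf", "brca_events.tsv"], ["gbm.maf"]],
    ["BRCA", "GBM"],
    "mutations.json",
)
print(" ".join(command))
```

The resulting list could be handed to `subprocess.run` once the file paths point at real data.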
| Argument | Required (Default) | Description |
|---|---|---|
| -m/--mutation_file_groups | True | Pass one or more files for each -m parameter. Files ending with .maf will be processed as MAFs. All other file endings will be processed as event files. |
| -ct/--cancer_types | True | String label for each mutation file group. |
| -o/--output_file | True | Path to output file. Output is in JSON format. |
| -hf/--hypermutator_file | False (None) | Path to hypermutators file. See formatting below. |
| -ivc/--ignored_variant_classes | False ([Silent, Intron, 3'UTR, 5'UTR, IGR, lincRNA, RNA]) | Exclude mutations of this variant class. |
| -ivt/--ignored_variant_types | False (Germline) | Exclude mutations of this variant type. |
| -ivs/--ignored_validation_statuses | False (Wildtype, Invalid) | Exclude mutations of this validation status. |
| -h/--help | False | Display usage instructions. |
| -v/--verbose | False (0) | Choices: 0, 1, 2, 3, 4, 5. Higher values correspond to more verbose output. |
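To make the three exclusion options concrete, here is a minimal sketch of how the default ignore lists from the table could be applied to a single mutation record. It is not the script's actual implementation: the `keep_mutation` function and the dictionary keys `variant_class`, `variant_type`, and `validation_status` are hypothetical names, and exact, case-sensitive matching is an assumption.

```python
# Defaults copied from the argument table above.
DEFAULT_IGNORED_VARIANT_CLASSES = frozenset(
    ["Silent", "Intron", "3'UTR", "5'UTR", "IGR", "lincRNA", "RNA"]
)
DEFAULT_IGNORED_VARIANT_TYPES = frozenset(["Germline"])
DEFAULT_IGNORED_VALIDATION_STATUSES = frozenset(["Wildtype", "Invalid"])


def keep_mutation(
    record,
    ignored_classes=DEFAULT_IGNORED_VARIANT_CLASSES,
    ignored_types=DEFAULT_IGNORED_VARIANT_TYPES,
    ignored_statuses=DEFAULT_IGNORED_VALIDATION_STATUSES,
):
    """Return True if the record matches none of the ignore lists.

    ASSUMPTION: the record is a dict with the hypothetical keys
    'variant_class', 'variant_type', and 'validation_status'.
    """
    return (
        record.get("variant_class") not in ignored_classes
        and record.get("variant_type") not in ignored_types
        and record.get("validation_status") not in ignored_statuses
    )


missense = {
    "variant_class": "Missense_Mutation",
    "variant_type": "Somatic",
    "validation_status": "Valid",
}
silent = dict(missense, variant_class="Silent")

print(keep_mutation(missense))  # True
print(keep_mutation(silent))    # False
```

Passing a different set for any of the three keyword arguments plays the role of overriding the corresponding `-ivc`, `-ivt`, or `-ivs` default.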
- MAF file. File in mutation annotation format. See the Mutation Annotation Format Specification for details, including the sets of allowed variant classes, variant types, and validation statuses.
- Event file. Tab-separated "adjacency list" format. Each line lists a sample in the first column, with each other column containing the name of an alteration in that sample (e.g. gene names). Warning: alteration names that appear in both the MAF and event files can overwrite one another.
- Hypermutators file. Text file listing the samples/patients classified as hypermutators, one sample/patient per line.
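The event and hypermutators formats above are simple enough to read with a few lines of Python. The sketch below is an illustration under stated assumptions, not the script's own parser: the function names and the sample and alteration names are hypothetical, and blank lines are assumed to be skippable. The `is_maf` helper mirrors the rule from the argument table that files ending with .maf are treated as MAFs and everything else as event files.

```python
def is_maf(path):
    """Files ending with .maf are treated as MAFs; all others as event files."""
    return path.endswith(".maf")


def parse_event_lines(lines):
    """Map each sample (first column) to the set of alterations in its other columns."""
    sample_to_alterations = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        sample = fields[0].strip()
        if not sample:
            continue  # ASSUMPTION: blank lines carry no data and can be skipped
        alterations = {field for field in fields[1:] if field}
        sample_to_alterations.setdefault(sample, set()).update(alterations)
    return sample_to_alterations


def parse_hypermutator_lines(lines):
    """Collect one sample/patient identifier per non-empty line."""
    return {line.strip() for line in lines if line.strip()}


# Hypothetical file contents, given as lists of lines instead of open files.
event_lines = ["sample1\tTP53\tKRAS\n", "sample2\tTP53\n", "\n"]
hypermutator_lines = ["sample2\n"]

events = parse_event_lines(event_lines)
hypermutators = parse_hypermutator_lines(hypermutator_lines)

print(is_maf("brca.maf"), is_maf("brca_events.tsv"))  # True False
print(sorted(events["sample1"]))                      # ['KRAS', 'TP53']
print(sorted(hypermutators))                          # ['sample2']
```

With real files, the same functions accept an open file handle, since iterating over a handle yields lines.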
Last modified: 2:40 PM Thursday, June 16, 2016 (EDT)