EffectorFisher-core Module (Python Library)

The EffectorFisher module is a Python library that compares pangenome-derived protein isoform profiles with host virulence/disease phenotype data to predict candidate effectors with strong phenotypic association. EffectorFisher can be used to refine the output of Predector, which combines multiple methods to predict proteins with effector-like properties.

EffectorFisher was developed at the Centre for Crop and Disease Management (CCDM) by Mohitul Hossain within RTP/GRDC-funded Ph.D. project CUR2301-006RSX, with additional support from the Western Australian Agricultural Research Collaboration (WAARC), under the supervision of Dr James Hane and co-supervision of Drs Huyen Phan and Kristina Gagalova. Assistance with code development was provided by Dr Kristina Gagalova and Mr Pavel Misiun, with testing performed by Ms Naomi Gray.

A manuscript describing this method is currently under review. If you use EffectorFisher, please check this space for citation details.

Installation

EffectorFisher-core is a command-line tool written in Python.

Requirements

Python 3.6 or newer. More details can be found here.
pip installed, with pip >= 21.0 recommended. More details about pip installation can be found here.
Internet connection to install from GitHub

Quick installation from GitHub

pip install git+https://github.com/ccdmb/EffectorFisher-core.git

This will:

Download the latest version of the tool
Install all required dependencies
Register the command-line tool effectorfisher_core.py

Manual Installation (From Cloned Repository)

If you prefer to work with the source:

git clone https://github.com/ccdmb/EffectorFisher-core.git
cd EffectorFisher-core
pip install .

To install in development mode (reflects source code changes automatically):

pip install -e .

Input Files

To run this module, you need to provide the following input files:

Effector_variants_PAV_output.txt: This file is a required input for EffectorFisher-core and must be generated by running the EffectorFisher tool. The file can be found in the Final_PAV_result directory upon successful execution of the EffectorFisher pipeline. This file contains the presence–absence variation (PAV) matrix of predicted effector candidates across isolates. Detailed instructions for generating this file can be found in the "EffectorFisher" repo: https://github.com/muhitulh/EffectorFisher/tree/main. Note: Both the EffectorFisher and EffectorFisher-core modules are components of the associated manuscript.
phenotype_data_quantitative.txt or phenotype_data_qualitative.txt:
- phenotype_data_quantitative.txt: This file should contain numeric disease scores. You need to prepare this file as shown in the example.
- phenotype_data_qualitative.txt: This file should contain disease severity levels (high or low). You need to prepare this file as shown in the example.
predector_results.txt: This file is a required input for EffectorFisher-core and must be generated by running the Predector tool. Predector, published in Scientific Reports, prioritizes candidate effector proteins based on a range of effector-like features. Installation and usage instructions are available in the Predector GitHub repository.
known_effector.txt (optional): You can provide known effector IDs and names in this file, as shown in the example. If this file is not provided, the module will not include known effector ranking in the final output.

Important: Make sure your input file names match those listed above and that the files are located in the subdirectory 00_input_files within your working directory. A different input directory can be selected with --input-dir. Support for passing individual input file paths as command-line arguments is still in development.

Directory Structure

Here's an example of the directory structure for running the EffectorFisher module:

working_directory/
├── 00_input_files/
│   ├── Effector_variants_PAV_output.txt
│   ├── phenotype_data_quantitative.txt (or phenotype_data_qualitative.txt)
│   ├── predector_results.txt
│   └── known_effector.txt (optional)
├── effectorfisher_core.py
└── ...

Make sure to place the input files in the 00_input_files directory within your working directory.
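Before starting a run, it can help to confirm that the expected files are in place. The snippet below is a minimal sketch of such a check, not part of EffectorFisher-core itself: the helper name check_input_dir is hypothetical, and it only tests that the files named in the "Input Files" section exist, not that their contents are valid.

```python
from pathlib import Path

# File names are taken from the "Input Files" section of this README.
REQUIRED = ["Effector_variants_PAV_output.txt", "predector_results.txt"]
PHENOTYPE = {
    "quantitative": "phenotype_data_quantitative.txt",
    "qualitative": "phenotype_data_qualitative.txt",
}
OPTIONAL = ["known_effector.txt"]


def check_input_dir(input_dir, data_type):
    """Return (missing_required, missing_optional) file names for input_dir.

    Hypothetical helper: checks file existence only, not file contents.
    """
    if data_type not in PHENOTYPE:
        raise ValueError("data_type must be 'quantitative' or 'qualitative'")
    base = Path(input_dir)
    required = REQUIRED + [PHENOTYPE[data_type]]
    missing_required = [n for n in required if not (base / n).is_file()]
    missing_optional = [n for n in OPTIONAL if not (base / n).is_file()]
    return missing_required, missing_optional


if __name__ == "__main__":
    missing, optional = check_input_dir("00_input_files", "quantitative")
    for name in missing:
        print(f"missing required input: {name}")
    for name in optional:
        print(f"optional input not found: {name}")
```

Run it from the working directory; an empty report means all files for the chosen data type were found.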

Usage

Run the pipeline with:

effectorfisher_core.py --data-type <qualitative|quantitative> [options]

Basic example

effectorfisher_core.py --data-type quantitative --input-dir 00_input_files/ --save

This will:

Process input files
Apply default filters
Save both intermediate and final output files

Final Output Only (No --save)

effectorfisher_core.py --data-type quantitative --input-dir 00_input_files/

Options

effectorfisher_core.py --help

usage: effectorfisher_core.py [-h] [--data-type {quantitative,qualitative}]
                              [--input-dir INPUT_DIR] [--output-dir OUTPUT_DIR]
                              [--min-variant MIN_VARIANT] [--save]
                              [--cyst CYST] [--total-aa TOTAL_AA]
                              [--pred-score PRED_SCORE] [--p-value P_VALUE]

Process phenotype and variant data for EffectorFisher

optional arguments:
  -h, --help              Show help message and exit
  --data-type             Required. Either `quantitative` or `qualitative`
  --input-dir             Directory containing input files (default: `00_input_files`)
  --output-dir            Directory for output files (default: `output/`)
  --min-variant           Minimum isoform count (default: 5)
  --save                  Save all intermediate and final results
  --cyst                  Minimum cysteine count (default: 2)
  --total-aa              Maximum amino acid length (default: 300)
  --pred-score            Minimum prediction score (default: 2)
  --p-value               P-value threshold (default: 0.05)
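
To make the four threshold options concrete, here is a rough sketch of how such filters combine. It is an illustration only and not the tool's actual code: the record keys (cysteine_count, length_aa, prediction_score, p_value), the inclusive comparisons, and the example records are all assumptions made for this example. Only the default values come from the help text above.

```python
def passes_filters(record, cyst=2, total_aa=300, pred_score=2, p_value=0.05):
    """Return True if a candidate record satisfies all four thresholds.

    Defaults mirror the --help text; the dictionary keys and the inclusive
    comparisons (>= and <=) are assumptions made for this sketch.
    """
    return (
        record["cysteine_count"] >= cyst            # --cyst: minimum cysteine count
        and record["length_aa"] <= total_aa         # --total-aa: maximum length
        and record["prediction_score"] >= pred_score  # --pred-score: minimum score
        and record["p_value"] <= p_value            # --p-value: significance threshold
    )


# Invented example records, for illustration only.
candidates = [
    {"locus": "locus_A", "cysteine_count": 6, "length_aa": 180,
     "prediction_score": 2.7, "p_value": 0.01},
    {"locus": "locus_B", "cysteine_count": 1, "length_aa": 150,
     "prediction_score": 3.1, "p_value": 0.02},   # too few cysteines
    {"locus": "locus_C", "cysteine_count": 4, "length_aa": 520,
     "prediction_score": 2.2, "p_value": 0.03},   # too long
    {"locus": "locus_D", "cysteine_count": 8, "length_aa": 210,
     "prediction_score": 2.4, "p_value": 0.20},   # not significant
]

kept = [c["locus"] for c in candidates if passes_filters(c)]
print(kept)
```

A candidate is kept only when every condition holds, so relaxing a single threshold (for example a larger --total-aa) can only add candidates, never remove them.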

Must include:

--data-type <data_type>: Specify the type of phenotypic data you have. Choose either qualitative or quantitative. See the examples in the 00_input_files directory.

Important:

--min-variant <number>: Specify the minimum isoform number (default = 5).

Optional:

--cyst <number>: Specify the minimum cysteine count (default = 2).
--pred-score <number>: Specify the minimum prediction score (default = 2).
--total-aa <number>: Specify the maximum total amino acid count (default = 300).
--p-value <number>: Specify the p-value threshold (default = 0.05).

Example

effectorfisher_core.py --data-type quantitative --min-variant 5 --cyst 2 --pred-score 2 --total-aa 300 --p-value 0.05
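
If you launch runs from a Python script (for example, to repeat an analysis with several thresholds), the command line can be assembled programmatically. The following is a small sketch under stated assumptions: build_command is a hypothetical helper, not part of EffectorFisher-core, and it uses only the flags listed in the --help output above.

```python
import shlex


def build_command(data_type, input_dir="00_input_files", min_variant=5,
                  cyst=2, pred_score=2, total_aa=300, p_value=0.05,
                  save=False):
    """Assemble an effectorfisher_core.py argument list (hypothetical helper)."""
    if data_type not in ("quantitative", "qualitative"):
        raise ValueError("data_type must be 'quantitative' or 'qualitative'")
    cmd = [
        "effectorfisher_core.py",
        "--data-type", data_type,
        "--input-dir", str(input_dir),
        "--min-variant", str(min_variant),
        "--cyst", str(cyst),
        "--pred-score", str(pred_score),
        "--total-aa", str(total_aa),
        "--p-value", str(p_value),
    ]
    if save:
        cmd.append("--save")  # also keep intermediate files
    return cmd


cmd = build_command("quantitative", save=True)
# Print a shell-safe version of the command for logging.
print(" ".join(shlex.quote(arg) for arg in cmd))

# To actually run it (requires EffectorFisher-core to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when directory names contain spaces.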

Output

Main Output

File Name	Description
`complete_isoform_list.txt`	Complete list of isoforms processed by the module.
`complete_loci_list.txt`	Complete list of loci processed by the module.

Additional Output

File Name	Description
`filtered_loci_list.txt`	List of loci that pass the default or specified filters. Alternatively, you can apply your own filters to `complete_loci_list.txt` as required.
`known_effectors_ranking.txt`	Contains the ranking of known effectors if you provide a known effector input file.

Additional results: if a known effector file is provided, the known effectors are ranked after filtering.
