Surface-based search engine for fragment and ligand search.
Code repository for 'FragmentScope - exploring the fragment space with learned surface representations'.
# Create a new environment and install packages
. setup/setup.sh fragmentscope
# Activate the environment and test
conda activate fragmentscope
python setup/test.py

Below is an example of how to use it on izar (in interactive mode on a GPU node):
Sinteract -g gpu:1 -m 30G -t 01:00:00 -p debug
ENVIRONMENT=fragmentscope
# Fragment Search
bash scripts/search_fragments.sh -p examples/2474_protein.pdb -r examples/2474_fragment.sdf -n 5 -o ./output -e $ENVIRONMENT -q <unique_name>
# Ligand Search
bash scripts/search_ligands.sh -p examples/fibr_protein.pdb -r examples/fibr_ligand.sdf -n 5 -o ./output -e $ENVIRONMENT -q <unique_name>

| Flag | Description |
|---|---|
| `-p` | PDB file with protein |
| `-r` | SDF file with reference fragment |
| `-n` | Number of fragments to be returned |
| `-o` | Output directory |
| `-q` | Custom name of the query (will be used in output file names) |
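
When several queries need to be run, it can help to assemble the command programmatically. The following is a minimal sketch, not part of the repository: the helper `build_search_command` and the query name `query_2474` are hypothetical, and the flags are simply the ones used in the example commands above.

```python
import shlex


def build_search_command(script, protein, reference, n_results,
                         output_dir, environment, query_name):
    """Assemble the argument list for one of the search scripts.

    Hypothetical helper: it only mirrors the flags shown in the
    example commands (-p, -r, -n, -o, -e, -q).
    """
    return [
        "bash", script,
        "-p", protein,
        "-r", reference,
        "-n", str(n_results),
        "-o", output_dir,
        "-e", environment,
        "-q", query_name,
    ]


cmd = build_search_command(
    script="scripts/search_fragments.sh",
    protein="examples/2474_protein.pdb",
    reference="examples/2474_fragment.sdf",
    n_results=5,
    output_dir="./output",
    environment="fragmentscope",
    query_name="query_2474",  # assumed name, any unique string should do
)

# Print the shell-quoted command; to launch it, one could call
# subprocess.run(cmd, check=True) from the repository root.
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell-quoting problems if a path contains spaces.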
- Results will be saved to the directory `./output`.
- Found fragments will be saved to files named `NAME_candidates_N.sdf`, where `N` is the number of the query patch (several query patches can be created if the reference fragment interacts with several protein chains).
- Patch surfaces will be saved to files named `NAME_query_patch_N.npy`.
- Full surfaces for all protein chains will be saved to files named `NAME_surface_C.npy`, where `C` is the chain ID.
- To visualize `.npy` surface files, use the PyMOL plugin `pymol/load_surfaces.py`:
  - load the script into PyMOL
  - run the following command in PyMOL: `load_surfaces <PATH1>, <PATH2>, <PATH3>, ...` (with comma-separated paths of all `.npy` files you want to visualize)
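
The naming scheme above makes it straightforward to collect the outputs of one query. The sketch below is an illustration only: the function `index_outputs` is hypothetical, and it assumes that `N` is an integer and that the chain ID `C` contains no underscores or dots.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Patterns follow the file names described above; the exact form of
# N (digits) and C (no "_" or ".") is an assumption.
PATTERNS = {
    "candidates": re.compile(r"^(?P<name>.+)_candidates_(?P<key>\d+)\.sdf$"),
    "query_patch": re.compile(r"^(?P<name>.+)_query_patch_(?P<key>\d+)\.npy$"),
    "surface": re.compile(r"^(?P<name>.+)_surface_(?P<key>[^_.]+)\.npy$"),
}


def index_outputs(output_dir, query_name):
    """Group the output files of one query by kind and by patch number or chain ID."""
    found = defaultdict(dict)
    for path in sorted(Path(output_dir).iterdir()):
        for kind, pattern in PATTERNS.items():
            match = pattern.match(path.name)
            if match and match.group("name") == query_name:
                found[kind][match.group("key")] = path
    return dict(found)


# Demo on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for file_name in ["q1_candidates_0.sdf", "q1_query_patch_0.npy",
                      "q1_surface_A.npy", "q1_surface_B.npy"]:
        (Path(tmp) / file_name).touch()
    demo_index = index_outputs(tmp, "q1")
    for kind, files in demo_index.items():
        print(kind, sorted(files))
```

The resulting `.npy` paths can then be passed to `load_surfaces` in PyMOL as a comma-separated list.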
The initial dataset is built using the code from https://github.com/arneschneuing/dmasif-ligand/tree/data/data. The `data` folder provides all steps to create the database.
We start from two folders: `chains_protonated` and `ligands`.
Root directory for all created data:
ROOT_DIR=/path/to/FragmentScopePLE
Tools:
REDUCE=/path/to/reduce
Note: use the environment created above.
python -W ignore prepare_surfaces.py \
--root_dir $ROOT_DIR \
--in_dir ${ROOT_DIR}/chains_protonated \
--checkpoint ${ROOT_DIR}/models/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34 \
--device cuda:0

python -W ignore prepare_fragment_patches.py \
--ligands ${ROOT_DIR}/ligands \
--surfaces ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/surfaces \
--patches ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/patches_plip \
--radius 3.0 \
--interaction_profiles ${ROOT_DIR}/plip_combined.csv

This step can be run in parallel using sbatch/sbatch_prepare_fragment_patches_jed_parallel.sh, followed by merge_fragment_data.py.
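
This README does not specify how `--radius` is applied when patches are extracted. One plausible reading, shown here purely as an assumption, is that a patch consists of the surface points lying within the given distance of any ligand atom. The function `select_patch` and the toy coordinates are hypothetical and are not taken from `prepare_fragment_patches.py`.

```python
import math


def select_patch(surface_points, ligand_atoms, radius):
    """Return indices of surface points within `radius` of any ligand atom.

    Assumed interpretation of a radius-based patch; the actual
    implementation in the repository may differ.
    """
    return [
        index
        for index, point in enumerate(surface_points)
        if any(math.dist(point, atom) <= radius for atom in ligand_atoms)
    ]


# Toy coordinates: two surface points near the ligand atom, one far away.
surface = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(0.0, 1.0, 0.0)]

print(select_patch(surface, ligand, radius=3.0))  # prints [0, 1]
```

For real surfaces with many points, a spatial index such as a k-d tree would be a more efficient choice than this brute-force loop.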
python -W ignore fragment_search.py \
--input_protein ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/patches/examples/6u6c_protein.pdb \
--reference_fragment ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/patches/examples/6u6c_fragment.sdf \
--checkpoint ${ROOT_DIR}/models/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34 \
--reduce $REDUCE \
--patches_dir ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/patches \
--patch_radius 6.0 \
--output_dir ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/temporary \
--device cuda:0 \
\
--distance_threshold 2.0 \
--ransac_n 3 \
--max_iter 100000 \
--max_validation 10000 \
--icp point-to-point

Code with the corresponding steps is provided in the folder `./design`.