LPDI-EPFL/FragmentScope

FragmentScope

Surface-based search engine for fragment and ligand search.
Code repository for 'FragmentScope - exploring the fragment space with learned surface representations'.

Environment

# Create a new environment and install packages
. setup/setup.sh fragmentscope

# Activate the environment and test
conda activate fragmentscope
python setup/test.py

Usage

The following example shows how to run FragmentScope on izar, in interactive mode on a GPU node:

Sinteract -g gpu:1 -m 30G -t 01:00:00 -p debug
ENVIRONMENT=fragmentscope 

# Fragment Search
bash scripts/search_fragments.sh -p examples/2474_protein.pdb -r examples/2474_fragment.sdf -n 5 -o ./output -e $ENVIRONMENT -q <unique_name>

# Ligand Search
bash scripts/search_ligands.sh -p examples/fibr_protein.pdb -r examples/fibr_ligand.sdf -n 5 -o ./output -e $ENVIRONMENT -q <unique_name>

Arguments:

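Flag  Description
-p    PDB file with the protein
-r    SDF file with the reference fragment (or reference ligand, for ligand search)
-n    Number of fragments to return
-o    Output directory
-e    Name of the environment created during setup (ENVIRONMENT in the example above)
-q    Custom name of the query (used in output file names)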
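
For scripted or batch runs, the command line shown above can be assembled programmatically. The following is a minimal sketch, assuming the flags behave as listed in the table; the helper name build_search_command and the query name my_query are hypothetical and not part of the repository.

```python
def build_search_command(script, protein, reference, n_results, out_dir, env, query_name):
    """Assemble the argument list for a search script (hypothetical helper).

    It only mirrors the flags used in the usage example above.
    """
    return [
        "bash", str(script),
        "-p", str(protein),     # PDB file with protein
        "-r", str(reference),   # SDF file with reference fragment
        "-n", str(n_results),   # number of fragments to be returned
        "-o", str(out_dir),     # output directory
        "-e", str(env),         # environment name, as in the usage example
        "-q", str(query_name),  # custom query name used in output file names
    ]


cmd = build_search_command(
    "scripts/search_fragments.sh",
    "examples/2474_protein.pdb",
    "examples/2474_fragment.sdf",
    5, "./output", "fragmentscope", "my_query",
)
print(" ".join(cmd))
# To run it from the repository root: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.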

Output Format:

  • Results are saved to the directory given with -o (./output in the examples above).
  • Retrieved fragments are saved to files named NAME_candidates_N.sdf, where NAME is the query name given with -q and N is the number of the query patch (several query patches can be created if the reference fragment interacts with several protein chains).
  • Patch surfaces are saved to files named NAME_query_patch_N.npy.
  • Full surfaces of all protein chains are saved to files named NAME_surface_C.npy, where C is the chain ID.
  • To visualize the .npy surface files, use the PyMOL plugin pymol/load_surfaces.py:
    1. Load the script into PyMOL.
    2. Run the following command in PyMOL: load_surfaces <PATH1>, <PATH2>, <PATH3>, ... (comma-separated paths of all .npy files you want to visualize).
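
The naming scheme above makes it easy to gather the files of one query and to prepare the load_surfaces command. The following is a minimal sketch under that assumption; the helpers collect_outputs and pymol_command and the query name demo are hypothetical, and the demo uses empty placeholder files in a temporary directory instead of real results.

```python
import re
import tempfile
from pathlib import Path


def collect_outputs(output_dir, name):
    """Group the result files of one query by the naming scheme described above."""
    prefix = re.escape(name)
    patterns = {
        "candidates": re.compile(rf"{prefix}_candidates_(.+)\.sdf"),
        "query_patches": re.compile(rf"{prefix}_query_patch_(.+)\.npy"),
        "surfaces": re.compile(rf"{prefix}_surface_(.+)\.npy"),
    }
    found = {kind: {} for kind in patterns}
    for path in sorted(Path(output_dir).iterdir()):
        for kind, pattern in patterns.items():
            match = pattern.fullmatch(path.name)
            if match:
                # Key is the patch number N or the chain ID C.
                found[kind][match.group(1)] = path
    return found


def pymol_command(paths):
    """Build the load_surfaces command with comma-separated .npy paths."""
    return "load_surfaces " + ", ".join(str(p) for p in paths)


# Demo with placeholder file names; real runs would point at the -o directory.
with tempfile.TemporaryDirectory() as tmp:
    for fname in ("demo_candidates_0.sdf", "demo_query_patch_0.npy",
                  "demo_surface_A.npy", "demo_surface_B.npy"):
        (Path(tmp) / fname).touch()
    outputs = collect_outputs(tmp, "demo")
    npy_files = list(outputs["surfaces"].values()) + list(outputs["query_patches"].values())
    command = pymol_command(npy_files)
print(command)
```

The printed line can be pasted into the PyMOL command prompt once the plugin script has been loaded.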

Setup

The initial dataset is built using the code from https://github.com/arneschneuing/dmasif-ligand/tree/data/data. The data folder provides all the steps needed to create a database.

We start from two folders: chains_protonated and ligands.

Location of all created data:

  • ROOT_DIR=/path/to/FragmentScopePLE

Tools:

  • REDUCE=/path/to/reduce
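
Before launching the data preparation steps, it can help to confirm that the two starting folders are present under ROOT_DIR. The following is a minimal sketch; the helper missing_inputs is hypothetical, and the demo runs on a temporary directory rather than a real dataset.

```python
import tempfile
from pathlib import Path

# The two starting folders named above, expected directly under ROOT_DIR.
REQUIRED_DIRS = ("chains_protonated", "ligands")


def missing_inputs(root_dir):
    """Return the required sub-folders that are absent from root_dir."""
    root = Path(root_dir)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]


# Demo on a temporary directory; in practice pass os.environ["ROOT_DIR"].
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "ligands").mkdir()
    missing = missing_inputs(tmp)
print(missing)  # ['chains_protonated']
```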

Data

Preparing surfaces

Note: use the environment created above.

python -W ignore prepare_surfaces.py \
                 --root_dir $ROOT_DIR \
                 --in_dir ${ROOT_DIR}/chains_protonated \
                 --checkpoint ${ROOT_DIR}/models/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34 \
                 --device cuda:0

Preparing fragment patches:

python -W ignore prepare_fragment_patches.py \
                 --ligands ${ROOT_DIR}/ligands \
                 --surfaces ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/surfaces \
                 --patches ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/patches_plip \
                 --radius 3.0 \
                 --interaction_profiles ${ROOT_DIR}/plip_combined.csv

This step can be run in parallel using sbatch/sbatch_prepare_fragment_patches_jed_parallel.sh, followed by merge_fragment_data.py.
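
To give an intuition for the --radius option, the sketch below selects the surface points that lie within a cutoff distance of any fragment atom. This is an assumption about what the option means (a distance cutoff, presumably in Å); it is not taken from prepare_fragment_patches.py, and the function select_patch and the toy coordinates are hypothetical.

```python
import math


def select_patch(surface_points, fragment_atoms, radius=3.0):
    """Indices of surface points within `radius` of at least one fragment atom."""
    return [
        i for i, point in enumerate(surface_points)
        if any(math.dist(point, atom) <= radius for atom in fragment_atoms)
    ]


# Toy data: ten surface points along the x axis and one fragment atom at x = 2.
surface = [(float(x), 0.0, 0.0) for x in range(10)]
fragment = [(2.0, 0.0, 0.0)]
patch = select_patch(surface, fragment, radius=3.0)
print(patch)  # [0, 1, 2, 3, 4, 5]
```

A brute-force loop like this is quadratic in the number of points; for real surfaces a spatial index such as a k-d tree would be the usual choice.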

Fragment search:

python -W ignore fragment_search.py \
                 --input_protein ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/patches/examples/6u6c_protein.pdb \
                 --reference_fragment ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/patches/examples/6u6c_fragment.sdf \
                 --checkpoint ${ROOT_DIR}/models/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34 \
                 --reduce $REDUCE \
                 --patches_dir ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/patches \
                 --patch_radius 6.0 \
                 --output_dir ${ROOT_DIR}/ssl_GB750_ES_AFF_4_cosr01_20231030_163838_epoch_34/temporary \
                 --device cuda:0 \
                 \
                 --distance_threshold 2.0 \
                 --ransac_n 3 \
                 --max_iter 100000 \
                 --max_validation 10000 \
                 --icp point-to-point
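
The options --distance_threshold, --ransac_n, --max_iter and --max_validation read like the parameters of a RANSAC-style rigid registration, and --icp point-to-point like a subsequent ICP refinement. That reading is an assumption, since the internals of fragment_search.py are not described here. As an illustration of how a sample size of 3 and a distance threshold typically interact, the sketch below fits a rigid transform to random triplets of paired points and keeps the fit with the most inliers. It is not the repository's implementation: all function names are hypothetical, the data are synthetic, and both the ICP step and the max_validation early stop are omitted.

```python
import math
import random

SAMPLE_SIZE = 3  # plays the role of --ransac_n 3 in this sketch


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def frame(p0, p1, p2):
    """Right-handed orthonormal frame attached to three non-collinear points."""
    e1 = unit(sub(p1, p0))
    e3 = unit(cross(e1, sub(p2, p0)))
    return e1, cross(e3, e1), e3


def transform(R, t, p):
    """Apply rotation R and translation t to point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def fit_rigid(src3, dst3):
    """Rigid transform (R, t) mapping the frame of src3 onto the frame of dst3."""
    fs, fd = frame(*src3), frame(*dst3)
    R = [[sum(fd[k][i] * fs[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = sub(dst3[0], transform(R, (0.0, 0.0, 0.0), src3[0]))
    return R, t


def ransac(src, dst, distance_threshold=2.0, max_iter=1000, seed=0):
    """Best (inlier_count, R, t) over paired points src[i] <-> dst[i]."""
    rng = random.Random(seed)
    best = (0, None, None)
    for _ in range(max_iter):
        idx = rng.sample(range(len(src)), SAMPLE_SIZE)
        try:
            R, t = fit_rigid([src[i] for i in idx], [dst[i] for i in idx])
        except ZeroDivisionError:  # collinear or coincident sample points
            continue
        count = sum(math.dist(transform(R, t, p), q) <= distance_threshold
                    for p, q in zip(src, dst))
        if count > best[0]:
            best = (count, R, t)
    return best


# Synthetic test case: a known rotation about z plus a translation.
rng = random.Random(1)
src = [tuple(rng.uniform(-5, 5) for _ in range(3)) for _ in range(10)]
c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
R_true = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
dst = [transform(R_true, (1.0, 2.0, 3.0), p) for p in src]
# Corrupt the last two correspondences so that they act as outliers.
dst[8] = (dst[8][0] + 50.0, dst[8][1], dst[8][2])
dst[9] = (dst[9][0] + 50.0, dst[9][1], dst[9][2])

n_inliers, R_est, t_est = ransac(src, dst)
print(n_inliers)  # expected: 8, the two corrupted pairs are rejected
```

In this picture a larger distance threshold tolerates noisier matches at the cost of accepting more false ones, while more iterations raise the chance of drawing an outlier-free sample.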

Fragment-based small-molecule design pipeline

The code for the corresponding steps is provided in the folder

./design
