Comments (6)

mwojcikowski commented on June 25, 2024

Hi Thomas,

In ODDT we have created an additional module for such operations with RDKit: oddt.toolkits.extras.rdkit.fixer. Note that it requires a Chem.Mol, not an oddt.Molecule.

import oddt
from rdkit import Chem
from oddt.toolkits.extras.rdkit.fixer import ExtractPocketAndLigand
mol = Chem.MolFromPDBFile("complex_forscoring_frm.1.min.pdb", sanitize=False)
pocket, ligand = ExtractPocketAndLigand(mol, cutoff=12., ligand_residue='LIG')

Pocket and ligand are still Chem.Mol objects, which you can convert back to ODDT:

oddt_pocket = oddt.toolkit.Molecule(pocket)

If you wish to get the whole protein, you can set a high cutoff.

def ExtractPocketAndLigand(mol, cutoff=12., expandResidues=True,
                           ligand_residue=None, ligand_residue_blacklist=None,
                           append_residues=None):
    """Function extracting a ligand (the largest HETATM residue) and the protein
    pocket within certain cutoff. The selection of pocket atoms can be expanded
    to contain whole residues. The single atom HETATM residues are attributed
    to pocket (metals and waters)

    Parameters
    ----------
        mol: rdkit.Chem.rdchem.Mol
            Molecule with a protein-ligand complex
        cutoff: float (default=12.)
            Distance cutoff for the pocket atoms
        expandResidues: bool (default=True)
            Expand selection to whole residues within cutoff.
        ligand_residue: string (default None)
            Residue name which explicitly points to a ligand(s).
        ligand_residue_blacklist: array-like, optional (default None)
            List of residues to ignore during ligand lookup.
        append_residues: array-like, optional (default None)
            List of residues to append to pocket, even if they are HETATM, such
            as MSE, ATP, AMP, ADP, etc.

    Returns
    -------
        pocket: rdkit.Chem.rdchem.RWMol
            Pocket constructed of protein residues/atoms around ligand
        ligand: rdkit.Chem.rdchem.RWMol
            Largest HETATM residue contained in input molecule
    """

tevang commented on June 25, 2024

Hi Maciek,

I wrote the following function in line with your suggestion, but I get an error.

import oddt
from oddt.fingerprints import PLEC
from oddt.toolkits.extras.rdkit.fixer import ExtractPocketAndLigand
from rdkit import Chem


def calc_PLEC(complex_pdb, ligand_resname='LIG', cutoff=12.0, depth_ligand=2, depth_protein=4, distance_cutoff=4.5,
                        size=16384, count_bits=True, sparse=False, ignore_hoh=True):
    mol = Chem.MolFromPDBFile(complex_pdb, sanitize=False)
    pocket, ligand = ExtractPocketAndLigand(mol, cutoff=cutoff, ligand_residue=ligand_resname)
    oddt_protein = oddt.toolkit.Molecule(pocket)
    oddt_ligand = oddt.toolkit.Molecule(ligand)
    return PLEC(oddt_ligand, oddt_protein, depth_ligand=depth_ligand, depth_protein=depth_protein,
                    distance_cutoff=distance_cutoff,
                    size=size, count_bits=count_bits, sparse=sparse, ignore_hoh=ignore_hoh)

calc_PLEC('complex_forscoring_frm.1.min.pdb')

The input PDB file is complex_forscoring_frm.1.min.pdb.gz, and the error I get is the following:

RuntimeError: Pre-condition Violation
	RingInfo not initialized
	Violation occurred on line 66 in file Code/GraphMol/RingInfo.cpp
	Failed Expression: df_init
	RDKIT: 2019.03.4
	BOOST: 1_70

Do I need to upgrade RDKit or is it due to something else?

mwojcikowski commented on June 25, 2024

@tevang It looks like you need to sanitize the molecule first, or at least generate the ring information with Chem.FastFindRings(...).
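
A minimal sketch of the two options, assuming RDKit is installed. The helper name ensure_ring_info is hypothetical (it is not part of ODDT or RDKit); Chem.SanitizeMol and Chem.FastFindRings are existing RDKit functions. In the calc_PLEC function above, something like this would be applied to pocket and ligand before wrapping them in oddt.toolkit.Molecule.

```python
from rdkit import Chem


def ensure_ring_info(mol, sanitize=True):
    """Initialise ring information on an RDKit molecule, in place.

    Hypothetical helper for illustration; not part of ODDT or RDKit.
    """
    if sanitize:
        # Full sanitization: includes valence checks, kekulization,
        # aromaticity perception and ring finding.
        Chem.SanitizeMol(mol)
    else:
        # Cheaper alternative: only populates the ring information,
        # without any chemistry checks.
        Chem.FastFindRings(mol)
    return mol


# Toy example: benzene parsed with sanitize=False, so no sanitization has run.
mol = Chem.MolFromSmiles("c1ccccc1", sanitize=False)
ensure_ring_info(mol, sanitize=False)
print(mol.GetRingInfo().NumRings())
```

Sanitization can fail on structures with unusual valences, which is common for raw PDB input; in that case the FastFindRings route at least lets ring queries run.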

tevang commented on June 25, 2024

@mwojcikowski I use the following workaround that - at least seemingly - does not require sanitization or ring generation. Does it produce correct PLEC fingerprints?

def split_complex_pdb(complex_pdb, ligand_resname='LIG'):
    # Lines whose residue name (columns 18-20) matches the ligand go to one file,
    # all remaining lines go to the other.
    protein_pdb = complex_pdb.replace('.pdb', '_prot.pdb')
    ligand_pdb = complex_pdb.replace('.pdb', '_lig.pdb')
    with open(complex_pdb, 'r') as f, \
            open(protein_pdb, 'w') as protein_f, \
            open(ligand_pdb, 'w') as ligand_f:
        for line in f:
            if line[17:20] == ligand_resname:
                ligand_f.write(line)
            else:
                protein_f.write(line)
    return protein_pdb, ligand_pdb


protein_pdb, ligand_pdb = split_complex_pdb(complex_pdb, ligand_resname)
oddt_protein = oddt.toolkit.Molecule(list(oddt.toolkit.readfile('pdb', protein_pdb, sanitize=False))[0])
oddt_ligand = oddt.toolkit.Molecule(list(oddt.toolkit.readfile('pdb', ligand_pdb, sanitize=False))[0])
plec_vector = PLEC(oddt_ligand, oddt_protein, depth_ligand=2, depth_protein=4, distance_cutoff=4.5,
                    size=16384, count_bits=True, sparse=False, ignore_hoh=True)
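
One caveat with splitting on columns 18-20 alone: lines that are not coordinate records (for example CONECT or END) all end up in the protein file. A minimal in-memory sketch that keeps only ATOM and HETATM records is shown below. split_pdb_lines is a hypothetical helper, and whether dropping the other records changes the resulting PLEC vector has not been checked here.

```python
def split_pdb_lines(lines, ligand_resname="LIG"):
    """Split PDB coordinate records into (protein_lines, ligand_lines).

    Hypothetical helper for illustration; not part of ODDT.
    """
    protein_lines, ligand_lines = [], []
    for line in lines:
        # Columns 1-6 hold the record name; skip non-coordinate records.
        if line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        # Columns 18-20 hold the residue name.
        if line[17:20].strip() == ligand_resname:
            ligand_lines.append(line)
        else:
            protein_lines.append(line)
    return protein_lines, ligand_lines
```

The two returned lists can then be written to the _prot.pdb and _lig.pdb files in the same way as before.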

tevang commented on June 25, 2024

@mwojcikowski essentially my query is whether I am getting correct PLEC vectors without sanitizing the ligands or generating rings with FastFindRings() as you stated. Otherwise, it works.

mwojcikowski commented on June 25, 2024

@tevang I would say that sanitization is highly recommended. That said, being consistent is the key for most use cases, so you might get away with it. Keep in mind that the input is also important (e.g. kekulization), which is why sanitization helps normalise the data.
