rdkit-orig's Introduction

The file Docs/SoftwareRequirements.txt documents the additional
software that is required to build or use the RDKit.

Build instructions are in the file INSTALL.
The most up-to-date build instructions can be found on the wiki:

Some information about using the software from Python is in the
"Getting Started in Python" document in Docs/Book

If you have questions or suggestions, please subscribe to the
rdkit-discuss mailing list:

Please see the file license.txt for details about the "New BSD"
license which covers this software and its associated data and

# $Id$
# Copyright (C) 2008-2010 Greg Landrum

rdkit-orig's Issues

Incorrect InChIs after clearing computed properties

(reported by Francis Atkinson)

from __future__ import print_function

from rdkit import Chem

  Marvin  02211109112D

 13 12  0  0  0  0            999 V2000
   -0.7607  -10.6459    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0457  -10.2343    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6692  -10.6459    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0
    1.3843  -10.2343    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
    2.0993  -10.6459    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0
    2.8142  -10.2343    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0
   -1.4740  -10.2352    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6692  -11.4731    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.3843   -9.4072    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.0993  -11.4731    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.5317  -10.6451    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.2440  -10.2326    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.8132   -9.4072    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  3  4  1  0  0  0  0
  4  9  1  1  0  0  0
  1  2  1  0  0  0  0
  5 10  1  1  0  0  0
  4  5  1  0  0  0  0
  6 11  1  0  0  0  0
 11 12  1  0  0  0  0
  5  6  1  0  0  0  0
  6 13  1  6  0  0  0
  2  3  1  0  0  0  0
  1  7  1  0  0  0  0
  3  8  1  1  0  0  0

main_mol = Chem.DeleteSubstructs(old_mol,Chem.MolFromSmiles('O=[Sb](=O)O'))



BaseFeatures_DIP2_NoMicroSpecies.fdef not parseable

In [7]: ffact = ChemicalFeatures.BuildFeatureFactory('./BaseFeatures_DIP2_NoMicrospecies.fdef')
ValueError                                Traceback (most recent call last)
<ipython-input-7-be7869918589> in <module>()
----> 1 ffact = ChemicalFeatures.BuildFeatureFactory('./BaseFeatures_DIP2_NoMicrospecies.fdef')

ValueError:  pattern->getNumAtoms() != len(feature weight vector)

aromatic Si written in SMILES, but cannot be read

In [2]: Chem.MolFromSmiles('Cc1cc[si](-c2cccc3ccc4cc5ccccc5cc4c32)[si](C)n1')
[04:48:35] SMILES Parse Error: syntax error for input: Cc1cc[si](-c2cccc3ccc4cc5ccccc5cc4c32)[si](C)n1

In [3]: Chem.MolFromSmiles('Cc1cc[Si](-c2cccc3ccc4cc5ccccc5cc4c32)[Si](C)n1')
Out[3]: <rdkit.Chem.rdchem.Mol at 0x242d440>

In [5]: Chem.CanonSmiles('C1=CC=CC=[Si]1')
Out[5]: 'c1cc[si]cc1'

Incorrect atom labels from BRICS

In [4]: m = Chem.MolFromSmiles('CCOC1(C)CCCCC1')

In [5]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(m),True)
Out[5]: '[3*]O[3*].[4*]CC.[4*]C1(C)CCCCC1'

(dupe of issue 287 to experiment with github issue tracking)

SDWriter failing with bad boost::any_cast on windows

[reported by Paul C]

In [10]: from rdkit import Chem

In [11]: from rdkit.Chem import Descriptors

In [12]: from rdkit.ML.Descriptors import MoleculeDescriptors

In [13]: m = Chem.MolFromSmiles('CC')

In [15]: nms=[x[0] for x in Descriptors._descList]

In [16]: calc = MoleculeDescriptors.MolecularDescriptorCalculator(nms)

In [17]: ds= calc.CalcDescriptors(m)

In [18]: w=Chem.SDWriter('blah.sdf')

In [19]: w.write(m)
RuntimeError                              Traceback (most recent call last)
<ipython-input-19-4b04ce05d7ef> in <module>()
----> 1 w.write(m)

RuntimeError: boost::bad_any_cast: failed conversion using boost::any_cast

Stereochemistry lost for reacting atoms that don't change connectivity

Reported by Robert Feinstein in this thread:[email protected]/msg02908.html

# Demo of RDKit reaction transform nuking stereocenters
from rdkit import Chem
from rdkit.Chem import AllChem

# Define simple transform that includes possible stereocenter ([C:2])
rxn = AllChem.ReactionFromSmarts('[C:2][C:1]=O>>[C:2][C:1]=S')

# React achiral mol as test
ps = rxn.RunReactants((Chem.MolFromSmiles('CC=O'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'CC=S'

# React mol with chiral center far removed
ps = rxn.RunReactants((Chem.MolFromSmiles('[Cl][C@H]([Br])CCCC=O'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'S=CCCC[C@H](Cl)Br'

# React mol with chiral center included in transform component
ps = rxn.RunReactants((Chem.MolFromSmiles('[Cl][C@H](C=O)'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'S=CCCl' - chirality has been lost.

Double bond stereochemistry not preserved in reactions.

Reported by Sabrina Syeda.
Thread here:[email protected]/msg03080.html

>>rxn = AllChem.ReactionFromSmarts('[CX4:4][CH1:3]=[CH1:2][CX4:5].[Br:1]>>[C:5][C:2]=[C:3][C:4][Br:1]')
>>r = [Chem.MolFromSmiles('CCC\C=C\C(C)C'), Chem.MolFromSmiles('Br')]
>>ps = rxn.RunReactants(tuple(r))
>> for p in ps:
    ...:     for m in p:
    ...:         print(Chem.MolToSmiles(m, isomericSmiles=True))
[out] CCC(Br)C=CC(C)C
[out] CCCC=CC(C)(C)Br

logging in module

inchi.MolFromInchi and inchi.MolToInchiAndAuxInfo contain a 'log(log)' statement. If the 'logLevel' argument is not None an error is produced.

improper behavior for empty SDMolSuppliers

This is reasonable:

[14]>>> s = Chem.SDMolSupplier()

[15]>>> s.SetData("")

StopIteration                             Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-16-5e5e6532ea26> in <module>()
----> 1

StopIteration: End of supplier hit

But this is bad:

[11]>>> s = Chem.SDMolSupplier()

[12]>>> s.SetData("")

[13]>>> len(s)
  [13]: 1

as is this:

[17]>>> s = Chem.SDMolSupplier()

[18]>>> s.SetData("")

[19]>>> s[0]


and this is incomprehensible:

[17]>>> s = Chem.SDMolSupplier()

[18]>>> s.SetData("")

[19]>>> s[0]

[20]>>> s[1]
IndexError                                Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-20-88de191fe097> in <module>()
----> 1 s[1]

IndexError: invalid index

[21]>>> s.SetData("")

[22]>>> s[0]

StopIteration                             Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-23-5e5e6532ea26> in <module>()
----> 1

StopIteration: End of supplier hit

[24]>>> len(s)
  [24]: 2

MolFragmentToSmiles generating non-canonical results

In [3]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(0,1,2))
Out[3]: 'Ccc'

In [4]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(1,2,3))
Out[4]: 'ccC'

In [5]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(1,3,2))
Out[5]: 'ccC'

Cannot generate coordinates for output from DeleteSubstructs

As the example shows, this came from problems with salt stripping and is not helped by sanitization.

In [15]: m = Chem.MolFromSmiles('[I-].C[n+]1c(\\C=C\\2/C=CC=CN2CC=C)sc3ccccc13') 

In [16]: sr = SaltRemover.SaltRemover()

In [17]: nm =sr(m)

In [18]: AllChem.Compute2DCoords(nm)

Pre-condition Violation

Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()

RuntimeError                              Traceback (most recent call last)
<ipython-input-18-fbe6edb91321> in <module>()
----> 1 AllChem.Compute2DCoords(nm)

RuntimeError: Pre-condition Violation

In [19]: Chem.SanitizeMol(nm)
Out[19]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [20]: AllChem.Compute2DCoords(nm)

Pre-condition Violation

Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()

RuntimeError                              Traceback (most recent call last)
<ipython-input-20-fbe6edb91321> in <module>()
----> 1 AllChem.Compute2DCoords(nm)

RuntimeError: Pre-condition Violation

In [21]: p = Chem.MolFromSmiles('[I-]')

In [22]: Chem.DeleteSubstructs(m,p)
Out[22]: <rdkit.Chem.rdchem.Mol at 0x3630980>

In [23]: nm2=Chem.DeleteSubstructs(m,p)

In [24]: AllChem.Compute2DCoords(nm2)

Pre-condition Violation

Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()

RuntimeError                              Traceback (most recent call last)
<ipython-input-24-c2a107eeb439> in <module>()
----> 1 AllChem.Compute2DCoords(nm2)

RuntimeError: Pre-condition Violation

In [25]: Chem.MolToSmiles(nm2,True)
Out[25]: 'C=CCN1C=CC=C/C1=C\\c1sc2ccccc2[n+]1C'

IndexError: invalid index to scalar variable when trying to calculate AUC metrics with rdkit

I want to calculate the area under the ROC curve using the RDKit implementation:

rdkit.ML.Scoring.Scoring.CalcAUC(scores, col)

Determines the area under the ROC curve


import rdkit.ML.Scoring.Scoring

rdkit.ML.Scoring.Scoring.CalcAUC(scores, y)

and I get the following error:

IndexError: invalid index to scalar variable.

my data:


array([32.336, 31.894, 31.74 , ..., -0.985, -1.629, -1.82 ])


array(['Inactive', 'Inactive', 'Inactive', ..., 'Inactive', 'Inactive','Inactive'], dtype=object)

I do not know what's wrong.
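
One possible reading of the error, offered as an assumption rather than a confirmed diagnosis: CalcAUC(scores, col) appears to expect `scores` to be a sequence of rows already sorted best-first, with `col` the integer index of the activity column. Passing a flat array of numbers would then make each "row" a scalar that cannot be indexed. The sketch below is plain Python, not RDKit code; the helper `auc_from_ranked_rows` and the four data points are hypothetical and only illustrate that row layout. Note also that a label string such as 'Inactive' is truthy in Python, so string labels need converting to booleans first.

```python
def auc_from_ranked_rows(rows, col):
    """Area under the ROC curve for rows sorted best-first.

    rows[i][col] must be truthy for actives and falsy for inactives.
    The AUC equals the fraction of (active, inactive) pairs in which
    the active is ranked above the inactive.
    """
    n_actives = sum(1 for row in rows if row[col])
    n_inactives = len(rows) - n_actives
    if n_actives == 0 or n_inactives == 0:
        raise ValueError("need at least one active and one inactive")
    actives_seen = 0
    pairs_won = 0
    for row in rows:
        if row[col]:
            actives_seen += 1
        else:
            # every active ranked above this inactive is a correctly ordered pair
            pairs_won += actives_seen
    return pairs_won / float(n_actives * n_inactives)


# Hypothetical data: one score and one string label per molecule.
scores = [32.336, 31.894, 31.74, -0.985]
labels = ['Active', 'Inactive', 'Active', 'Inactive']

# Build (score, is_active) rows and sort them best-first by score.
rows = sorted(
    ((s, lab == 'Active') for s, lab in zip(scores, labels)),
    key=lambda row: row[0],
    reverse=True,
)
auc = auc_from_ranked_rows(rows, 1)  # column 1 holds the activity flag
print(auc)
```

If the assumption about the expected layout holds, the same `rows` list and column index could be passed to the RDKit function instead of the two separate arrays.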

Bad ring query matches for molecules from MolFromSmarts

In [5]: Chem.MolFromSmiles('c:1:c:c:c:c:c1').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[5]: False

In [6]: Chem.MolFromSmarts('c:1:c:c:c:c:c1').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[6]: True

In [7]: Chem.MolFromSmarts('ccc').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[7]: True

MCS code does not support stereochemistry

Thread here:[email protected]/msg02934.html

In [2]: mol1 = Chem.MolFromSmiles("Fc1ccc(cc1)[C@@]3(OCc2cc(C#N)ccc23)CCCN(C)C") 
In [3]: mol2 = Chem.MolFromSmiles("Fc1ccc(cc1)[C@]3(OCc2cc(C#N)ccc23)CCCN(C)C")

In [4]: from rdkit.Chem import MCS

In [6]: MCS.FindMCS((mol1,mol2))
Out[6]: MCSResult(numAtoms=24, numBonds=26, smarts='[F]-[#6]:1:[#6]:[#6]:[#6](-[#6]-2(-[#6]-[#6]-[#6]-[#7](-[#6])-[#6])-[#8]-[#6]-[#6]:3:[#6]:[#6](:[#6]:[#6]:[#6]:3-2)-[#6]#[#7]):[#6]:[#6]:1', completed=1)

SDWriter initialized on a file object can produce an unhandled C++ exception

Not calling the flush method of an SDWriter initialized on a file object may produce an unhandled exception and terminate the interpreter:

 In [7]: with open('xyz.sdf', 'w') as xyz:
    ...:     w = Chem.SDWriter(xyz)
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
 terminate called after throwing an instance of 'boost::python::error_already_set' 

Hashed topological torsion fingerprints not compatible with old version.


In [9]: AllChem.GetHashedTopologicalTorsionFingerprint(Chem.MolFromSmiles('CCCCO'),nBits=4192).GetNonzeroElements()
Out[9]: {544: 1, 1760: 1}


In [3]: AllChem.GetHashedTopologicalTorsionFingerprint(Chem.MolFromSmiles('CCCCO'),nBits=4192).GetNonzeroElements()
Out[3]: {1974: 1, 3516: 1}

There's no good reason for this to be the case.

MolFromInchi doesn't work

I am using Python 2.7.3.
from rdkit import Chem
m2 = Chem.inchi.MolFromInchi('InChI=1S/C10H9N3O/c1-7-11-10(14)9(13-12-7)8-5-3-2-4-6-8/h2-6H,1H3,(H,11,12,14)')
I got
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'module' object has no attribute 'MolFromInchi'
But if I use MolFromSmiles
from rdkit import Chem
m2 = Chem.MolFromSmiles('C1CCC1')
It works.

Transform3D#SetRotation() around arbitrary axis scales coordinates

The method Transform3D#SetRotation(double, Point3D) scales the co-ordinates if the axis vector Point3D is not a unit vector:

/*! \brief set the rotation matrix
* The rotation matrix is set to rotation by the specified angle
* about an arbitrary axis
*/
void SetRotation(double angle, const Point3D &axis);

Either the documentation should state that the axis must be a unit vector, or the method should normalize the axis itself.

This can be demonstrated in the Java wrapper as follows:

String molBlock = "Single Atom should be invariate\n     RDKit          3D\n\n  1  0  0  0  0  0  0  0  0  0999 V2000\n"+
"    1.0000    1.0000    1.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\nM  END\n\n$$$$";
ROMol mol = RWMol.MolFromMolBlock(molBlock);

// An axis around which the molecule should be invariant:
Point3D axis = new Point3D(1.0, 1.0, 1.0);

//The transform
Transform3D t3d = new Transform3D();
t3d.SetRotation(Math.PI, axis);

//Apply it...



Single Atom should be invariate
     RDKit          3D

  1  0  0  0  0  0  0  0  0  0999 V2000
    3.0000    3.0000    3.0000 C   0  0  0  0  0  0  0  0  0  0  0  0

The result can be corrected in the calling code, but the documentation does not make it obvious that this is required, e.g.:

// An axis around which the molecule should be invariant:
Point3D axis = new Point3D(1.0, 1.0, 1.0);
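
The effect of a non-unit axis can be sketched without RDKit. The code below uses the textbook axis-angle (Rodrigues) rotation matrix in plain Python; it is an illustration under that assumption, not RDKit's implementation, so the exact scale factor it produces need not match the coordinates quoted above. The names `rotation_matrix` and `transform_point` are hypothetical helpers.

```python
import math


def rotation_matrix(angle, axis, normalize=False):
    """3x3 axis-angle rotation matrix (textbook Rodrigues form).

    The formula is only a pure rotation when the axis has unit length;
    set normalize=True to rescale the axis first.
    """
    x, y, z = axis
    if normalize:
        length = math.sqrt(x * x + y * y + z * z)
        x, y, z = x / length, y / length, z / length
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def transform_point(matrix, point):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m * p for m, p in zip(row, point)) for row in matrix)


point = (1.0, 1.0, 1.0)
axis = (1.0, 1.0, 1.0)  # not a unit vector

# A point lying on the rotation axis should not move.
raw = transform_point(rotation_matrix(math.pi, axis), point)
fixed = transform_point(rotation_matrix(math.pi, axis, normalize=True), point)
print(raw)
print(fixed)
```

With the axis normalized first, the point on the axis stays where it is; without normalization the same formula moves it along the axis, which is the kind of scaling described in this report.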

InChI generation code not recognizing stereo

reported by Jan Holst Jensen

> For example: InChI strings generated for spiro.mol (spiro.mol - attached):
> InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3/t2*7-,8-,9-/m10/s1
> RDKit: InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3

This one still doesn't recognize the stereo. I'll file a bug for it:
In [2]: Chem.MolToInchi(Chem.MolFromMolFile('spiro.mol'))
[09:53:16] WARNING: Omitted undefined stereo
Out[2]: 'InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3'

Here's the file:


 22 24  0  0  0  0  0  0  0  0  1 V2000
    9.2912   -9.4308    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.8009   -6.9967    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    7.7063   -7.9840    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.8986   -9.1405    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    9.1531   -6.3991    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   10.5689   -7.8459    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   11.7408   -6.2812    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   11.8789   -9.3129    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   13.3257   -7.7280    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.1335   -6.5716    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
   15.2311   -8.7153    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    9.6519  -14.7621    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    6.1616  -12.3280    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    8.0670  -13.3153    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    6.2593  -14.4717    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    9.5138  -11.7304    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   10.9296  -13.1772    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   12.1015  -11.6125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   12.2396  -14.6442    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   13.6864  -13.0593    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.4941  -11.9029    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.5918  -14.0466    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
  3  2  1  0  0  0  0
  3  4  1  1  0  0  0
  1  3  1  0  0  0  0
  5  3  1  0  0  0  0
  6  1  1  0  0  0  0
  5  6  1  0  0  0  0
  7  6  1  0  0  0  0
  6  8  1  1  0  0  0
  9  7  1  0  0  0  0
  8  9  1  0  0  0  0
 10  9  1  0  0  0  0
  9 11  1  1  0  0  0
 14 13  1  0  0  0  0
 14 15  1  1  0  0  0
 12 14  1  0  0  0  0
 16 14  1  0  0  0  0
 17 12  1  0  0  0  0
 16 17  1  0  0  0  0
 18 17  1  0  0  0  0
 17 19  1  1  0  0  0
 20 18  1  0  0  0  0
 19 20  1  0  0  0  0
 21 20  1  0  0  0  0
 20 22  1  1  0  0  0
>  <NAME>

