
rdkit-orig's Introduction

The file Docs/SoftwareRequirements.txt documents the additional
software that is required to build or use the RDKit.

Build instructions are in the file INSTALL.
The most up-to-date build instructions can be found on the wiki:
http://code.google.com/p/rdkit/wiki/GettingStarted

Some information about using the software from Python is in the
"Getting Started in Python" document in Docs/Book

If you have questions or suggestions, please subscribe to the
rdkit-discuss mailing list: 
http://lists.sourceforge.net/mailman/listinfo/rdkit-discuss

Please see the file license.txt for details about the "New BSD"
license which covers this software and its associated data and
documents. 

# $Id$
# Copyright (C) 2008-2010 Greg Landrum

rdkit-orig's Issues

Incorrect InChIs after clearing computed properties

(reported by Francis Atkinson)

from __future__ import print_function

from rdkit import Chem

old_mol=Chem.MolFromMolBlock("""
  Marvin  02211109112D

 13 12  0  0  0  0            999 V2000
   -0.7607  -10.6459    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0457  -10.2343    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6692  -10.6459    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0
    1.3843  -10.2343    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
    2.0993  -10.6459    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0
    2.8142  -10.2343    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0
   -1.4740  -10.2352    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6692  -11.4731    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.3843   -9.4072    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.0993  -11.4731    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.5317  -10.6451    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.2440  -10.2326    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.8132   -9.4072    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  3  4  1  0  0  0  0
  4  9  1  1  0  0  0
  1  2  1  0  0  0  0
  5 10  1  1  0  0  0
  4  5  1  0  0  0  0
  6 11  1  0  0  0  0
 11 12  1  0  0  0  0
  5  6  1  0  0  0  0
  6 13  1  6  0  0  0
  2  3  1  0  0  0  0
  1  7  1  0  0  0  0
  3  8  1  1  0  0  0
M  END
""")



main_mol = Chem.DeleteSubstructs(old_mol,Chem.MolFromSmiles('O=[Sb](=O)O'))
main_mol.ClearComputedProps()
Chem.SanitizeMol(main_mol)
print(Chem.MolToSmiles(old_mol,True))
print(Chem.MolToSmiles(main_mol,True))

old_mol.Debug()
main_mol.Debug()

print(Chem.MolToInchi(old_mol))
print(Chem.MolToInchi(main_mol))


BaseFeatures_DIP2_NoMicroSpecies.fdef not parseable

In [7]: ffact = ChemicalFeatures.BuildFeatureFactory('./BaseFeatures_DIP2_NoMicrospecies.fdef')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-be7869918589> in <module>()
----> 1 ffact = ChemicalFeatures.BuildFeatureFactory('./BaseFeatures_DIP2_NoMicrospecies.fdef')

ValueError:  pattern->getNumAtoms() != len(feature weight vector)

aromatic Si written in SMILES, but cannot be read

In [2]: Chem.MolFromSmiles('Cc1cc[si](-c2cccc3ccc4cc5ccccc5cc4c32)[si](C)n1')
[04:48:35] SMILES Parse Error: syntax error for input: Cc1cc[si](-c2cccc3ccc4cc5ccccc5cc4c32)[si](C)n1

In [3]: Chem.MolFromSmiles('Cc1cc[Si](-c2cccc3ccc4cc5ccccc5cc4c32)[Si](C)n1')
Out[3]: <rdkit.Chem.rdchem.Mol at 0x242d440>

In [5]: Chem.CanonSmiles('C1=CC=CC=[Si]1')
Out[5]: 'c1cc[si]cc1'

Incorrect atom labels from BRICS

In [4]: m = Chem.MolFromSmiles('CCOC1(C)CCCCC1')

In [5]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(m),True)
Out[5]: '[3*]O[3*].[4*]CC.[4*]C1(C)CCCCC1'

(dupe of sf.net issue 287 to experiment with github issue tracking)

SDWriter failing with bad boost::any_cast on windows

[reported by Paul C]

In [10]: from rdkit import Chem

In [11]: from rdkit.Chem import Descriptors

In [12]: from rdkit.ML.Descriptors import MoleculeDescriptors

In [13]: m = Chem.MolFromSmiles('CC')

In [15]: nms=[x[0] for x in Descriptors._descList]

In [16]: calc = MoleculeDescriptors.MolecularDescriptorCalculator(nms)

In [17]: ds= calc.CalcDescriptors(m)

In [18]: w=Chem.SDWriter('blah.sdf')

In [19]: w.write(m)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-19-4b04ce05d7ef> in <module>()
----> 1 w.write(m)

RuntimeError: boost::bad_any_cast: failed conversion using boost::any_cast

Stereochemistry lost for reacting atoms that don't change connectivity

Reported by Robert Feinstein in this thread: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02908.html

# Demo of RDKit reaction transform nuking stereocenters
from rdkit import Chem
from rdkit.Chem import AllChem

# Define simple transform that includes possible stereocenter ([C:2])
rxn = AllChem.ReactionFromSmarts('[C:2][C:1]=O>>[C:2][C:1]=S')

# React achiral mol as test
ps = rxn.RunReactants((Chem.MolFromSmiles('CC=O'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'CC=S'

# React mol with chiral center far removed
ps = rxn.RunReactants((Chem.MolFromSmiles('[Cl][C@H]([Br])CCCC=O'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'S=CCCC[C@H](Cl)Br'

# React mol with chiral center included in transform component
ps = rxn.RunReactants((Chem.MolFromSmiles('[Cl][C@H](C=O)'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'S=CCCl' - chirality has been lost.

Double bond stereochemistry not preserved in reactions.

Reported by Sabrina Syeda.
Thread here: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03080.html

>>rxn = AllChem.ReactionFromSmarts('[CX4:4][CH1:3]=[CH1:2][CX4:5].[Br:1]>>[C:5][C:2]=[C:3][C:4][Br:1]')
>>rxn.Initialize()
>>r = [Chem.MolFromSmiles('CCC\C=C\C(C)C'), Chem.MolFromSmiles('Br')]
>>ps = rxn.RunReactants(tuple(r))
>> for p in ps:
    ...:     for m in p:
    ...:         print(Chem.MolToSmiles(m, isomericSmiles=True))
    ...:         
[out] CCC(Br)C=CC(C)C
[out] CCCC=CC(C)(C)Br

logging in inchi.py module

inchi.MolFromInchi and inchi.MolToInchiAndAuxInfo both contain a 'log(log)' statement. If the 'logLevel' argument is not None, that statement is executed and raises an error.

improper behavior for empty SDMolSuppliers

This is reasonable:

[14]>>> s = Chem.SDMolSupplier()

[15]>>> s.SetData("")

[16]>>> s.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-16-5e5e6532ea26> in <module>()
----> 1 s.next()

StopIteration: End of supplier hit

But this is bad:

[11]>>> s = Chem.SDMolSupplier()

[12]>>> s.SetData("")

[13]>>> len(s)
  [13]: 1

as is this:

[17]>>> s = Chem.SDMolSupplier()

[18]>>> s.SetData("")

[19]>>> s[0]

[20]>>>

and this is incomprehensible:

[17]>>> s = Chem.SDMolSupplier()

[18]>>> s.SetData("")

[19]>>> s[0]

[20]>>> s[1]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-20-88de191fe097> in <module>()
----> 1 s[1]

IndexError: invalid index

[21]>>> s.SetData("")

[22]>>> s[0]

[23]>>> s.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-23-5e5e6532ea26> in <module>()
----> 1 s.next()

StopIteration: End of supplier hit

[24]>>> len(s)
  [24]: 2

MolFragmentToSmiles generating non-canonical results

In [3]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(0,1,2))
Out[3]: 'Ccc'

In [4]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(1,2,3))
Out[4]: 'ccC'

In [5]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(1,3,2))
Out[5]: 'ccC'

Cannot generate coordinates for output from DeleteSubstructs

As the example shows, this came from problems with salt stripping and is not helped by sanitization.

In [15]: m = Chem.MolFromSmiles('[I-].C[n+]1c(\\C=C\\2/C=CC=CN2CC=C)sc3ccccc13') 

In [16]: sr = SaltRemover.SaltRemover()

In [17]: nm =sr(m)

In [18]: AllChem.Compute2DCoords(nm)
[05:47:08] 

****
Pre-condition Violation

Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()
****

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-18-fbe6edb91321> in <module>()
----> 1 AllChem.Compute2DCoords(nm)

RuntimeError: Pre-condition Violation

In [19]: Chem.SanitizeMol(nm)
Out[19]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [20]: AllChem.Compute2DCoords(nm)
[05:47:18] 

****
Pre-condition Violation

Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()
****

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-20-fbe6edb91321> in <module>()
----> 1 AllChem.Compute2DCoords(nm)

RuntimeError: Pre-condition Violation

In [21]: p = Chem.MolFromSmiles('[I-]')

In [22]: Chem.DeleteSubstructs(m,p)
Out[22]: <rdkit.Chem.rdchem.Mol at 0x3630980>

In [23]: nm2=Chem.DeleteSubstructs(m,p)

In [24]: AllChem.Compute2DCoords(nm2)
[05:47:57] 

****
Pre-condition Violation

Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()
****

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-24-c2a107eeb439> in <module>()
----> 1 AllChem.Compute2DCoords(nm2)

RuntimeError: Pre-condition Violation

In [25]: Chem.MolToSmiles(nm2,True)
Out[25]: 'C=CCN1C=CC=C/C1=C\\c1sc2ccccc2[n+]1C'

IndexError: invalid index to scalar variable when trying to calculate AUC metrics with rdkit

I want to calculate the area under the ROC curve using the RDKit implementation:

rdkit.ML.Scoring.Scoring.CalcAUC(scores, col)

Determines the area under the ROC curve

code:

import rdkit.ML.Scoring.Scoring

rdkit.ML.Scoring.Scoring.CalcAUC(scores, y)

and I get the following error:

IndexError: invalid index to scalar variable.

my data:

scores

array([32.336, 31.894, 31.74 , ..., -0.985, -1.629, -1.82 ])

y

array(['Inactive', 'Inactive', 'Inactive', ..., 'Inactive', 'Inactive','Inactive'], dtype=object)

I do not know what's wrong.
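
A likely cause, assuming CalcAUC reads its input as scores[i][col]: it expects a list of rows already sorted best-first, where col is the integer position of a truthy activity flag inside each row, rather than a flat score array plus a separate label array. The sketch below is plain Python and is not RDKit's code; rows_for_auc and auc_from_ranked_rows are hypothetical helper names, and the label string 'Active' is an assumption based on the 'Inactive' labels shown above.

```python
def rows_for_auc(scores, labels, active_label="Active"):
    """Pair each score with a boolean activity flag and sort best score first."""
    rows = [(float(s), lab == active_label) for s, lab in zip(scores, labels)]
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


def auc_from_ranked_rows(rows, col):
    """Trapezoidal area under the ROC curve for rows sorted best-first.

    `col` is the index of the activity flag within each row.
    """
    n_active = sum(1 for row in rows if row[col])
    n_inactive = len(rows) - n_active
    if n_active == 0 or n_inactive == 0:
        raise ValueError("need at least one active and one inactive row")

    true_pos = false_pos = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    for row in rows:
        if row[col]:
            true_pos += 1
        else:
            false_pos += 1
        tpr = true_pos / n_active
        fpr = false_pos / n_inactive
        # Trapezoid between the previous and the current ROC point.
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
    return area


scores = [3.0, 2.0, 1.0, 0.5]
labels = ["Active", "Inactive", "Active", "Inactive"]
rows = rows_for_auc(scores, labels)
print(auc_from_ranked_rows(rows, col=1))  # 0.75
```

If the assumption about the input layout holds, the same rows could be passed as rdkit.ML.Scoring.Scoring.CalcAUC(rows, 1); the value may differ slightly from this sketch depending on how the end points of the curve are handled.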

Bad ring query matches for molecules from MolFromSmarts

In [5]: Chem.MolFromSmiles('c:1:c:c:c:c:c1').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[5]: False

In [6]: Chem.MolFromSmarts('c:1:c:c:c:c:c1').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[6]: True

In [7]: Chem.MolFromSmarts('ccc').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[7]: True

MCS code does not support stereochemistry

Thread here: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02934.html

In [2]: mol1 = Chem.MolFromSmiles("Fc1ccc(cc1)[C@@]3(OCc2cc(C#N)ccc23)CCCN(C)C") 
In [3]: mol2 = Chem.MolFromSmiles("Fc1ccc(cc1)[C@]3(OCc2cc(C#N)ccc23)CCCN(C)C")

In [4]: from rdkit.Chem import MCS

In [6]: MCS.FindMCS((mol1,mol2))
Out[6]: MCSResult(numAtoms=24, numBonds=26, smarts='[F]-[#6]:1:[#6]:[#6]:[#6](-[#6]-2(-[#6]-[#6]-[#6]-[#7](-[#6])-[#6])-[#8]-[#6]-[#6]:3:[#6]:[#6](:[#6]:[#6]:[#6]:3-2)-[#6]#[#7]):[#6]:[#6]:1', completed=1)

SDWriter initialized on a file object can produce an unhandled C++ exception

Not calling the flush method of an SDWriter initialized on a file object may produce an unhandled exception and terminate the interpreter:

 In [7]: with open('xyz.sdf', 'w') as xyz:
    ...:     w = Chem.SDWriter(xyz)
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     w.write(Chem.MolFromSmiles('c1ccccc1'))
    ...:     
 terminate called after throwing an instance of 'boost::python::error_already_set' 
 Aborted

Hashed topological torsion fingerprints not compatible with old version.

2012_12_1:

In [9]: AllChem.GetHashedTopologicalTorsionFingerprint(Chem.MolFromSmiles('CCCCO'),nBits=4192).GetNonzeroElements()
Out[9]: {544: 1, 1760: 1}

2013_03_1:

In [3]: AllChem.GetHashedTopologicalTorsionFingerprint(Chem.MolFromSmiles('CCCCO'),nBits=4192).GetNonzeroElements()
Out[3]: {1974: 1, 3516: 1}

There's no good reason for this to be the case.

MolFromInchi doesn't work

I am using Python 2.7.3
from rdkit import Chem
m2 = Chem.inchi.MolFromInchi('InChI=1S/C10H9N3O/c1-7-11-10(14)9(13-12-7)8-5-3-2-4-6-8/h2-6H,1H3,(H,11,12,14)')
I got
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'module' object has no attribute 'MolFromInchi'
But if I use MolFromSmiles
from rdkit import Chem
m2 = Chem.MolFromSmiles('C1CCC1')
It works.

Transform3D#SetRotation() around arbitrary axis scales coordinates

The method Transform3D#SetRotation(double, Point3D) scales the co-ordinates if the axis vector Point3D is not a unit vector:

/*! \brief set the rotation matrix
*
* The rotation matrix is set to rotation by the specified angle
* about an arbitrary axis
*/
void SetRotation(double angle, const Point3D &axis);

Either the documentation should state this requirement, or the method should normalize the axis itself.

This can be demonstrated in the Java wrapper as follows:

String molBlock = "Single Atom should be invariate\n     RDKit          3D\n\n  1  0  0  0  0  0  0  0  0  0999 V2000\n"+
"    1.0000    1.0000    1.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\nM  END\n\n$$$$";
ROMol mol = RWMol.MolFromMolBlock(molBlock);

// An axis around which the molecule should be invariant:
Point3D axis = new Point3D(1.0, 1.0, 1.0);

//The transform
Transform3D t3d = new Transform3D();
t3d.SetRotation(Math.PI, axis);

//Apply it...
mol.transformMolsAtoms(t3d);

System.out.println(mol.MolToMolBlock());

Gives:

Single Atom should be invariate
     RDKit          3D

  1  0  0  0  0  0  0  0  0  0999 V2000
    3.0000    3.0000    3.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END

The result can be corrected in the calling code by normalizing the axis first, but the documentation does not make it obvious that this is required, e.g.:

...
// An axis around which the molecule should be invariant:
Point3D axis = new Point3D(1.0, 1.0, 1.0);
axis.normalize();
...
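
The normalization requirement can be illustrated independently of RDKit. The sketch below builds the textbook axis-angle (Rodrigues) rotation matrix in plain Python; rotation_matrix and apply_matrix are hypothetical names, and nothing here is taken from the Transform3D source, so the unnormalized numbers need not match the output reported above. With the axis normalized, a point lying on the axis stays where it is; without normalization the matrix is no longer a pure rotation.

```python
import math


def rotation_matrix(angle, axis, normalize=True):
    """Return a 3x3 axis-angle rotation matrix as nested lists."""
    x, y, z = axis
    if normalize:
        norm = math.sqrt(x * x + y * y + z * z)
        if norm == 0.0:
            raise ValueError("rotation axis must be non-zero")
        x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def apply_matrix(matrix, point):
    """Multiply a 3x3 matrix by a 3-component point."""
    return tuple(sum(m * p for m, p in zip(row, point)) for row in matrix)


axis = (1.0, 1.0, 1.0)
point = (1.0, 1.0, 1.0)

# Normalized axis: the point on the axis is (numerically) unchanged.
print(apply_matrix(rotation_matrix(math.pi, axis), point))

# Unnormalized axis: the same point is moved along the axis, i.e. scaled.
print(apply_matrix(rotation_matrix(math.pi, axis, normalize=False), point))
```

Normalizing inside the function costs one square root per call and removes the hidden precondition on the caller.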

InChI generation code not recognizing stereo

reported by Jan Holst Jensen

> For example: InChI strings generated for spiro.mol (spiro.mol - attached):
>
> IUPAC:
> InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3/t2*7-,8-,9-/m10/s1
> RDKit: InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3

This one still doesn't recognize the stereo. I'll file a bug for it:
In [2]: Chem.MolToInchi(Chem.MolFromMolFile('spiro.mol'))
[09:53:16] WARNING: Omitted undefined stereo
Out[2]: 'InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3'

Here's the file:

spiro.mol
  ACD/Labs0709041010  

 22 24  0  0  0  0  0  0  0  0  1 V2000
    9.2912   -9.4308    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.8009   -6.9967    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    7.7063   -7.9840    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.8986   -9.1405    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    9.1531   -6.3991    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   10.5689   -7.8459    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   11.7408   -6.2812    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   11.8789   -9.3129    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   13.3257   -7.7280    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.1335   -6.5716    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
   15.2311   -8.7153    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    9.6519  -14.7621    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    6.1616  -12.3280    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    8.0670  -13.3153    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    6.2593  -14.4717    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    9.5138  -11.7304    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   10.9296  -13.1772    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   12.1015  -11.6125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   12.2396  -14.6442    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   13.6864  -13.0593    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.4941  -11.9029    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.5918  -14.0466    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
  3  2  1  0  0  0  0
  3  4  1  1  0  0  0
  1  3  1  0  0  0  0
  5  3  1  0  0  0  0
  6  1  1  0  0  0  0
  5  6  1  0  0  0  0
  7  6  1  0  0  0  0
  6  8  1  1  0  0  0
  9  7  1  0  0  0  0
  8  9  1  0  0  0  0
 10  9  1  0  0  0  0
  9 11  1  1  0  0  0
 14 13  1  0  0  0  0
 14 15  1  1  0  0  0
 12 14  1  0  0  0  0
 16 14  1  0  0  0  0
 17 12  1  0  0  0  0
 16 17  1  0  0  0  0
 18 17  1  0  0  0  0
 17 19  1  1  0  0  0
 20 18  1  0  0  0  0
 19 20  1  0  0  0  0
 21 20  1  0  0  0  0
 20 22  1  1  0  0  0
M  END
>  <NAME>
spiro 

$$$$
