rdkit / rdkit-orig
Older clone of the RDKit subversion repository at http://sourceforge.net/projects/rdkit/
License: Other
The file Docs/SoftwareRequirements.txt documents the additional software that is required to build or use the RDKit.
Build instructions are in the file INSTALL. The most up-to-date build instructions can be found on the wiki: http://code.google.com/p/rdkit/wiki/GettingStarted
Some information about using the software from Python is in the "Getting Started in Python" document in Docs/Book.
If you have questions or suggestions, please subscribe to the rdkit-discuss mailing list: http://lists.sourceforge.net/mailman/listinfo/rdkit-discuss
Please see the file license.txt for details about the "New BSD" license which covers this software and its associated data and documents.
# $Id$
# Copyright (C) 2008-2010 Greg Landrum
In [3]: m = Chem.MolFromSmiles('CB(O)O')
In [4]: from rdkit.Chem import AllChem
In [5]: AllChem.ComputeGasteigerCharges(m)
In [6]: for a in m.GetAtoms(): print a.GetProp('_GasteigerCharge')
-nan
-nan
-nan
-nan
(reported by Francis Atkinson)
from __future__ import print_function
from rdkit import Chem
old_mol=Chem.MolFromMolBlock("""
Marvin 02211109112D
13 12 0 0 0 0 999 V2000
-0.7607 -10.6459 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.0457 -10.2343 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.6692 -10.6459 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0
1.3843 -10.2343 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0
2.0993 -10.6459 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0
2.8142 -10.2343 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0
-1.4740 -10.2352 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.6692 -11.4731 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
1.3843 -9.4072 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
2.0993 -11.4731 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
3.5317 -10.6451 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.2440 -10.2326 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
2.8132 -9.4072 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
3 4 1 0 0 0 0
4 9 1 1 0 0 0
1 2 1 0 0 0 0
5 10 1 1 0 0 0
4 5 1 0 0 0 0
6 11 1 0 0 0 0
11 12 1 0 0 0 0
5 6 1 0 0 0 0
6 13 1 6 0 0 0
2 3 1 0 0 0 0
1 7 1 0 0 0 0
3 8 1 1 0 0 0
M END
""")
main_mol = Chem.DeleteSubstructs(old_mol,Chem.MolFromSmiles('O=[Sb](=O)O'))
main_mol.ClearComputedProps()
Chem.SanitizeMol(main_mol)
print(Chem.MolToSmiles(old_mol,True))
print(Chem.MolToSmiles(main_mol,True))
old_mol.Debug()
main_mol.Debug()
print(Chem.MolToInchi(old_mol))
print(Chem.MolToInchi(main_mol))
In [7]: ffact = ChemicalFeatures.BuildFeatureFactory('./BaseFeatures_DIP2_NoMicrospecies.fdef')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-7-be7869918589> in <module>()
----> 1 ffact = ChemicalFeatures.BuildFeatureFactory('./BaseFeatures_DIP2_NoMicrospecies.fdef')
ValueError: pattern->getNumAtoms() != len(feature weight vector)
In [2]: Chem.MolFromSmiles('Cc1cc[si](-c2cccc3ccc4cc5ccccc5cc4c32)[si](C)n1')
[04:48:35] SMILES Parse Error: syntax error for input: Cc1cc[si](-c2cccc3ccc4cc5ccccc5cc4c32)[si](C)n1
In [3]: Chem.MolFromSmiles('Cc1cc[Si](-c2cccc3ccc4cc5ccccc5cc4c32)[Si](C)n1')
Out[3]: <rdkit.Chem.rdchem.Mol at 0x242d440>
In [5]: Chem.CanonSmiles('C1=CC=CC=[Si]1')
Out[5]: 'c1cc[si]cc1'
The hard bits are already there in $RDBASE/rdkit/Chem/ChemUtils/TemplateExpand.py, it just needs a more user-friendly API and an example or two.
This would be a good one from the cookbook based on the contents of this thread:
http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02931.html
Request from Roger Sayle.
Idea is to remove all atoms, bonds, and properties.
In [4]: m = Chem.MolFromSmiles('CCOC1(C)CCCCC1')
In [5]: Chem.MolToSmiles(BRICS.BreakBRICSBonds(m),True)
Out[5]: '[3*]O[3*].[4*]CC.[4*]C1(C)CCCCC1'
(dupe of sf.net issue 287 to experiment with github issue tracking)
[reported by Paul C]
In [10]: from rdkit import Chem
In [11]: from rdkit.Chem import Descriptors
In [12]: from rdkit.ML.Descriptors import MoleculeDescriptors
In [13]: m = Chem.MolFromSmiles('CC')
In [15]: nms=[x[0] for x in Descriptors._descList]
In [16]: calc = MoleculeDescriptors.MolecularDescriptorCalculator(nms)
In [17]: ds= calc.CalcDescriptors(m)
In [18]: w=Chem.SDWriter('blah.sdf')
In [19]: w.write(m)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-19-4b04ce05d7ef> in <module>()
----> 1 w.write(m)
RuntimeError: boost::bad_any_cast: failed conversion using boost::any_cast
In [7]: ps = rxn2.RunReactants((Chem.MolFromSmiles('F[C@H](Cl)(Br)I'),))
[05:01:37] Explicit valence for atom # 1 C, 5, is greater than permitted
Segmentation fault (core dumped)
Reported by Robert Feinstein in this thread: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02908.html
# Demo of RDKit reaction transform nuking stereocenters
from rdkit import Chem
from rdkit.Chem import AllChem
# Define simple transform that includes possible stereocenter ([C:2])
rxn = AllChem.ReactionFromSmarts('[C:2][C:1]=O>>[C:2][C:1]=S')
# React achiral mol as test
ps = rxn.RunReactants((Chem.MolFromSmiles('CC=O'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'CC=S'
# React mol with chiral center far removed
ps = rxn.RunReactants((Chem.MolFromSmiles('[Cl][C@H]([Br])CCCC=O'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'S=CCCC[C@H](Cl)Br'
# React mol with chiral center included in transform component
ps = rxn.RunReactants((Chem.MolFromSmiles('[Cl][C@H](C=O)'),))
Chem.MolToSmiles( ps[0][0], isomericSmiles=True )
# Output is 'S=CCCl' - chirality has been lost.
reported by Andrew Dalke
(reported by Riccardo Vianello)
There is a set of header files that "make install" does not copy into the install directory when not doing an in-tree install.
Reported by Sabrina Syeda.
Thread here: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03080.html
>>rxn = AllChem.ReactionFromSmarts('[CX4:4][CH1:3]=[CH1:2][CX4:5].[Br:1]>>[C:5][C:2]=[C:3][C:4][Br:1]')
>>rxn.Initialize()
>>r = [Chem.MolFromSmiles('CCC\C=C\C(C)C'), Chem.MolFromSmiles('Br')]
>>ps = rxn.RunReactants(tuple(r))
>> for p in ps:
...: for m in p:
...: print Chem.MolToSmiles(m, isomericSmiles= True)
...:
[out] CCC(Br)C=CC(C)C
[out] CCCC=CC(C)(C)Br
inchi.MolFromInchi and inchi.MolToInchiAndAuxInfo contain a 'log(log)' statement. If the 'logLevel' argument is not None an error is produced.
This is reasonable:
[14]>>> s = Chem.SDMolSupplier()
[15]>>> s.SetData("")
[16]>>> s.next()
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-16-5e5e6532ea26> in <module>()
----> 1 s.next()
StopIteration: End of supplier hit
But this is bad:
[11]>>> s = Chem.SDMolSupplier()
[12]>>> s.SetData("")
[13]>>> len(s)
[13]: 1
as is this:
[17]>>> s = Chem.SDMolSupplier()
[18]>>> s.SetData("")
[19]>>> s[0]
[20]>>>
and this is incomprehensible:
[17]>>> s = Chem.SDMolSupplier()
[18]>>> s.SetData("")
[19]>>> s[0]
[20]>>> s[1]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-20-88de191fe097> in <module>()
----> 1 s[1]
IndexError: invalid index
[21]>>> s.SetData("")
[22]>>> s[0]
[23]>>> s.next()
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
/scratch/RDKit_sf/<ipython-input-23-5e5e6532ea26> in <module>()
----> 1 s.next()
StopIteration: End of supplier hit
[24]>>> len(s)
[24]: 2
In [3]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(0,1,2))
Out[3]: 'Ccc'
In [4]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(1,2,3))
Out[4]: 'ccC'
In [5]: Chem.MolFragmentToSmiles(Chem.MolFromSmiles('c1c(C)cccc1'),(1,3,2))
Out[5]: 'ccC'
As the example shows, this came from problems with salt stripping and is not helped by sanitization.
In [15]: m = Chem.MolFromSmiles('[I-].C[n+]1c(\\C=C\\2/C=CC=CN2CC=C)sc3ccccc13')
In [16]: sr = SaltRemover.SaltRemover()
In [17]: nm =sr(m)
In [18]: AllChem.Compute2DCoords(nm)
[05:47:08]
****
Pre-condition Violation
Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()
****
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-18-fbe6edb91321> in <module>()
----> 1 AllChem.Compute2DCoords(nm)
RuntimeError: Pre-condition Violation
In [19]: Chem.SanitizeMol(nm)
Out[19]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
In [20]: AllChem.Compute2DCoords(nm)
[05:47:18]
****
Pre-condition Violation
Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()
****
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-20-fbe6edb91321> in <module>()
----> 1 AllChem.Compute2DCoords(nm)
RuntimeError: Pre-condition Violation
In [21]: p = Chem.MolFromSmiles('[I-]')
In [22]: Chem.DeleteSubstructs(m,p)
Out[22]: <rdkit.Chem.rdchem.Mol at 0x3630980>
In [23]: nm2=Chem.DeleteSubstructs(m,p)
In [24]: AllChem.Compute2DCoords(nm2)
[05:47:57]
****
Pre-condition Violation
Violation occurred on line 656 in file /scratch/RDKit_trunk/Code/GraphMol/Depictor/EmbeddedFrag.cpp
Failed Expression: d_eatoms.find(aid) == d_eatoms.end()
****
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-24-c2a107eeb439> in <module>()
----> 1 AllChem.Compute2DCoords(nm2)
RuntimeError: Pre-condition Violation
In [25]: Chem.MolToSmiles(nm2,True)
Out[25]: 'C=CCN1C=CC=C/C1=C\\c1sc2ccccc2[n+]1C'
I want to calculate the area under the ROC curve with the RDKit implementation:
rdkit.ML.Scoring.Scoring.CalcAUC(scores, col)
Determines the area under the ROC curve
code:
import rdkit.ML.Scoring.Scoring
rdkit.ML.Scoring.Scoring.CalcAUC(scores, y)
and I get the following error:
IndexError: invalid index to scalar variable.
my data:
scores
array([32.336, 31.894, 31.74 , ..., -0.985, -1.629, -1.82 ])
y
array(['Inactive', 'Inactive', 'Inactive', ..., 'Inactive', 'Inactive','Inactive'], dtype=object)
I do not know what's wrong.
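The signature `CalcAUC(scores, col)` suggests that `scores` is meant to be a sequence of rows, with `col` giving the index of the activity flag in each row, rather than two parallel arrays. Passing a flat array of floats plus an array of labels would then explain the `IndexError`. The following is a minimal standard-library sketch of that calling convention, not the RDKit implementation: `roc_auc` is a hypothetical helper, and the scores and labels are made-up illustration data. Note that a string such as 'Inactive' is truthy in Python, so the labels are converted to booleans first.

```python
def roc_auc(rows, col):
    """Trapezoidal area under the ROC curve (illustrative helper).

    rows: records already sorted so that the best score comes first
    col:  index of the boolean activity flag inside each record
    """
    n_pos = sum(1 for r in rows if r[col])
    n_neg = len(rows) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one active and one inactive record")
    tp = fp = 0
    auc = 0.0
    prev_tpr = prev_fpr = 0.0
    for r in rows:
        if r[col]:
            tp += 1
        else:
            fp += 1
        tpr = tp / n_pos
        fpr = fp / n_neg
        # Add the trapezoid between the previous and the current ROC point.
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
    return auc


# Made-up example data: two parallel sequences, as in the report.
scores = [32.336, 31.894, 31.74, -0.985, -1.629, -1.82]
labels = ['Active', 'Inactive', 'Active', 'Inactive', 'Inactive', 'Inactive']

# Combine them into rows of (score, is_active), sorted best score first.
rows = sorted(zip(scores, [lab == 'Active' for lab in labels]),
              key=lambda r: r[0], reverse=True)
auc = roc_auc(rows, 1)
print(auc)
```

This sketch does not treat tied scores specially; records with equal scores are simply processed in their sorted order.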
Easy enough to copy in from ROMol.i
In [5]: Chem.MolFromSmiles('c:1:c:c:c:c:c1').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[5]: False
In [6]: Chem.MolFromSmarts('c:1:c:c:c:c:c1').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[6]: True
In [7]: Chem.MolFromSmarts('ccc').HasSubstructMatch(Chem.MolFromSmarts('[R2]~[R1]~[R2]'))
Out[7]: True
Thread here: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02934.html
In [2]: mol1 = Chem.MolFromSmiles("Fc1ccc(cc1)[C@@]3(OCc2cc(C#N)ccc23)CCCN(C)C")
In [3]: mol2 = Chem.MolFromSmiles("Fc1ccc(cc1)[C@]3(OCc2cc(C#N)ccc23)CCCN(C)C")
In [4]: from rdkit.Chem import MCS
In [6]: MCS.FindMCS((mol1,mol2))
Out[6]: MCSResult(numAtoms=24, numBonds=26, smarts='[F]-[#6]:1:[#6]:[#6]:[#6](-[#6]-2(-[#6]-[#6]-[#6]-[#7](-[#6])-[#6])-[#8]-[#6]-[#6]:3:[#6]:[#6](:[#6]:[#6]:[#6]:3-2)-[#6]#[#7]):[#6]:[#6]:1', completed=1)
In [1]: from rdkit.ML.Data import Quantize
In [2]: d = [0,1,2,3,4]
In [3]: a = [0,0,1,1,1]
In [4]: Quantize.FindVarMultQuantBounds(d,1,a,2)
Out[4]: ([1.5], 0.9709505944546688)
In [5]: d2 = [(x,) for x in d]
In [6]: Quantize.FindVarMultQuantBounds(d2,1,a,2)
Segmentation fault
Not calling the flush method of an SDWriter initialized on a file object may produce an unhandled exception and terminate the interpreter:
In [7]: with open('xyz.sdf', 'w') as xyz:
...: w = Chem.SDWriter(xyz)
...: w.write(Chem.MolFromSmiles('c1ccccc1'))
...: w.write(Chem.MolFromSmiles('c1ccccc1'))
...: w.write(Chem.MolFromSmiles('c1ccccc1'))
...: w.write(Chem.MolFromSmiles('c1ccccc1'))
...: w.write(Chem.MolFromSmiles('c1ccccc1'))
...:
terminate called after throwing an instance of 'boost::python::error_already_set'
Aborted
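The ordering problem can be illustrated without RDKit. The sketch below uses a hypothetical `BufferedRecordWriter` class as a stand-in for any writer that holds output in memory and only pushes it to the wrapped file object on `flush()`; it is not how SDWriter is implemented. The point it shows is that the flush has to happen inside the `with` block, while the file is still open.

```python
import os
import tempfile


class BufferedRecordWriter:
    """Hypothetical stand-in for a writer that buffers records in memory."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._pending = []

    def write(self, record):
        # Nothing reaches the file yet; the record is only queued.
        self._pending.append(record)

    def flush(self):
        # Push all queued records to the underlying file object.
        for record in self._pending:
            self._file.write(record + "\n$$$$\n")
        self._pending = []
        self._file.flush()


path = os.path.join(tempfile.mkdtemp(), "xyz.sdf")
with open(path, "w") as handle:
    writer = BufferedRecordWriter(handle)
    for _ in range(5):
        writer.write("c1ccccc1")
    writer.flush()  # flush while the underlying file is still open

with open(path) as handle:
    n_records = handle.read().count("$$$$")
print(n_records)
```

If the `writer.flush()` call were moved after the `with` block, the write would be attempted on a closed file and raise an error, which mirrors the failure mode described in the report.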
2012_12_1:
In [9]: AllChem.GetHashedTopologicalTorsionFingerprint(Chem.MolFromSmiles('CCCCO'),nBits=4192).GetNonzeroElements()
Out[9]: {544: 1, 1760: 1}
2013_03_1:
In [3]: AllChem.GetHashedTopologicalTorsionFingerprint(Chem.MolFromSmiles('CCCCO'),nBits=4192).GetNonzeroElements()
Out[3]: {1974: 1, 3516: 1}
There's no good reason for this to be the case.
The way things are currently broken with "make clean" should be fixed.
I am using Python 2.7.3
from rdkit import Chem
m2 = Chem.inchi.MolFromInchi('InChI=1S/C10H9N3O/c1-7-11-10(14)9(13-12-7)8-5-3-2-4-6-8/h2-6H,1H3,(H,11,12,14)')
I got
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'MolFromInchi'
But if I use MolFromSmiles
from rdkit import Chem
m2 = Chem.MolFromSmiles('C1CCC1')
It works.
It's an old story, but it's an irritating one.
The method Transform3D#SetRotation(double, Point3D) scales the co-ordinates if the axis vector Point3D is not a unit vector:
rdkit-orig/Code/Geometry/Transform3D.h
Lines 60 to 65 in 57058c8
Either the documentation should state this requirement, or the method should normalize the axis itself.
This can be demonstrated in the Java wrapper as follows:
String molBlock = "Single Atom should be invariate\n RDKit 3D\n\n 1 0 0 0 0 0 0 0 0 0999 V2000\n"+
" 1.0000 1.0000 1.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\nM END\n\n$$$$";
ROMol mol = RWMol.MolFromMolBlock(molBlock);
// An axis around which the molecule should be invariate:
Point3D axis = new Point3D(1.0, 1.0, 1.0);
//The transform
Transform3D t3d = new Transform3D();
t3d.SetRotation(Math.PI, axis);
//Apply it...
mol.transformMolsAtoms(t3d);
System.out.println(mol.MolToMolBlock());
Gives:
Single Atom should be invariate
RDKit 3D
1 0 0 0 0 0 0 0 0 0999 V2000
3.0000 3.0000 3.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
M END
The result can be corrected in the calling code by normalizing the axis first, but the documentation does not make it obvious that this is required, e.g.:
...
// An axis around which the molecule should be invariate:
Point3D axis = new Point3D(1.0, 1.0, 1.0);
axis.normalize();
...
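The effect of normalizing the axis can be sketched in plain Python. This is an illustration of the axis-angle (Rodrigues) rotation formula with the axis normalized up front, not the RDKit code; `rotation_matrix` and `apply_matrix` are hypothetical helper names.

```python
import math


def rotation_matrix(angle, axis):
    """3x3 rotation matrix for a rotation by `angle` (radians) about `axis`.

    The axis is normalized here, so callers may pass any non-zero vector.
    """
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def apply_matrix(matrix, point):
    """Multiply a 3x3 matrix by a 3-component point."""
    return tuple(sum(m * p for m, p in zip(row, point)) for row in matrix)


# A point lying on the rotation axis should not move.
rotated = apply_matrix(rotation_matrix(math.pi, (1.0, 1.0, 1.0)),
                       (1.0, 1.0, 1.0))
print(rotated)
```

With the normalization in place, the atom at (1, 1, 1) stays at (1, 1, 1) up to floating-point rounding, which is the invariance the report expects.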
Reported by Stephan Reiling: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02948.html
reported by Jan Holst Jensen
> For example: InChI strings generated for spiro.mol (spiro.mol - attached):
>
> IUPAC:
> InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3/t2*7-,8-,9-/m10/s1
> RDKit: InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3
This one still doesn't recognize the stereo. I'll file a bug for it:
In [2]: Chem.MolToInchi(Chem.MolFromMolFile('spiro.mol'))
[09:53:16] WARNING: Omitted undefined stereo
Out[2]: 'InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3'
Here's the file:
spiro.mol
ACD/Labs0709041010
22 24 0 0 0 0 0 0 0 0 1 V2000
9.2912 -9.4308 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.8009 -6.9967 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
7.7063 -7.9840 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.8986 -9.1405 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
9.1531 -6.3991 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
10.5689 -7.8459 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.7408 -6.2812 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.8789 -9.3129 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
13.3257 -7.7280 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
15.1335 -6.5716 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
15.2311 -8.7153 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
9.6519 -14.7621 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.1616 -12.3280 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
8.0670 -13.3153 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.2593 -14.4717 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
9.5138 -11.7304 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
10.9296 -13.1772 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.1015 -11.6125 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.2396 -14.6442 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
13.6864 -13.0593 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
15.4941 -11.9029 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
15.5918 -14.0466 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
3 2 1 0 0 0 0
3 4 1 1 0 0 0
1 3 1 0 0 0 0
5 3 1 0 0 0 0
6 1 1 0 0 0 0
5 6 1 0 0 0 0
7 6 1 0 0 0 0
6 8 1 1 0 0 0
9 7 1 0 0 0 0
8 9 1 0 0 0 0
10 9 1 0 0 0 0
9 11 1 1 0 0 0
14 13 1 0 0 0 0
14 15 1 1 0 0 0
12 14 1 0 0 0 0
16 14 1 0 0 0 0
17 12 1 0 0 0 0
16 17 1 0 0 0 0
18 17 1 0 0 0 0
17 19 1 1 0 0 0
20 18 1 0 0 0 0
19 20 1 0 0 0 0
21 20 1 0 0 0 0
20 22 1 1 0 0 0
M END
> <NAME>
spiro
$$$$