stsouko / cgrtools
CGRs, molecules and reactions manipulation
License: GNU Lesser General Public License v3.0
Is it possible to add a method that extracts the Bemis-Murcko scaffold from a given molecular graph?
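As an illustration of what such a method would compute, here is a minimal pure-Python sketch that does not use CGRtools. It assumes the molecule is given as a plain adjacency dictionary (atom number -> set of neighbour numbers), and the function name `murcko_scaffold_atoms` is hypothetical. It repeatedly prunes terminal atoms, which leaves the ring systems and the linkers between them. Bond orders are ignored, so atoms double-bonded to the framework, which some toolkits keep, are pruned here.

```python
def murcko_scaffold_atoms(adjacency):
    """Return the atoms that remain after repeatedly pruning terminal atoms."""
    # Work on a copy so the caller's graph is not modified.
    graph = {atom: set(neighbors) for atom, neighbors in adjacency.items()}
    # Atoms with at most one neighbour are terminal (or isolated).
    stack = [atom for atom, neighbors in graph.items() if len(neighbors) <= 1]
    while stack:
        atom = stack.pop()
        if atom not in graph:
            continue  # already pruned
        for neighbor in graph.pop(atom):
            graph[neighbor].discard(atom)
            # Pruning may turn the neighbour into a new terminal atom.
            if len(graph[neighbor]) <= 1:
                stack.append(neighbor)
    return set(graph)


# Bond list of the ten-atom molfile quoted on this page (a six-membered ring with a butyl chain).
bonds = [(1, 2), (1, 6), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (7, 8), (8, 9), (9, 10)]
adjacency = {}
for a, b in bonds:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

print(sorted(murcko_scaffold_atoms(adjacency)))  # [1, 2, 3, 4, 5, 6]
```

An acyclic molecule has no scaffold in this sense, so the function returns an empty set for it.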
fix AAM
rewrite contour search to edges from nodes
Regarding the tautomerize function, what defines the canonical form? Is this similar to RDKit's empirical scoring scheme, or something else?
which creates a file
10 10 0 0 0 0 999 V2000
1.8347 1.7649 0.0000 C 0 0 0 0 0 0 0 0 0 1 0 0
0.8175 2.0469 0.0000 C 0 0 0 0 0 0 0 0 0 2 0 0
0.0000 1.3778 0.0000 C 0 0 0 0 0 0 0 0 0 3 0 0
0.4512 0.4166 0.0000 C 0 0 0 0 0 0 0 0 0 4 0 0
1.2496 0.0220 0.0000 C 0 0 0 0 0 0 0 0 0 5 0 0
1.8111 0.7009 0.0000 C 0 0 0 0 0 0 0 0 0 6 0 0
1.3813 -0.6984 0.0000 C 0 0 0 0 0 0 0 0 0 7 0 0
2.0824 -1.1896 0.0000 C 0 0 0 0 0 0 0 0 0 8 0 0
2.1004 -1.9993 0.0000 C 0 0 0 0 0 0 0 0 0 9 0 0
2.8603 -2.4418 0.0000 C 0 0 0 0 0 0 0 0 0 10 0 0
1 2 2 0 0 0 0
1 6 2 0 0 0 0
2 3 2 0 0 0 0
3 4 2 0 0 0 0
4 5 2 0 0 0 0
5 6 2 0 0 0 0
5 7 1 0 0 0 0
7 8 1 0 0 0 0
8 9 1 0 0 0 0
9 10 1 0 0 0 0
M END
$$$$
which I'm unable to open.
I'm new to CGRtools, and when I follow the tutorial in the documentation I get this error: [Errno 2] No such file or directory: 'molecules.dat'. I don't know what is causing this problem. Any help would be much appreciated. Thanks.
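Errno 2 usually means that a relative file name could not be found from the directory the script or notebook was started in. The following sketch uses only the standard library; the helper name `resolve_data_file` and the idea of passing extra search directories are assumptions for illustration, not part of CGRtools. It looks for the file in the current working directory first and reports every location it tried.

```python
from pathlib import Path


def resolve_data_file(name, search_dirs=()):
    """Return the first existing path for `name`, or raise with the locations tried."""
    # A relative name is resolved against the current working directory first.
    candidates = [Path.cwd() / name] + [Path(d) / name for d in search_dirs]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"{name!r} not found; looked in: {tried}")
```

Calling `resolve_data_file('molecules.dat', ['path/to/tutorial/data'])` (the directory here is a placeholder) either returns a path that can be opened or shows exactly where the file was expected.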
Does it have a function to add atom map IDs to a reaction?
Input reaction SMILES:
rxnsmi='O.O=C(COCc1ccccc1)N1CCCc2sc(-c3ccc(OC4CC(N5CCCCC5)C4)cc3)nc21>>O=C(CO)N1CCCc2sc(-c3ccc(OC4CC(N5CCCCC5)C4)cc3)nc21'
addmap(rxnsmi)
output:
'[O:1]=[C:2]([N:3]1[C:4]=2[N:5]=[C:6]([S:7][C:8]2[CH2:9][CH2:10][CH2:11]1)[C:12]3=[CH:13][CH:14]=[C:15]([O:16][CH:17]4[CH2:18][CH:19]([N:20]5[CH2:21][CH2:22][CH2:23][CH2:24][CH2:25]5)[CH2:26]4)[CH:27]=[CH:28]3)[CH2:29][O:30][CH2:31][C:32]6=[CH:33][CH:34]=[CH:35][CH:36]=[CH:37]6.[OH2:38]>>[O:1]=[C:2]([N:3]1[C:4]=2[N:5]=[C:6]([S:7][C:8]2[CH2:9][CH2:10][CH2:11]1)[C:12]3=[CH:13][CH:14]=[C:15]([O:16][CH:17]4[CH2:18][CH:19]([N:20]5[CH2:21][CH2:22][CH2:23][CH2:24][CH2:25]5)[CH2:26]4)[CH:27]=[CH:28]3)[CH2:29][OH:30]'
try to parse 2000 and 3000 when errors found
How do I generate a ReactionContainer object?
I found that the reactions.dat file is supplied, but I want to construct a ReactionContainer myself.
The code is:
from CGRtools import CGRpreparer  # import of CGRpreparer
from CGRtools.containers import ReactionContainer, MoleculeContainer
from CGRtools.utils.rdkit import from_rdkit_molecule
from rdkit import Chem

def MoleculeContainer_from_smiles(smiles):
    m = Chem.MolFromSmiles(smiles)
    return from_rdkit_molecule(m)

r1 = MoleculeContainer_from_smiles(r1_smiles)
r2 = MoleculeContainer_from_smiles(r2_smiles)
r3 = MoleculeContainer_from_smiles(r3_smiles)
p1 = MoleculeContainer_from_smiles(p1_smiles)
rc = ReactionContainer(reactants=[r1, r2, r3], products=[p1])
type(rc)
preparer = CGRpreparer()
t_decomposed = preparer.decompose(rc.compose())
and I got the following errors:
KeyError Traceback (most recent call last)
~/fengjiaxin/anaconda3/envs/my-rdkit-env/lib/python3.6/site-packages/CGRtools/cache.py in wrapper(self)
46 try:
---> 47 return self.__dict__[name]
48 except KeyError:
KeyError: '_cached_method_compose'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
in
1 preparer = CGRpreparer()
----> 2 t_decomposed = preparer.decompose(rc.compose())
~/fengjiaxin/anaconda3/envs/my-rdkit-env/lib/python3.6/site-packages/CGRtools/cache.py in wrapper(self)
47 return self.__dict__[name]
48 except KeyError:
---> 49 value = self.__dict__[name] = func(self)
50 return value
51 return wrapper
~/fengjiaxin/anaconda3/envs/my-rdkit-env/lib/python3.6/site-packages/CGRtools/containers/reaction.py in compose(self)
227 if not all(isinstance(x, (MoleculeContainer, CGRContainer)) for x in rr):
228 raise TypeError('Queries not composable')
--> 229 r = reduce(or_, rr)
230 else:
231 r = MoleculeContainer()
~/fengjiaxin/anaconda3/envs/my-rdkit-env/lib/python3.6/site-packages/CGRtools/algorithms/union.py in __or__(self, other)
24 G | H is union of graphs
25 """
---> 26 return self.union(other)
27
28 def union(self, other):
~/fengjiaxin/anaconda3/envs/my-rdkit-env/lib/python3.6/site-packages/CGRtools/algorithms/union.py in union(self, other)
30 raise TypeError('BaseContainer subclass expected')
31 if self._node.keys() & set(other):
---> 32 raise KeyError('mapping of graphs is not disjoint')
33
34 # dynamic container resolving
KeyError: 'mapping of graphs is not disjoint'
Am I constructing the ReactionContainer incorrectly?
If so, could you provide the correct code to construct a ReactionContainer?
Thanks very much.
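The traceback shows that the union fails when two graphs share atom numbers, which is what happens if every molecule is built independently and numbered from 1. The sketch below shows the renumbering idea on plain dictionaries (atom number -> element symbol). It does not use the CGRtools API, and the helper name `make_disjoint` is hypothetical. Note that this only concerns molecules on the same side of a reaction; atoms in products are expected to share numbers with the reactant atoms they are mapped to.

```python
def make_disjoint(molecules):
    """Renumber atoms so that no atom number is shared between molecules."""
    remapped, offset = [], 0
    for atoms in molecules:
        # Map each old atom number to a fresh one above everything used so far.
        mapping = {old: offset + i for i, old in enumerate(sorted(atoms), start=1)}
        remapped.append({mapping[old]: element for old, element in atoms.items()})
        offset += len(atoms)
    return remapped


water = {1: "O"}
ethanol = {1: "C", 2: "C", 3: "O"}
print(make_disjoint([water, ethanol]))  # [{1: 'O'}, {2: 'C', 3: 'C', 4: 'O'}]
```

After renumbering, the key sets no longer intersect, which is the condition the check in union.py tests for.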
data types notebook: m7.substructure([4, 5, 6, 7, 8, 9], as_view=False)
breaks with
TypeError: substructure() got an unexpected keyword argument 'as_view'
Fresh Python 3.7 pip install with today's code.
Add an XYZ parser and maybe some intermediate parser, such as one that works with a JSON or dictionary representation, to support self-made parsers:
my awesome parser (makes a dict or JSON) --> intermediate parser (parses predefined fields, or I give the correspondence of each field in some dict) --> CGR objects
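The first stage of such a pipeline could look like the sketch below: a small standard-library parser that turns XYZ text (atom count, comment line, then one `element x y z` line per atom) into a plain dictionary. The dictionary layout and the name `parse_xyz` are assumptions made for illustration; nothing here is an existing CGRtools interface. XYZ files carry no bonds, so a later stage would still have to infer connectivity.

```python
def parse_xyz(text):
    """Parse a single XYZ record into a dictionary of comment and atoms."""
    lines = text.strip().splitlines()
    count = int(lines[0])  # first line: number of atoms
    comment = lines[1].strip() if len(lines) > 1 else ""
    atoms = []
    for line in lines[2:2 + count]:
        element, x, y, z = line.split()[:4]
        atoms.append({"element": element, "x": float(x), "y": float(y), "z": float(z)})
    if len(atoms) != count:
        raise ValueError(f"expected {count} atoms, found {len(atoms)}")
    return {"comment": comment, "atoms": atoms}


xyz = """3
water
O 0.000 0.000 0.000
H 0.757 0.586 0.000
H -0.757 0.586 0.000
"""
record = parse_xyz(xyz)
print(record["comment"], len(record["atoms"]))  # water 3
```

A dictionary like this can be serialised with `json.dumps` unchanged, which is what makes it usable as the intermediate format the request describes.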
isomorphism of multicomponent graphs
There is a problem with writing SDF and RDF files if a reaction contains any atom with charge +4, for example Ti+4 with 4 anions in one molecule.
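One possible explanation, offered as an assumption rather than a diagnosis of the CGRtools writer, is that the charge field of the V2000 atom block only encodes charges from -3 to +3, so a +4 charge has to go into an `M  CHG` property line instead. The sketch below builds such lines from a dictionary of atom number -> formal charge. The names `ATOM_BLOCK_CHARGE_CODES` and `charge_lines` are hypothetical.

```python
# Charges that the V2000 atom block can encode (formal charge -> field value).
ATOM_BLOCK_CHARGE_CODES = {0: 0, 1: 3, 2: 2, 3: 1, -1: 5, -2: 6, -3: 7}


def charge_lines(charges):
    """Build 'M  CHG' property lines, at most eight atoms per line."""
    charged = [(atom, q) for atom, q in sorted(charges.items()) if q]
    lines = []
    for start in range(0, len(charged), 8):
        chunk = charged[start:start + 8]
        # Each entry is the atom number and the charge, both right-aligned in 3 columns.
        entries = "".join(f" {atom:3d} {q:3d}" for atom, q in chunk)
        lines.append(f"M  CHG{len(chunk):3d}{entries}")
    return lines


print(4 in ATOM_BLOCK_CHARGE_CODES)  # False
print(charge_lines({1: 4, 2: -1, 3: 0}))  # ['M  CHG  2   1   4   2  -1']
```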
Function to refresh or delete an outdated index table.
Return the RXN block from a big RDF file for visualization, for finding RXN blocks with errors, and for transferring RXN blocks to other libraries such as RDKit, Indigo, etc.
canonical formal charge position
And what if there is only 1 match?
Originally posted by @stsouko in https://github.com/cimm-kzn/CGRtools/pull/51
skip non-covalent rings.
Is it possible to split SMILESRead into two functions, one for reading a SMILES file and one for converting SMILES to a graph? Since a graph has its own function to convert into SMILES, str(graph) -> SMILES, it would be more practical to have an inverse function like to_graph(SMILES) -> graph. However, at the moment you always have to import StringIO and write something like this:
with StringIO(smi) as f, SMILESRead(f) as m:
    mol = next(m)
I wanted to create a query from SMILES, so I decided to use molecule.substructure(as_query=True). Then I needed to use isomorphism, but it didn't work as I expected:
Next, I decided to delete all hybridization and neighbors marks, but it didn't change anything:
Finally, when I constructed the query myself, it worked as expected:
So I don't understand whether it is supposed to work like that or it's a bug.
Notebook to reproduce this experiment:
query_bug.ipynb.zip
Hello again. Can this tool calculate molecular descriptors? If so, where can I find out how to do it? Thanks.
The CGR decomposition function changes the charge of an atom and adds hydrogens to it.
[Al] -> [AlH3]
invalid charges
Markup of the file is made without this property. It is more logical to make it changeable without marking up again. Especially, when you are working with several million reactions.
Hello!
I have installed CGRtools as described in the README. But when I run the tests with pytest --pyargs CGRtools, I get the error: ModuleNotFoundError: No module named 'lazy_object_proxy'.
This problem is easily solved by additionally installing this library: pip install lazy_object_proxy
Maybe it would be better to add this library as a dependency of CGRtools? Or to add the extra command (pip install lazy_object_proxy) to the README?
implement
It's easier to show what I want. I want the molecule to have an ID:
ethanol
9 8 0 0 0 0 999 V2000
-1.4732 -4.4786 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.7587 -4.0661 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.0443 -4.4786 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
-1.8857 -3.7641 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.1877 -4.8911 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.0607 -5.1930 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.1461 -3.3376 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.3714 -3.3376 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.0443 -5.3036 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
1 4 1 0 0 0 0
1 5 1 0 0 0 0
1 6 1 0 0 0 0
2 7 1 0 0 0 0
2 8 1 0 0 0 0
3 9 1 0 0 0 0
M END
$$$$
On delimited data, I'm having issues with SMILESRead: upon calling read(), only a fraction of the results are returned. However, manually iterating over the same data with smiles() generally returns them all. While I can use smiles() as a workaround, it would be great to use SMILESRead for parsing the other columns in as metadata.
See the example below (using Python 3.8.10 and CGRtools 4.1.20):
# This is an excerpt of 1976_Sep2016_USPTOgrants_smiles.rsmi
example_text = """ReactionSmiles PatentNumber ParagraphNum Year TextMinedYield CalculatedYield
[Br:1][CH2:2][CH2:3][OH:4].[CH2:5]([S:7](Cl)(=[O:9])=[O:8])[CH3:6].CCOCC>C(N(CC)CC)C>[CH2:5]([S:7]([O:4][CH2:3][CH2:2][Br:1])(=[O:9])=[O:8])[CH3:6] US03930836 1976
[Br:1][CH2:2][CH2:3][CH2:4][OH:5].[CH3:6][S:7](Cl)(=[O:9])=[O:8].CCOCC>C(N(CC)CC)C>[CH3:6][S:7]([O:5][CH2:4][CH2:3][CH2:2][Br:1])(=[O:9])=[O:8] US03930836 1976
[CH2:1]([Cl:4])[CH2:2][OH:3].CCOCC.[CH2:10]([S:14](Cl)(=[O:16])=[O:15])[CH:11]([CH3:13])[CH3:12]>C(N(CC)CC)C>[CH2:10]([S:14]([O:3][CH2:2][CH2:1][Cl:4])(=[O:16])=[O:15])[CH:11]([CH3:13])[CH3:12] US03930836 1976
[Br:1][CH2:2][CH2:3][OH:4].[CH2:5]([S:7](Cl)(=[O:9])=[O:8])[CH3:6].CCOCC>C(N(CC)CC)C>[CH2:5]([S:7]([O:4][CH2:3][CH2:2][Br:1])(=[O:9])=[O:8])[CH3:6] US03930839 1976
[Br:1][CH2:2][CH2:3][CH2:4][OH:5].[CH3:6][S:7](Cl)(=[O:9])=[O:8].CCOCC>C(N(CC)CC)C>[CH3:6][S:7]([O:5][CH2:4][CH2:3][CH2:2][Br:1])(=[O:9])=[O:8] US03930839 1976
[CH2:1]([Cl:4])[CH2:2][OH:3].CCOCC.[CH2:10]([S:14](Cl)(=[O:16])=[O:15])[CH:11]([CH3:13])[CH3:12]>C(N(CC)CC)C>[CH2:10]([S:14]([O:3][CH2:2][CH2:1][Cl:4])(=[O:16])=[O:15])[CH:11]([CH3:13])[CH3:12] US03930839 1976
[Cl:1][C:2]1[N:3]=[CH:4][C:5]2[C:10]([CH:11]=1)=[C:9]([N+:12]([O-])=O)[CH:8]=[CH:7][CH:6]=2.O.[OH-].[Na+]>C(O)(=O)C.[Fe]>[Cl:1][C:2]1[N:3]=[CH:4][C:5]2[C:10]([CH:11]=1)=[C:9]([NH2:12])[CH:8]=[CH:7][CH:6]=2 |f:2.3| US03930837 1976
[CH3:1][C:2]1[N+:3]([O-])=[CH:4][C:5]2[C:10]([CH:11]=1)=[C:9]([N+:12]([O-:14])=[O:13])[CH:8]=[CH:7][CH:6]=2.P(Cl)(Cl)([Cl:18])=O>>[Cl:18][C:4]1[C:5]2[C:10](=[C:9]([N+:12]([O-:14])=[O:13])[CH:8]=[CH:7][CH:6]=2)[CH:11]=[C:2]([CH3:1])[N:3]=1 US03930837 1976
[CH3:1][C:2]1[N:3]=[CH:4][C:5]2[C:10]([CH:11]=1)=[C:9]([N+:12]([O-:14])=[O:13])[CH:8]=[CH:7][CH:6]=2.[ClH:15]>>[ClH:15].[CH3:1][C:2]1[N:3]=[CH:4][C:5]2[C:10]([CH:11]=1)=[C:9]([N+:12]([O-:14])=[O:13])[CH:8]=[CH:7][CH:6]=2 |f:2.3| US03930837 1976
CC1N=CC2C(C=1)=C([N+]([O-])=O)C=CC=2.[Cl:15][C:16]1[C:25]2[C:20](=[CH:21][CH:22]=[CH:23][CH:24]=2)[CH:19]=[CH:18][N:17]=1>>[ClH:15].[Cl:15][C:16]1[C:25]2[C:20](=[CH:21][CH:22]=[CH:23][CH:24]=2)[CH:19]=[CH:18][N:17]=1 |f:2.3| US03930837 1976
CC1N=CC2C(C=1)=C([N+]([O-])=O)C=CC=2.[Cl:15][C:16]1[CH:25]=[CH:24][C:23]([N+:26]([O-:28])=[O:27])=[C:22]2[C:17]=1[CH:18]=[CH:19][N:20]=[CH:21]2.Cl.CC1N=CC2C(C=1)=C([N+]([O-])=O)C=CC=2.[IH:44]>>[IH:44].[Cl:15][C:16]1[CH:25]=[CH:24][C:23]([N+:26]([O-:28])=[O:27])=[C:22]2[C:17]=1[CH:18]=[CH:19][N:20]=[CH:21]2 |f:2.3,5.6| US03930837 1976
[N+:1]([C:4]1[CH:13]=[CH:12][CH:11]=[C:10]2[C:5]=1[CH:6]=[CH:7][N:8]=[CH:9]2)([O-:3])=[O:2].[BrH:14]>C(O)C>[BrH:14].[N+:1]([C:4]1[CH:13]=[CH:12][CH:11]=[C:10]2[C:5]=1[CH:6]=[CH:7][N:8]=[CH:9]2)([O-:3])=[O:2] |f:3.4| US03930837 1976
[N+](C1C=CC=C2C=1C=CN=C2)([O-])=O.[CH3:14][C:15]1[C:24]2[C:19](=[CH:20][CH:21]=[CH:22][CH:23]=2)[CH:18]=[CH:17][N:16]=1.Br.[Cl:26][C:27]1[C:32]([OH:33])=[C:31]([Cl:34])[C:30]([Cl:35])=[C:29]([Cl:36])[C:28]=1[Cl:37]>>[Cl:26][C:27]1[C:32]([O-:33])=[C:31]([Cl:34])[C:30]([Cl:35])=[C:29]([Cl:36])[C:28]=1[Cl:37].[CH3:14][C:15]1[C:24]2[C:19](=[CH:20][CH:21]=[CH:22][CH:23]=2)[CH:18]=[CH:17][NH+:16]=1 |f:4.5| US03930837 1976
[N+:1]([C:4]1[CH:13]=[CH:12][CH:11]=[C:10]2[C:5]=1[CH:6]=[CH:7][N:8]=[CH:9]2)([O-])=O.NC1C=CC=C2C=1C=CN=C2.Br.[IH:26]>>[IH:26].[IH:26].[NH2:1][C:4]1[CH:13]=[CH:12][CH:11]=[C:10]2[C:5]=1[CH:6]=[CH:7][N:8]=[CH:9]2 |f:4.5.6| US03930837 1976
Cl.[OH:2][C@@H:3]([CH2:21][CH2:22][CH2:23][CH2:24][CH3:25])[CH:4]=[CH:5][CH:6]1[CH:10]=[CH:9][C:8](=[O:11])[CH:7]1[CH2:12][CH:13]=[CH:14][CH2:15][CH2:16][CH2:17][C:18]([OH:20])=[O:19]>C(O)C>[OH:2][C@@H:3]([CH2:21][CH2:22][CH2:23][CH2:24][CH3:25])[CH:4]=[CH:5][CH:6]1[CH2:10][CH2:9][C:8](=[O:11])[CH:7]1[CH2:12][CH:13]=[CH:14][CH2:15][CH2:16][CH2:17][C:18]([OH:20])=[O:19] US03930952 1976
CC(O[CH2:5][C:6]1[CH2:28][S:27][C@@H:9]2[C@H:10]([NH:13]C(C(OC(C)=O)C3C=CC=CC=3)=O)[C:11](=[O:12])[N:8]2[C:7]=1[C:29]([OH:31])=[O:30])=O>O>[CH3:5][C:6]1[CH2:28][S:27][C@@H:9]2[C@H:10]([NH2:13])[C:11](=[O:12])[N:8]2[C:7]=1[C:29]([OH:31])=[O:30] US03930949 1976
[S:1]([O-:5])([O-:4])(=[O:3])=[O:2].[NH4+:6].[NH4+]>O>[S:1](=[O:3])(=[O:2])([OH:5])[O-:4].[NH4+:6].[S:1]([O-:5])([O-:4])(=[O:3])=[O:2].[NH4+:6].[NH4+:6] |f:0.1.2,4.5,6.7.8| US03930988 1976
CO[C:3]1[CH:4]=[C:5]([C:9]2([CH2:12][C:13]([Cl:16])([Cl:15])[Cl:14])[CH2:11][O:10]2)[CH:6]=[CH:7][CH:8]=1.ClC1C=C(C2(CC(Cl)(Cl)Cl)CO2)C=CC=1.FC1C=C(C2(CC(Cl)(Cl)Cl)CO2)C=CC=1.ClC1C=C(C2(CC(Cl)(Cl)Cl)CO2)C=CC=1Cl.C(OC1C=C(C2(CC(Cl)(Cl)Cl)CO2)C=CC=1)C.C(OC1C=C(C2(CC(Cl)(Cl)Cl)CO2)C=CC=1)C1C=CC=CC=1.ClC1C=CC(C2(CC(Cl)(Cl)Cl)CO2)=CC=1.[Br:117]C1C=CC(C2(CC(Cl)(Cl)Cl)CO2)=CC=1>>[Br:117][C:3]1[CH:4]=[C:5]([C:9]2([CH2:12][C:13]([Cl:16])([Cl:15])[Cl:14])[CH2:11][O:10]2)[CH:6]=[CH:7][CH:8]=1 US03930835 1976
[C:1]1(O)[CH:6]=[CH:5][CH:4]=[CH:3][CH:2]=1.[CH2:8]=[O:9].[S:10]([O-:13])([O-:12])=[O:11].[Na+:14].[Na+]>O>[OH:9][CH:8]([S:10]([O-:13])(=[O:12])=[O:11])[C:1]1[CH:6]=[CH:5][CH:4]=[CH:3][CH:2]=1.[Na+:14] |f:2.3.4,6.7| US03931083 1976
[CH3:1][O:2][C:3]1[C:4]([C:13]([OH:15])=O)=[CH:5][C:6]2[C:11]([CH:12]=1)=[CH:10][CH:9]=[CH:8][CH:7]=2.S(Cl)([Cl:18])=O>C1C=CC=CC=1>[CH3:1][O:2][C:3]1[C:4]([C:13]([Cl:18])=[O:15])=[CH:5][C:6]2[C:11]([CH:12]=1)=[CH:10][CH:9]=[CH:8][CH:7]=2 US03931103 1976
[C:1]1([O:7]C(Cl)=O)[CH:6]=[CH:5][CH:4]=[CH:3][CH:2]=1.C(Cl)Cl.[OH2:14].[OH-].[Na+].C(N([CH2:22][CH3:23])CC)C>>[CH:2]1[CH:3]=[C:22]([CH2:23][C:2]2[C:1]([OH:7])=[CH:6][CH:5]=[CH:4][CH:3]=2)[C:5]([OH:14])=[CH:6][CH:1]=1 |f:3.4| US03931108 1976
[CH3:1][C:2]1[C:3](=[CH:7][C:8](=[CH:12][CH:13]=1)[N:9]=[C:10]=[O:11])N=C=O.[NH2:14][C:15]([O:17]CC)=O>>[CH2:2]1[CH:3]([CH2:1][CH:2]2[CH2:13][CH2:12][CH:8]([N:9]=[C:10]=[O:11])[CH2:7][CH2:3]2)[CH2:7][CH2:8][CH:12]([N:14]=[C:15]=[O:17])[CH2:13]1 US03931113 1976
C1CC[CH:4]([N:7]=C=[N:7][CH:4]2CCC[CH2:2][CH2:3]2)[CH2:3][CH2:2]1.[N:16]1([C:24]([O:26][CH2:27][C:28]2[CH:33]=[CH:32][CH:31]=[CH:30][CH:29]=2)=[O:25])[CH2:23][CH2:22][CH2:21][C@H:17]1[C:18]([OH:20])=[O:19].C1C=CC2N(O)N=NC=2C=1.C(N)CC>O1CCCC1>[N:16]1([C:24]([O:26][CH2:27][C:28]2[CH:29]=[CH:30][CH:31]=[CH:32][CH:33]=2)=[O:25])[CH2:23][CH2:22][CH2:21][C@H:17]1[C:18]([OH:20])=[O:19].[CH2:4]([NH-:7])[CH2:3][CH3:2] |f:5.6| US03931139 1976
[N:1]1([C:9]([O:11][CH2:12][C:13]2[CH:18]=[CH:17][CH:16]=[CH:15][CH:14]=2)=[O:10])[CH2:8][CH2:7][CH2:6][C@H:2]1[C:3]([OH:5])=[O:4].C(OC(Cl)=O)C.[CH2:25]([NH2:31])[CH2:26][CH2:27][CH2:28][CH2:29][CH3:30]>O1CCCC1>[N:1]1([C:9]([O:11][CH2:12][C:13]2[CH:14]=[CH:15][CH:16]=[CH:17][CH:18]=2)=[O:10])[CH2:8][CH2:7][CH2:6][C@H:2]1[C:3]([OH:5])=[O:4].[CH2:25]([NH-:31])[CH2:26][CH2:27][CH2:28][CH2:29][CH3:30] |f:4.5| US03931139 1976
[IH:1].CS[C:4]1[NH:5][CH2:6][CH2:7][CH2:8][CH2:9][N:10]=1.C(O)C.O.[NH2:15][NH2:16]>CCOCC>[IH:1].[NH:15]([C:4]1[NH:5][CH2:6][CH2:7][CH2:8][CH2:9][N:10]=1)[NH2:16] |f:0.1,3.4,6.7| US03931152 1976
C1C(=O)N([Br:8])C(=O)C1.[CH3:9][N:10]1[C:16]2[CH:17]=[CH:18][CH:19]=[CH:20][C:15]=2[C:14](=[O:21])[CH2:13][C:12]2[CH:22]=[CH:23][CH:24]=[CH:25][C:11]1=2>CN(C)C=O>[Br:8][C:19]1[CH:18]=[CH:17][C:16]2[N:10]([CH3:9])[C:11]3[CH:25]=[CH:24][CH:23]=[CH:22][C:12]=3[CH2:13][C:14](=[O:21])[C:15]=2[CH:20]=1 US03931151 1976 100.5%
[Br:1][C:2]1[CH:18]=[CH:17][C:5]2[N:6]([CH3:16])[C:7]3[CH:15]=[CH:14][CH:13]=[CH:12][C:8]=3[CH2:9][C:10](=[O:11])[C:4]=2[CH:3]=1.[CH2:19](O)[CH3:20].C([O-])([O-])OCC.C1(C)C=CC(S(O)(=O)=O)=CC=1>C(N(CC)CC)C>[Br:1][C:2]1[CH:18]=[CH:17][C:5]2[N:6]([CH3:16])[C:7]3[CH:15]=[CH:14][CH:13]=[CH:12][C:8]=3[CH:9]=[C:10]([O:11][CH2:19][CH3:20])[C:4]=2[CH:3]=1 US03931151 1976
[CH2:1]([S:3][C:4]1[CH:26]=[CH:25][C:7]2[N:8]([CH3:24])[C:9]3[CH:23]=[CH:22][CH:21]=[CH:20][C:10]=3[CH2:11][C:12](O)([CH2:13][C:14]([O:16][CH2:17][CH3:18])=[O:15])[C:6]=2[CH:5]=1)[CH3:2].Cl>C(O)C>[CH2:1]([S:3][C:4]1[CH:26]=[CH:25][C:7]2[N:8]([CH3:24])[C:9]3[CH:23]=[CH:22][CH:21]=[CH:20][C:10]=3[CH2:11][C:12](=[CH:13][C:14]([O:16][CH2:17][CH3:18])=[O:15])[C:6]=2[CH:5]=1)[CH3:2] US03931151 1976 82.0%
[CH2:1]([S:3][C:4]1[CH:25]=[CH:24][C:7]2[N:8]([CH3:23])[C:9]3[CH:22]=[CH:21][CH:20]=[CH:19][C:10]=3[CH2:11][C:12](=[CH:13][C:14]([O:16]CC)=[O:15])[C:6]=2[CH:5]=1)[CH3:2].[OH-].[K+].Cl>C(O)C>[CH2:1]([S:3][C:4]1[CH:25]=[CH:24][C:7]2[N:8]([CH3:23])[C:9]3[CH:22]=[CH:21][CH:20]=[CH:19][C:10]=3[CH:11]=[C:12]([CH2:13][C:14]([OH:16])=[O:15])[C:6]=2[CH:5]=1)[CH3:2] |f:1.2| US03931151 1976 78.1%
[CH2:1]([S:3][C:4]1[CH:23]=[CH:22][C:7]2[N:8]([CH3:21])[C:9]3[CH:20]=[CH:19][CH:18]=[CH:17][C:10]=3[CH:11]=C(CC(O)=O)[C:6]=2[CH:5]=1)[CH3:2].[CH:24]1[C:29]([N+:30]([O-:32])=[O:31])=[CH:28][CH:27]=[C:26]([OH:33])[CH:25]=1.[CH:34]1(N=C=NC2CCCCC2)CCCCC1.[C:49](OCC)(=[O:51])[CH3:50]>>[CH2:1]([S:3][C:4]1[CH:23]=[CH:22][C:7]2[N:8]([CH3:21])[C:9]3[CH:20]=[CH:19][CH:18]=[CH:17][C:10]=3[CH:11]=[C:50]([C:49]([O:33][C:26]3[CH:27]=[CH:28][C:29]([N+:30]([O-:32])=[O:31])=[CH:24][CH:25]=3)=[O:51])[C:6]=2[C:5]=1[CH3:34])[CH3:2] US03931151 1976
"""
from CGRtools.files import SMILESRead
from CGRtools import smiles

# Set up the example; "w" (not "a") keeps reruns from appending duplicate rows
fname = "first_30_USPTOgrants.rsmi"
with open(fname, "w") as f:
    f.write(example_text)

# Try SMILESRead
smi_reader = SMILESRead(fname, header=True)
reader_result = smi_reader.read()
# 7 SMILES retrieved
print(len(reader_result))
for smi in reader_result:
    print(smi)

# Read line by line with smiles(), skipping the header
with open(fname, "r") as f:
    lines = f.readlines()
smiles_result = []
for line in lines[1:]:
    smi = line.split("\t")[0]
    parsed_smi = smiles(smi)
    smiles_result.append(parsed_smi)
# All 30 SMILES retrieved
print(len(smiles_result))
for smi in smiles_result:
    print(smi)
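Until the reader behaves as expected, the remaining columns can be kept as metadata with the standard `csv` module. This is a workaround sketch and says nothing about why SMILESRead drops rows. The function name `read_rsmi` and the `parse` parameter are hypothetical; `parse` defaults to returning the SMILES string unchanged, and the `smiles` function from the report could be passed in instead. The sketch also assumes the first column may carry an extension such as `|f:2.3|` after a space, as in the excerpt above.

```python
import csv
from io import StringIO


def read_rsmi(handle, parse=lambda s: s):
    """Read tab-delimited reaction SMILES, returning (parsed, metadata) pairs."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)  # first row holds the column names
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Separate the SMILES from an optional extension such as |f:2.3|.
        smi, _, extension = row[0].partition(" ")
        meta = dict(zip(header[1:], row[1:]))
        if extension:
            meta["extension"] = extension
        records.append((parse(smi), meta))
    return records


sample = "ReactionSmiles\tPatentNumber\tYear\nCCO>>CC=O |f:0.1|\tUS1\t1976\nCC>>C\tUS2\t1976\n"
records = read_rsmi(StringIO(sample))
print(len(records))  # 2
```

Because parsing is injected, a failing row can be caught and logged inside `parse` without losing the metadata of the rows that do parse.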
Implement a search for duplicates of attaching groups, and add the leaving group to them.