Comments (14)
Gee, that's all more complicated than it should be. I believe there is still something wrong with the MANIFEST, maybe it needs recursive-include instead of include or something like that. I will do some tests on Monday. Thanks for the reports!
from codon-usage-tables.
This may have been a problem with the MANIFEST file. I have now added this file to the manifest and pushed a new version on pypi. Could you try again and let me know if it works?
Looks like we are hitting a different error now
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/private/var/folders/_f/5kqpr8kx5zl9qjtkzzkb31xw0000gn/T/pip-install-m1m1kpun/python-codon-tables/setup.py", line 19, in <module>
    with open(os.path.join('python_codon_tables', 'README.rst'), 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'python_codon_tables/README.rst'
The MANIFEST, again... Sorry for these errors. They are trivial, but they are not caught by the test suite because it doesn't use pip. I'll fix it.
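One way to catch this class of packaging error without going through pip is to inspect the source distribution archive directly. The following is a minimal sketch, assuming the sdist is a .tar.gz archive such as the one produced by python setup.py sdist. The helper name missing_from_sdist and the list of required files are hypothetical and not part of this project's test suite.

```python
import tarfile


def missing_from_sdist(sdist_path, required_files):
    """Return the required files that are absent from a .tar.gz sdist.

    Paths in an sdist are prefixed with a top-level "<name>-<version>/"
    directory, so that first component is stripped before comparing.
    """
    with tarfile.open(sdist_path, "r:gz") as archive:
        packaged = {
            member.name.split("/", 1)[1]
            for member in archive.getmembers()
            if member.isfile() and "/" in member.name
        }
    return sorted(set(required_files) - packaged)
```

A test could then assert that the function returns an empty list for every file that setup.py reads at install time, such as python_codon_tables/README.rst.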
I fixed it on GitHub but I'll only push the new version to PyPI tomorrow. In the meantime you can also install directly from GitHub with pip:
pip install git+https://github.com/Edinburgh-Genome-Foundry/codon-usage-tables.git
Thanks!
Done. Let me know if I can close this one.
Pip install works, but importing fails:
FileNotFoundError: [Errno 2] No such file or directory: '.../miniconda3/envs/codon_harmony/lib/python3.7/site-packages/python_codon_tables/../data/tables'
OK, I managed to reproduce your bug and to fix it, at least on my machine. Could you try a pip --upgrade and let me know if it now works for you (I am confident it will).
Beware that today I also changed the API in subtle ways, so make sure you use the methods highlighted in the example. On the good side, there is a new feature, table = download_codons_table(taxid=XXX), which allows you to get the table for any taxID.
Great! It seems to be working now. I am going to try to integrate this into my project https://github.com/weitzner/codon_harmony Thanks!
Hey it seems that your project could use DnaChisel, a generic DNA optimization library for Python which I am very proud of (I am surely biased!)
Here is how (some of) your project's specifications would be formulated using DnaChisel. Some specifications might not be exactly as you want them (in particular there has been some discussion around the codon harmonization) but the library is written so as to be easily extended by the user, so maybe it could work for you:
import dnachisel as dc

# GENERATE A RANDOM PROTEIN SEQUENCE FOR THE EXAMPLE
aa_sequence = dc.random_protein_sequence(1000)
dna_sequence = dc.reverse_translate(aa_sequence)

# SPECIFY THE CONSTRAINTS AND OBJECTIVES
problem = dc.DnaOptimizationProblem(
    sequence=dna_sequence,
    constraints=[
        dc.EnforceTranslation(translation=aa_sequence),  # keep the protein sequence
        dc.EnforceGCContent(mini=0.3, maxi=0.7, window=70),
        dc.AvoidHairpins(stem_size=10),
        dc.AvoidPattern(dc.repeated_kmers(3, 3)),
        dc.AvoidPattern(dc.repeated_kmers(9, 2)),
        dc.AvoidPattern(enzyme='BsmBI'),
        *(dc.AvoidPattern(dc.homopolymer_pattern(c, 6)) for c in "ATGC")
    ],
    objectives=[
        dc.CodonOptimize(species='e_coli')
    ]
)

# SOLVE THE CONSTRAINTS, THEN OPTIMIZE
print("BEFORE:", problem.constraints_text_summary())
problem.resolve_constraints()
problem.optimize()
print("AFTER:", problem.constraints_text_summary())
Output:
BEFORE: ===> FAILURE: 5 constraints evaluations failed
✔PASS ┍ EnforceTranslation[0-3000(+)]
│ All OK.
✔PASS ┍ EnforceGCContent[0-3000(+)](mini:0.30, maxi:0.70, window:70)
│ Passed !
✔PASS ┍ AvoidHairpins[0-3000(+)](stem_size:10, hairpin_window:200)
│ Score: 0. Locations: []
FAIL ┍ AvoidPattern[0-3000(+)](([ATGC]{3})\1{2} (3-repeats 3-mers))
│ Failed. Pattern found at positions [97-106(+), 98-107(+), 99-108(+),
│ 100-109(+), 172-181(+), 1453-1462(+), 1454-1463(+), 2967-2976(+)]
FAIL ┍ AvoidPattern[0-3000(+)](([ATGC]{9})\1{1} (2-repeats 9-mers))
│ Failed. Pattern found at positions [1420-1438(+)]
FAIL ┍ AvoidPattern[0-3000(+)](enzyme:BsmBI)
│ Failed. Pattern found at positions [1594-1600(-), 337-343(-)]
FAIL ┍ AvoidPattern[0-3000(+)](AAAAAA)
│ Failed. Pattern found at positions [790-796(+), 1226-1232(+),
│ 1963-1969(+), 1964-1970(+), 1965-1971(+), 2206-2212(+), 2207-2213(+),
│ 2810-2816(+)]
FAIL ┍ AvoidPattern[0-3000(+)](TTTTTT)
│ Failed. Pattern found at positions [2810-2816(-), 2207-2213(-),
│ 2206-2212(-), 1965-1971(-), 1964-1970(-), 1963-1969(-), 1226-1232(-),
│ 790-796(-)]
✔PASS ┍ AvoidPattern[0-3000(+)](GGGGGG)
│ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-3000(+)](CCCCCC)
│ Passed. Pattern not found !
AFTER: ===> SUCCESS - all constraints evaluations pass
✔PASS ┍ EnforceTranslation[0-3000(+)]
│ All OK.
✔PASS ┍ EnforceGCContent[0-3000(+)](mini:0.30, maxi:0.70, window:70)
│ Passed !
✔PASS ┍ AvoidHairpins[0-3000(+)](stem_size:10, hairpin_window:200)
│ Score: 0. Locations: []
✔PASS ┍ AvoidPattern[0-3000(+)](([ATGC]{3})\1{2} (3-repeats 3-mers))
│ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-3000(+)](([ATGC]{9})\1{1} (2-repeats 9-mers))
│ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-3000(+)](enzyme:BsmBI)
│ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-3000(+)](AAAAAA)
│ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-3000(+)](TTTTTT)
│ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-3000(+)](GGGGGG)
│ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-3000(+)](CCCCCC)
│ Passed. Pattern not found !
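The repeat constraints in the report above are expressed as regular expressions with a backreference, for example ([ATGC]{3})\1{2} for a 3-mer occurring three times in a row. The standard-library sketch below shows what such a pattern matches. It is only an illustration, not DnaChisel's implementation, and find_repeated_kmers is a hypothetical helper. It wraps the pattern in a lookahead so that overlapping hits are all reported, which is how consecutive positions such as 97-106 and 98-107 can both appear in the report.

```python
import re


def find_repeated_kmers(sequence, k, n_repeats):
    """Return (start, end) spans where a k-mer occurs n_repeats times in a row.

    A zero-width lookahead is used so that overlapping occurrences are all
    reported rather than only the first one.
    """
    pattern = re.compile(r"(?=(([ATGC]{%d})\2{%d}))" % (k, n_repeats - 1))
    return [(m.start(1), m.end(1)) for m in pattern.finditer(sequence)]
```

For instance, find_repeated_kmers("TTACGACGACGTT", 3, 3) reports the single span covering ACGACGACG, while a run of ten A's yields two overlapping spans.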
Wow, that seems to be very close to what I could use directly. Can a dc.DnaOptimizationProblem contain multiple dc.EnforceGCContent constraints? I have a particular way of defining "harmony" as well as a strategy for determining which codons to enrich and deplete. I'd be interested to see if we could merge the strategies somehow. Let me know if that would be of interest to you!
Yes, you can put several GC content constraints, for instance:
constraints = [
    EnforceGCContent(mini=0.4, maxi=0.6),  # global GC
    EnforceGCContent(mini=0.3, maxi=0.7, window=100),  # windowed GC
    EnforceGCContent(mini=0.2, maxi=0.9, window=30),  # smaller-windowed GC
]
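To make the difference between a global and a windowed GC constraint concrete, here is a plain-Python sketch that computes the GC fraction over every sliding window and reports the windows falling outside the allowed range. This is not DnaChisel code: gc_fraction and gc_window_violations are hypothetical helpers written only for illustration.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return sum(base in "GC" for base in sequence) / len(sequence)


def gc_window_violations(sequence, mini, maxi, window=None):
    """Return (start, end, gc) for each window whose GC content is out of range.

    With window=None the whole sequence is treated as a single window,
    which corresponds to a global GC constraint.
    """
    size = len(sequence) if window is None else window
    violations = []
    for start in range(0, max(len(sequence) - size, 0) + 1):
        chunk = sequence[start:start + size]
        gc = gc_fraction(chunk)
        if not mini <= gc <= maxi:
            violations.append((start, start + len(chunk), gc))
    return violations
```

A sequence such as "ATATATATGCGCGCGC" passes a global 0.4 to 0.6 check (its overall GC content is 0.5) yet fails a windowed check with window=8, because its AT-rich and GC-rich halves are each extreme on their own. This is why combining several constraints with different window sizes can be useful.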
Regarding the codon harmonization I would definitely be interested in whether/how your codon optimization can be ported into a Specification class. Most DnaChisel specs implement a strategy in which they "scan" the sequence, spot "underoptimal" regions, and optimize these locally, one after another, from left to right. But a new Specification class can also define its own resolution strategy and you are not obliged to follow this pattern. Could you describe briefly how your harmonization score is computed and how it is optimized?
First, a few tolerances are set: a usage frequency below which a codon will be excluded from the set (currently defaults to 0.10, so if a codon is used < 10% of the time, it is not considered here), and a maximum allowed deviation from the host profile (1 + relax in the other package).
To compute the ideal codon usage, the codon usage tables are updated (rare codons are dropped, frequencies are recomputed), and then, using the AA sequence, the desired use of each codon (as integers) is calculated. After this, the current DNA sequence is scanned with each codon's position(s) and count recorded, and the residual of the expected usage vs. observed usage is computed. And then, basically, you just go through the list of codons that are over-represented and replace them with those that are under-represented.
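The steps described above can be sketched roughly as follows. This is a loose illustration and not the actual code of the other package: the function names are hypothetical, the table is assumed to map each amino acid to a {codon: frequency} dictionary, the integer targets are rounded with a largest-remainder rule (an assumption), replacements are made deterministically from left to right (also an assumption), and the maximum-deviation tolerance is left out.

```python
from collections import Counter


def drop_rare_codons(table, min_frequency=0.10):
    """Remove codons used less often than min_frequency and renormalize."""
    cleaned = {}
    for amino_acid, codons in table.items():
        kept = {c: f for c, f in codons.items() if f >= min_frequency}
        total = sum(kept.values())
        cleaned[amino_acid] = {c: f / total for c, f in kept.items()}
    return cleaned


def desired_counts(aa_sequence, table):
    """Integer number of times each codon should appear for this protein."""
    targets = {}
    for amino_acid, n_residues in Counter(aa_sequence).items():
        codons = table[amino_acid]
        counts = {c: int(f * n_residues) for c, f in codons.items()}
        # Hand out the residues lost to truncation, largest remainder first.
        by_remainder = sorted(
            codons, key=lambda c: codons[c] * n_residues - counts[c], reverse=True
        )
        for codon in by_remainder[: n_residues - sum(counts.values())]:
            counts[codon] += 1
        targets.update(counts)
    return targets


def harmonize(dna_sequence, aa_sequence, table):
    """Swap over-represented codons for under-represented synonymous ones."""
    codons = [dna_sequence[i:i + 3] for i in range(0, len(dna_sequence), 3)]
    targets = desired_counts(aa_sequence, table)
    observed = Counter(codons)
    for position, codon in enumerate(codons):
        if observed[codon] <= targets.get(codon, 0):
            continue  # residual is not positive: codon is not over-represented
        for candidate in table[aa_sequence[position]]:
            if observed[candidate] < targets[candidate]:
                codons[position] = candidate
                observed[codon] -= 1
                observed[candidate] += 1
                break
    return "".join(codons)
```

Codons that were dropped as rare have a target of zero, so any occurrence of them counts as over-represented and gets replaced when a synonymous codon is still below its target.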
After all that, the codon adaptation index (CAI) is computed, and among the sequences that match the host profile and don't have the undesirable features, the one with the highest CAI is written to disk.
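For reference, the codon adaptation index is conventionally the geometric mean, over all codons of the sequence, of each codon's relative adaptiveness (its frequency divided by the frequency of the most used synonymous codon). The sketch below implements that textbook definition. Whether the other package computes it in exactly this way is an assumption, and codon_adaptation_index is a hypothetical name.

```python
import math


def codon_adaptation_index(dna_sequence, table):
    """Geometric mean of each codon's relative adaptiveness.

    table maps amino acid -> {codon: frequency}. A codon's relative
    adaptiveness is its frequency divided by that of the most frequent
    synonymous codon. Codons with zero frequency are not handled here.
    """
    weights = {}
    for codons in table.values():
        best = max(codons.values())
        for codon, frequency in codons.items():
            weights[codon] = frequency / best
    codons = [dna_sequence[i:i + 3] for i in range(0, len(dna_sequence), 3)]
    log_sum = sum(math.log(weights[codon]) for codon in codons)
    return math.exp(log_sum / len(codons))
```

A sequence made only of the most frequent codon for each amino acid scores 1.0, and every substitution by a less frequent synonymous codon lowers the score.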