
Comments (4)

Zulko commented on July 21, 2024

I hit this problem too yesterday, so I added an optional "timeout" to my local version of the Python library so that it won't just hang when connecting to Kazusa. I will push it online soon. I don't know of any mirror for Kazusa, so if it proves unreliable, the only long-term workaround is to add more tables to the tables folder.
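For illustration, a timeout of this kind could look roughly like the sketch below. It is not the library's actual code: the function name `fetch_with_timeout`, its `opener` parameter, and the choice to return `None` on failure are all assumptions made for the example. Only `urllib.request.urlopen` and its `timeout` argument are standard.

```python
import socket
import urllib.error
import urllib.request


def fetch_with_timeout(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return the response body as text, or None if the request fails.

    Hypothetical helper: `opener` exists only so the network call can be
    swapped out (for example in tests); it defaults to urlopen.
    """
    try:
        # `timeout` is in seconds; without it the call can block indefinitely
        # when the remote server accepts the connection but never answers.
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, socket.timeout, TimeoutError):
        # Unreachable or too slow: let the caller fall back to a local table.
        return None
```

A caller could then fall back to the tables shipped in the tables folder whenever the function returns `None`.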

And thanks for the support. Don't hesitate to reach out about anything related to our software (I'm always glad to learn it can be helpful outside of our foundry), or if you ever need a biofoundry - we're always happy to discuss projects. We're also hiring a computational person at the moment, in case you know anyone willing to move to Edinburgh (great environment, great city).

from codon-usage-tables.

simone-pignotti commented on July 21, 2024

Hi,

I have been experiencing problems with Kazusa too lately. If you are still considering adding more tables to the data folder, I suggest switching to Hive as the source. The tables there are based on many more CDSs than Kazusa's, and the DB is regularly updated. Unfortunately it doesn't seem to offer programmatic access to its resources, but it should be fairly easy to parse the Refseq_CDS.tsv (or genbank_CDS.tsv) file from the page "Available Files to Download". The total size for RefSeq is 96 MB, but it seems to contain strain-level tables too.
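As a rough idea of what that parsing might involve, here is a minimal sketch. It assumes the file is tab-separated with a header row, and that codon count columns are named by their three-letter DNA codon (for example `AAA`); neither assumption has been checked against the real Hive download, and `read_codon_rows` is a hypothetical name.

```python
import csv
import itertools

# The 64 DNA codons, used to tell codon columns apart from metadata columns.
CODONS = {"".join(c) for c in itertools.product("ACGT", repeat=3)}


def read_codon_rows(tsv_lines):
    """Yield (metadata, codon_counts) pairs from an iterable of TSV lines.

    Assumed layout: one header row, one record per row, and integer counts
    in columns whose header is a codon.
    """
    reader = csv.DictReader(tsv_lines, delimiter="\t")
    for row in reader:
        counts = {k: int(v) for k, v in row.items() if k in CODONS and v}
        meta = {k: v for k, v in row.items() if k not in CODONS}
        yield meta, counts
```

Because the function accepts any iterable of lines, it can be fed an open file handle directly, so the 96 MB file never has to be loaded into memory at once.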

I am available to help with this if you like the idea. I would cluster at the species level and add tables with the average frequencies from their clades to the data folder. If it turns out to be too big for distribution, there could be an option to download and build it on the first run of the library.
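The species-level averaging could be sketched as below. This is one possible reading of the idea, not an agreed design: each strain's counts are first turned into fractions of that strain's total codon count, and the fractions are then averaged with equal weight per strain. The function name and the input shape (pairs of species name and count dictionary) are assumptions.

```python
from collections import defaultdict


def average_frequencies_by_species(entries):
    """Average per-strain codon frequencies within each species.

    `entries` is an iterable of (species_name, {codon: count}) pairs,
    one pair per strain-level table.
    """
    sums = defaultdict(lambda: defaultdict(float))
    n_strains = defaultdict(int)
    for species, counts in entries:
        total = sum(counts.values())
        if total == 0:
            continue  # skip empty tables rather than divide by zero
        for codon, count in counts.items():
            sums[species][codon] += count / total
        n_strains[species] += 1
    return {
        species: {codon: s / n_strains[species] for codon, s in codon_sums.items()}
        for species, codon_sums in sums.items()
    }
```

Weighting every strain equally keeps heavily sequenced strains from dominating; pooling the raw counts instead would weight strains by their number of CDSs.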

Simone


Zulko commented on July 21, 2024

Hello,

If Kazusa keeps giving you problems, I agree that having more tables in the repo (potentially from Kazusa) would be a good idea.

The one thing I worry about with adding Hive as a source is that it may confuse users as to which database (Kazusa or Hive) they are using. Plus, I am not sure how the Washington Uni would react to us hosting a big chunk of their data.

I see two possible approaches:

  • Add Hive tables to this project, separated from the Kazusa ones, and let users choose which database they want to use via a global setting in their Python script. If the Hive codon tables are big (say, more than 10 MB), their download should be a pip install option.
  • Or fork this project under "hive codon tables" and make a Hive-specific repo (maybe with the same APIs as this one, so we can easily switch in projects).
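To make the first approach concrete, a global setting could look something like the sketch below. Every name here (`SETTINGS`, `set_database`, `register_table`, `get_table`) is hypothetical and does not exist in the current library; the point is only that the two databases stay in separate namespaces and the user picks one explicitly.

```python
# Hypothetical module-level state; tables from each source are kept apart.
SETTINGS = {"database": "kazusa"}
_TABLES = {"kazusa": {}, "hive": {}}


def set_database(name):
    """Select which database later lookups use by default."""
    if name not in _TABLES:
        raise ValueError(f"Unknown database {name!r}; expected one of {sorted(_TABLES)}")
    SETTINGS["database"] = name


def register_table(database, organism, table):
    """Store a codon table under the given database."""
    _TABLES[database][organism] = table


def get_table(organism, database=None):
    """Return the table for `organism` from the selected database.

    Raises KeyError if that database has no table for the organism, so a
    lookup never silently falls through to the other source.
    """
    database = database or SETTINGS["database"]
    return _TABLES[database][organism]
```

Raising on a missing table, rather than falling back to the other database, addresses the concern about users not knowing which source they are using.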

Two final remarks:

  • I am not at the EGF anymore (@veghp is taking over the software projects of the foundry) and I am not sure how much I'll be able to help beyond commenting on questions and PRs (I'll find out soon).
  • There is a Synbiopython project which should get support for codon tables soon. Tagging @neilswainston in the discussion for awareness.


