
Comments (4)

Al-Murphy avatar Al-Murphy commented on September 13, 2024

Hey Wenjun,

I don't believe Bioconductor versions of dbSNP 156 have been created yet - @hpages may have more information but I know it took time to create dbSNP 155 so I'm not sure of the timeline for this. Sorry Hervé, do you have any thoughts on this?

Thanks,
Alan.

from mungesumstats.

hpages avatar hpages commented on September 13, 2024

Hi Alan, Wenjun,

It might take a while before I get to produce the SNPlocs.Hsapiens.dbSNP156.* packages. The approach I'm currently using for generating the SNPlocs packages has reached its limits and doesn't scale well with the ever-increasing size of dbSNP. So it would need to be revisited, e.g. by splitting the whole thing into smaller packages, by moving the data to AnnotationHub, or both. It might take a while before I get to this.

In the meantime, if you really need SNPlocs.Hsapiens.dbSNP156.GRCh37 now, you can try to forge it yourself using the scripts provided in the SNPlocsForge package. The package lacks documentation, sorry. The scripts for dbSNP156 are in inst/scripts/dbSNP156/. You first need to manually create the shell of the SNPlocs.Hsapiens.dbSNP156.GRCh37 package (use the SNPlocs.Hsapiens.dbSNP155.GRCh37 package as a template). Then run the following scripts in this order: download_json.sh, extract_snvs_from_RefSNP_json_files.sh, select_GRCh37_snvs.sh, build_GRCh37_OnDiskLongTable.sh.
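
The run order described above could be wrapped in a small driver so that a failure in one step stops the later ones. This is only a sketch: the script names and their order come from the comment above, but the clone path is hypothetical, and invoking each script with `bash` and no arguments is an assumption, since the real scripts may expect arguments or a particular working directory. It defaults to a dry run that only prints what would be executed.

```python
import subprocess
from pathlib import Path

# Script names and order are taken from the comment above.
SCRIPTS = [
    "download_json.sh",
    "extract_snvs_from_RefSNP_json_files.sh",
    "select_GRCh37_snvs.sh",
    "build_GRCh37_OnDiskLongTable.sh",
]


def run_forge_scripts(scripts_dir, dry_run=True):
    """Run the dbSNP156 scripts in order, stopping at the first failure.

    Assumption: each script can be started as `bash <script>` with no
    arguments. Check the scripts themselves before relying on this.
    """
    scripts_dir = Path(scripts_dir)
    commands = [["bash", str(scripts_dir / name)] for name in SCRIPTS]
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit,
            # so later steps never run on top of a failed earlier step.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical location of a local SNPlocsForge clone.
    run_forge_scripts("SNPlocsForge/inst/scripts/dbSNP156", dry_run=True)
```

Pass `dry_run=False` only after the package shell has been created and the printed commands have been checked against the scripts.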

Note that you'll need a powerful Linux machine to run these scripts (I used a machine with 80 logical CPUs and 384 GB of RAM to forge the SNPlocs.Hsapiens.dbSNP155.* packages, and the scripts took about 1 week for each package). You'll also need a lot of disk space (300 or 400 GB or something like that).
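
Since a run can take about a week, it may be worth checking the machine before starting. The sketch below is a hypothetical pre-flight check, not part of SNPlocsForge. The 400 GB default is taken from the rough disk estimate above. The CPU threshold is an arbitrary placeholder, because the 80 CPUs mentioned above describe the machine that was used and not a stated minimum. RAM is not checked here.

```python
import os
import shutil


def preflight(path=".", min_free_gb=400, min_cpus=1):
    """Return a list of human-readable problems; an empty list means OK.

    Assumptions: `path` is on the filesystem where the scripts will write
    their output, and the thresholds are rough guesses, not requirements
    documented by SNPlocsForge.
    """
    problems = []

    # shutil.disk_usage reports sizes in bytes.
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < min_free_gb:
        problems.append(
            f"only {free_gb:.0f} GB free at {path!r}, want {min_free_gb} GB"
        )

    # os.cpu_count() can return None when the count is undetermined.
    cpus = os.cpu_count() or 1
    if cpus < min_cpus:
        problems.append(f"only {cpus} logical CPUs, want {min_cpus}")

    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print("warning:", problem)
```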

Let me know if you decide to give it a try and I'll do my best to help.

Best,
H.


Al-Murphy avatar Al-Murphy commented on September 13, 2024

Hey Hervé,

Thanks very much for the explanation. This is not something I have the time or resources to do right now, but I do believe it's important to find a more manageable way to produce these packages with subsequent releases. I'll get in touch with any suggestions on how to do this in the future.

Cheers,
Alan.


Al-Murphy avatar Al-Murphy commented on September 13, 2024

Let's leave this open for now, since it has not been addressed in any meaningful way.

