
gdc-tcga-file-fetcher's Introduction

File UUID fetcher for GDC/TCGA

Together, these two scripts fetch all the cases in the GDC that have both a Primary Solid Tumor sample and a Solid Tissue Normal sample sequenced.

You can filter by sequencing strategy and by tumor type (primary site).
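As a rough illustration of what such filters can look like, here is a minimal Python sketch that builds a filter in the nested `op`/`content` JSON style used by the GDC API. It is not taken from the scripts in this repository. The helper names (`in_clause`, `build_case_filter`), the field names (`primary_site`, `files.experimental_strategy`, `samples.sample_type`), and the example values are assumptions, so check them against the GDC documentation before relying on them.

```python
import json

# Example values only; set these to whatever you actually need.
strategies = ["WXS", "RNA-Seq"]
primary_sites = ["Kidney", "Ovary", "Brain"]


def in_clause(field, values):
    """Hypothetical helper: one 'field is any of values' clause."""
    return {"op": "in", "content": {"field": field, "value": list(values)}}


def build_case_filter(strategies, primary_sites, sample_type):
    """Hypothetical helper: AND together the three restrictions.

    The field names below are assumptions about the GDC cases endpoint.
    """
    return {
        "op": "and",
        "content": [
            in_clause("primary_site", primary_sites),
            in_clause("files.experimental_strategy", strategies),
            in_clause("samples.sample_type", [sample_type]),
        ],
    }


# The sample type label follows this README's wording; the exact string
# expected by the GDC may differ.
tumor_filter = build_case_filter(strategies, primary_sites, "Primary Solid Tumor")
print(json.dumps(tumor_filter, indent=2))
```

Building one filter per sample type (tumor and normal) and comparing the resulting case lists is one way to find cases that have both.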

Detailed instructions for using these scripts:

If possible, run them on an educational network such as eduroam. I'm not certain of the cause, but I have seen significant delays in API responses on private networks (AT&T in my case).

Step 1: Set the data_path variable to the location of your choice. It's immediately after this comment section.

Step 2: Decide which sequencing strategies you want, and set the strategies variable accordingly.

Step 3: Put the primary sites you want in the primary_sites variable. Some examples: Kidney, Ovary, Brain, etc.

Step 4: To get the complete intersection of files, set if_files to FALSE and run this script once.

Step 5: Then switch the if_files flag back to TRUE and run the script again. This should create some JSON files in your data directory. Please check that they are there.

Step 6: Run the "gather_gdc_uuid_links.py" script, which will create the link files.

Note: These link files have the cases which were sequenced for both normal and tumor samples.
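The idea behind that note, keeping only cases that appear under both sample types, can be sketched as a set intersection. This is an illustrative Python example, not the logic of `gather_gdc_uuid_links.py`. The record layout (`case_id`, `sample_type`), the function name `cases_with_both`, and the sample data are all hypothetical.

```python
def cases_with_both(records,
                    tumor_label="Primary Solid Tumor",
                    normal_label="Solid Tissue Normal"):
    """Return the sorted case IDs that have both a tumor and a normal sample.

    `records` is a hypothetical list of dicts with 'case_id' and
    'sample_type' keys; the real JSON files may be structured differently.
    """
    tumor_cases = {r["case_id"] for r in records if r["sample_type"] == tumor_label}
    normal_cases = {r["case_id"] for r in records if r["sample_type"] == normal_label}
    # Only cases present in both sets were sequenced for tumor and normal.
    return sorted(tumor_cases & normal_cases)


# Made-up records for illustration only.
records = [
    {"case_id": "case-A", "sample_type": "Primary Solid Tumor"},
    {"case_id": "case-A", "sample_type": "Solid Tissue Normal"},
    {"case_id": "case-B", "sample_type": "Primary Solid Tumor"},
    {"case_id": "case-C", "sample_type": "Solid Tissue Normal"},
]

paired = cases_with_both(records)
print(paired)  # ['case-A']
```

Here only `case-A` survives, because `case-B` has no normal sample and `case-C` has no tumor sample.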

Hope this helps! Thanks.

gdc-tcga-file-fetcher's People

Contributors

shivamsharma13
