mlrepo's Introduction

Wiki Page

Visit the wiki page describing all MLRepo learning tasks.

Available Tasks

Download a single file containing all available tasks

Bacteremia

Diet

Antibiotics

Age

IBD

Gender

Vaginal

Geography

Body Habitat

Cancer

Obesity

Diabetes

Cirrhosis


Interested in adding a new dataset or task? See our instructions

If you use this package, please cite:

@article{vangay2019microbiome,
  title={Microbiome Learning Repo (ML Repo): A public repository of microbiome regression and classification tasks},
  author={Vangay, Pajau and Hillmann, Benjamin M and Knights, Dan},
  journal={GigaScience},
  volume={8},
  number={5},
  pages={giz042},
  year={2019},
  publisher={Oxford University Press}
}

mlrepo's People

Contributors

danknights, mortonjt, pvangay, smdabdoub

mlrepo's Issues

details of OTU names?

Hi developer,

I am interested in the datasets in this repo. However, I cannot find the corresponding taxonomy for each OTU.
For example, the bacteremia dataset has 1852 OTUs in otutable.txt but only 228 taxonomy entries in taxatable.txt, and there is no key column linking the two tables.
Where can I find these data?

Thanks in advance!
-Ivy

OTU representative sequences for the Ravel task

Hi,

Thank you for this excellent resource!

Are the representative sequences for each OTU available for the Ravel set of tasks? The processed sequences that are provided in the fastq file do not seem to correspond to any of the OTUs as they have names like

>SRR063057_0 SRR063057.1 FU604JZ01CTHV2 length=228

which correspond to the sample IDs (SRR063057 is a sample ID).

Thank you in advance!
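
To illustrate the naming issue described above, here is a minimal Python sketch that pulls the sample ID out of header lines shaped like the example in this issue and counts reads per sample. The regular expression and the function names `sample_id_from_header` and `reads_per_sample` are assumptions made for illustration; they are not part of MLRepo, and the sequence line in the example is a placeholder.

```python
import re
from collections import Counter

# Assumed header shape, taken from the single example quoted in this issue:
#   >SRR063057_0 SRR063057.1 FU604JZ01CTHV2 length=228
# i.e. ">" + sample ID + "_" + running index, followed by whitespace.
HEADER_RE = re.compile(r"^>(?P<sample>[^_\s]+)_(?P<index>\d+)\s")


def sample_id_from_header(header):
    """Return the sample ID prefix of a header line, or None if it does not match."""
    match = HEADER_RE.match(header)
    return match.group("sample") if match else None


def reads_per_sample(lines):
    """Count header lines per sample ID (hypothetical helper, not an MLRepo function)."""
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            sample = sample_id_from_header(line)
            if sample is not None:
                counts[sample] += 1
    return counts


example = [
    ">SRR063057_0 SRR063057.1 FU604JZ01CTHV2 length=228",
    "ACGT",  # placeholder sequence, not real data
]
print(reads_per_sample(example))
```

This only confirms that the headers carry sample IDs; it does not recover OTU representative sequences, which would still have to come from the maintainers.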

How to use the result of collapse-by-correlation?

Hello,
I am interested in using the cluster.by.correlation and collapse.by.correlation functions in MLRepo/example/lib/collapse-features.r to reduce the number of features (variables) fed into a PCA or cluster analysis. I was able to run the functions, and they returned the group assignment (variable 1 belongs to group 1, variable 2 belongs to group 2, etc.).

How can I use the group assignment in further analysis? Would it return a new dataset with collapsed variables (i.e. highly correlated variables thinned into one)? Or do I pick one variable myself for downstream analysis, for example variable 1 out of variables 1, 5, and 9 in group 2?

I hope this makes sense. Looking forward to hearing from you.
Rie
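
Both options raised in the question can be sketched generically. The Python example below takes a group label per feature and either keeps one representative per group or averages the group members. It is an illustration under stated assumptions, not a description of what collapse.by.correlation in MLRepo actually returns: the function `collapse_by_groups`, the choice of "highest mean" as the representative rule, and the toy values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def collapse_by_groups(columns, groups, how="representative"):
    """Collapse feature columns given one group label per feature.

    columns: dict mapping feature name -> list of values (one per sample)
    groups:  dict mapping feature name -> group label
    how:     "representative" keeps the group member with the highest mean
             (an assumed rule); "mean" averages the members sample by sample.
    """
    # Gather the feature names that belong to each group.
    members = defaultdict(list)
    for name in columns:
        members[groups[name]].append(name)

    collapsed = {}
    for label, names in members.items():
        if how == "representative":
            best = max(names, key=lambda n: mean(columns[n]))
            collapsed[best] = list(columns[best])
        elif how == "mean":
            per_sample = zip(*(columns[n] for n in names))
            collapsed[f"group_{label}"] = [mean(values) for values in per_sample]
        else:
            raise ValueError(f"unknown method: {how}")
    return collapsed


# Toy data mirroring the question: variables 1, 5 and 9 share group 2.
features = {
    "v1": [1.0, 2.0, 3.0],
    "v5": [2.0, 4.0, 6.0],
    "v9": [0.5, 1.0, 1.5],
    "v2": [3.0, 1.0, 2.0],
}
assignment = {"v1": 2, "v5": 2, "v9": 2, "v2": 1}

print(collapse_by_groups(features, assignment))              # one feature kept per group
print(collapse_by_groups(features, assignment, how="mean"))  # one averaged column per group
```

Keeping a representative preserves an interpretable original variable, while averaging uses all members of a group; which is preferable depends on the downstream analysis.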
