
Comments (3)

asmvernon commented on July 29, 2024

Hi Arianna,
I'm glad you're finding it useful!

The query_missingGenes_from_module function returns a data frame in which the MISSING_GENES column contains a list of the genes. There are a few ways to retrieve these data and aggregate them across multiple modules, depending on what you want to do with them afterward.

Here's some code with a few options. I hope one of them helps guide you toward your goal:

library(MetQy)  # package names are case-sensitive in R
# install.packages('reshape2')

data("data_example_KOnumbers_vector")
modules <- c('M00001', 'M00002')

module_missing_genes_str <- data.frame('MODULE' = modules,
                                   'MISSING_GENES' = '',
                                   stringsAsFactors = FALSE)

# Collect pairs of modules and each missing gene by constructing the following structure:
#
#   MODULE  MISSING_GENES
#   M00001  K00001
#   M00001  K00002
#   M00001  K00003
#   M00002  K00004
module_missing_genes_pairs <- NULL
  

for(M in seq_along(modules)){
  OUT <- query_missingGenes_from_module(data_example_KOnumbers_vector, modules[M])
  # Flatten to a plain character vector in case MISSING_GENES is stored as a list column
  these_genes_list <- unlist(OUT$MISSING_GENES)
  # Collapse the genes into a single comma-separated string
  these_genes_str <- paste(these_genes_list, collapse=',')
  module_missing_genes_str$MISSING_GENES[M] <- these_genes_str
  
  # Append one row per missing gene, repeating the current module ID
  # as many times as there are missing genes
  module_missing_genes_pairs <- rbind(module_missing_genes_pairs,
                                      # combine the two vectors into pair-wise columns
                                      cbind(rep(modules[M], length(these_genes_list)), these_genes_list))
  
}
# Tidy module_missing_genes_pairs up
module_missing_genes_pairs <- data.frame(module_missing_genes_pairs, stringsAsFactors = FALSE)
names(module_missing_genes_pairs) <- c('MODULE', 'MISSING_GENES')

# Generate a matrix of modules (rows) by missing genes (columns), where 1 indicates the gene is missing for that module
module_missing_genes_pairs$Value <- 1
module_missing_genes_matrix <- reshape2::dcast(module_missing_genes_pairs, formula = MODULE ~ MISSING_GENES, value.var = "Value", drop = FALSE, fill = 0)
# Use the module ID to name the rows
rownames(module_missing_genes_matrix) <- module_missing_genes_matrix[, 1]
# Remove the column with the module IDs so that only the numeric columns remain
# (drop = FALSE keeps the table shape even if there is a single gene column)
module_missing_genes_matrix <- module_missing_genes_matrix[, -1, drop = FALSE]

You can find the script here
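
If you would rather not install reshape2, the same module-by-gene matrix can probably be built in base R with table(). This is only a minimal sketch: the `pairs` data frame below is made-up example data standing in for module_missing_genes_pairs, and the names `pairs`, `presence_matrix` and `shared_genes` are hypothetical, not part of MetQy.

```r
# Hypothetical module/gene pairs (stand-in for module_missing_genes_pairs)
pairs <- data.frame(MODULE = c('M00001', 'M00001', 'M00001', 'M00002'),
                    MISSING_GENES = c('K00001', 'K00002', 'K00003', 'K00004'),
                    stringsAsFactors = FALSE)

# Cross-tabulate the pairs: rows are modules, columns are genes,
# and each cell counts how often that module/gene pair occurs
presence <- table(pairs$MODULE, pairs$MISSING_GENES)

# Turn the counts into a plain 0/1 numeric matrix (1 = gene is missing for that module)
presence_matrix <- (unclass(presence) > 0) * 1

# Genes that are missing from more than one module (none in this toy data)
shared_genes <- colnames(presence_matrix)[colSums(presence_matrix) > 1]
```

The result is a true numeric matrix with module IDs as row names, so functions such as colSums() or heatmap() can be applied to it directly.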

Let me know if you need any further help!

Also, I'd be interested to know what you are using it for :)

All the best,
Andrea

from metqy.

arianccbasile commented on July 29, 2024

It sounds super good! I will try it asap. I was using it for gap-filling of metabolic models, but I actually abandoned that approach and used another tool in the end. Now this pipeline came to mind because I am annotating metagenome-assembled genomes, so I am back to this script.

Random question, do you plan to update the db?

All the best,
Arianna


asmvernon commented on July 29, 2024

Hi Arianna,
Hope you find it useful!

I don't plan to update the db, but you can pass updated data if you have access to it :)
Please look at the documentation on how to do this using the use_module_reference_table and use_genome_reference_table arguments.

Best,
A

