cMonkey biclustering R scripts, version 1.1

 Thu Dec  8 18:19:55 PST 2005
 Authors: David J. Reiss, ISB    (c) 2005-2006
          Richard Bonneau, ISB
 http://halo.systemsbiology.org/cmonkey

This code is available for non-profit use only, free to be used and modified. If results derived using this method are published, please cite: Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks by David J Reiss, Nitin S Baliga, Richard Bonneau BMC Bioinformatics 2006, 7:280 (2 June 2006)

Currently supported organisms: halo, hpy, ecoli, yeast. All data and code are provided as R-Data binary "object" files. The source for the functions included in these files are visible using R (http://www.r-project.org) and may be easily updated/overridden. Installation instructions below have only been tested on UNIX-based OSes (Linux, OSX).

INSTALLATION / RUNNING:

Download the relevant organism's RData file and place in a directory called "data/"
Download the "cMonkey_v1.1_source.RData" and "cMonkey.R" files and place them in the current directory
Download (and compile if necessary) MEME and MAST (http://metameme.sdsc.edu/mhmm-download.html)
Create links to "meme" and "mast" from within the "progs/" directory.
Start R, and in the R environment:

 6. >   install.packages("brlr",repos="http://cran.r-project.org") ## install brlr pkg if not installed
 7. >   load("cMonkey_v1.1_source.RData")  load all cMonkey code
 8. >   run()   enter organism to run on

NOTES: A. Ignore all warning messages printed to the screen - only worry if it dies with an error. B. All output is placed in the output/ORGANISM/ directory, along with PDF files containing bicluster plots and statistics generated during the optimization. Also generated are RData dump files, for re-starting a stopped/killed session and for input to run the Inferelator (regulatory network inference algorithm). C. You may kill/restart a run at any time. To completely re-start a new job, delete the output/ORGANISM directory. D. At any time you may plot a bicluster (e.g. 2) by typing (in R, after running step 7 above):

    >   start.up(); load.latest(); plotCluster.motif( clusterStack[[2]] )

E. Or, to plot them all to a PDF, instead of the "plotCluster.motif()" function, type:

    >   plot.clusters( clusterStack )  PDF is placed in output/ORGANISM

ADVANCED USAGE: if you want to adjust the cMonkey parameters, between steps (7) and (8) above, insert:

 a. >   start.up(); detach( params )   ## enter the organism to run on
 b. *  change the parameter that you wish to modify in the "params" list; e.g.
       > params$kmax <- 100   ## sets the number of clusters to be learned
 c. >   attach( params )

OTHER ORGANISMS: if you wish to use cMonkey on your own data for another organism, you need to inspect the "global.data" data structure, and replace the sub-structures seen there. I will be happy to offer help/suggestions/advice.

PARALELLIZATION: Additional headaches arise if you want run cMonkey parallelized. I have not included the necessary ingredients in this distribution, but I will, if anyone expresses a desire to do so. The extra steps include:

installing pvm for your machine
setting up pvm-related environment variables
installing the rpvm and snow R libraries
loading the snow-utils code (not included here)
editing the test.opt.all.clusters() function to uncomment the snow-related functions

CONTACT for assistance: David J. Reiss, ISB -- [email protected]

djreiss / cmonkey1 Goto Github PK

cmonkey1's Introduction

cMonkey biclustering R scripts, version 1.1

cmonkey1's People

Contributors

Watchers

Forkers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent