
Comments (9)

wir963 commented on August 29, 2024

Hey @WuyangFF95 ,

I'm currently working on a demo, which is on #6. I think the demo directory on that branch should answer your question. I'd appreciate any feedback on the demo to make it more user-friendly.

Best, Welles

from tcsm.

WuyangFF95 commented on August 29, 2024

I cannot run run_stm.R in my Anaconda environment:

(tcsm) [wuyang@monster tcsm]$ Rscript src/run_stm.R -h
stm v1.3.3 (2018-1-26) successfully loaded. See ?stm for help.
Papers, resources, and other materials at structuraltopicmodel.com
Error: object 'snakemake' not found
Execution halted


WuyangFF95 commented on August 29, 2024

Another question: can I change the random seed before running? Thanks!


wir963 commented on August 29, 2024

@WuyangFF95 Are you on the demo branch? It looks like you're still on the master branch. On the demo branch, you will also need to update the environment to include argparse (conda install -c conda-forge r-argparse).

Yep, you can use whatever seed you want. The seed is only there so that our results from the paper can be reproduced.
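
As a generic illustration of why a fixed seed makes a run reproducible, here is a minimal Python sketch. It is not TCSM's R code, and the function name draw_initial_values is made up for this example.

```python
import random


def draw_initial_values(seed, n=5):
    """Return n pseudo-random numbers from a generator seeded with `seed`."""
    # A private generator keeps the example independent of global random state.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


same_a = draw_initial_values(seed=123)
same_b = draw_initial_values(seed=123)
other = draw_initial_values(seed=456)

# Same seed: the two sequences are identical.
print(same_a == same_b)
# Different seed: the sequences differ.
print(same_a == other)
```

Any seed gives a valid run; reusing the same seed is only needed when an earlier result has to be reproduced.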


WuyangFF95 commented on August 29, 2024

I managed to open the help message by switching to the demo branch:
(tcsm) [wuyang@monster demo]$ Rscript ../src/run_stm.R -h

usage: ../src/run_stm.R [-h] [-m M] [-c C] [-e EXPOSURES]
                        [--signatures SIGNATURES] [--effect EFFECT]
                        [--sigma SIGMA] [--gamma GAMMA] [-s S]
                        [--covariates COVARIATES] [-k K]

optional arguments:
  -h, --help            show this help message and exit
  -m M                  mutation count input file
  -c C                  covariate input file
  -e EXPOSURES, --exposures EXPOSURES
                        normalized exposure output file
  --signatures SIGNATURES
                        exome signature output file
  --effect EFFECT       effect output file
  --sigma SIGMA         sigma output file
  --gamma GAMMA         gamma output file
  -s S                  random seed
  --covariates COVARIATES
                        covariates (separated by +)
  -k K                  number of signatures to use

The help text says "exome signature output file". Does that mean tcsm is only suitable for exome mutation count files, and not for genome mutation count files?

Also, can you tell me what "covariates" means?


wir963 commented on August 29, 2024

@WuyangFF95

Sorry for the delay. Early in this research project, we experimented with normalizing signatures by nucleotide opportunity, which is why we differentiated between exome and genome signatures. We only used TCSM for exome signatures in the paper, but there's no reason why TCSM wouldn't work for genome signatures as well. I updated the code so that the help text now reads "signature output file".

"covariates" are factors that may influence the prior expected exposure for a signature, such as biallelic inactivation of BRCA1/2 in relation to SBS3. This is the main idea behind the method, so I'd suggest checking out the paper (https://academic.oup.com/bioinformatics/article/35/14/i492/5529117) for a more thorough discussion of covariates.

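
To make the idea of a covariate shifting the prior exposure more concrete, here is a minimal Python sketch. It assumes a logistic-normal style prior, as used in structural topic models, whose mean is a linear function of the covariates. The coefficient values, the three-signature setup, and the function names are all invented for illustration and are not taken from TCSM or the paper.

```python
import math


def softmax(values):
    """Map real-valued scores to proportions that sum to 1."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def prior_center(covariate_row, gamma):
    """Softmax of the prior mean for one sample.

    covariate_row: covariate values, starting with an intercept of 1.0.
    gamma: one list of coefficients per signature (hypothetical values).
    """
    eta = [sum(x * g for x, g in zip(covariate_row, coeffs)) for coeffs in gamma]
    return softmax(eta)


# Hypothetical coefficients: [intercept, effect of BRCA1/2 inactivation].
gamma = [
    [0.0, 2.0],  # an SBS3-like signature, raised when the covariate is 1
    [0.5, 0.0],  # a signature unaffected by the covariate
    [0.0, 0.0],  # reference signature
]

without_brca = prior_center([1.0, 0.0], gamma)
with_brca = prior_center([1.0, 1.0], gamma)

print([round(p, 3) for p in without_brca])
print([round(p, 3) for p in with_brca])
```

In this toy setup the first signature takes a larger share of the prior when the covariate is switched on, which is the kind of effect the covariates are meant to capture.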

WuyangFF95 commented on August 29, 2024

If no covariate is available, is it okay to omit it?


wir963 commented on August 29, 2024

@WuyangFF95

Sorry for the delay. It's okay to omit the covariate. Doing so may have been a little complicated with the previous code base, so I simplified it in the PR above. That said, the key advantage of TCSM over other models is its use of covariates.
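
As a rough sketch of how the two kinds of run might be assembled, the Python helper below builds an argument list from the flags shown in the help message earlier in the thread. It assumes that omitting the covariate simply means leaving out -c and --covariates, which the thread does not confirm. The helper name, all file names, and the covariate names are hypothetical, and the remaining output flags (--effect, --sigma, --gamma) are left out for brevity.

```python
import shlex


def build_run_stm_command(mutation_counts, k, seed, exposures_out,
                          signatures_out, covariate_file=None, covariates=None):
    """Return an argument list for run_stm.R using flags from its help text."""
    cmd = [
        "Rscript", "../src/run_stm.R",
        "-m", mutation_counts,
        "-k", str(k),
        "-s", str(seed),
        "--exposures", exposures_out,
        "--signatures", signatures_out,
    ]
    # Assumption: a run without covariates just drops these two flags.
    if covariate_file is not None and covariates:
        cmd += ["-c", covariate_file, "--covariates", "+".join(covariates)]
    return cmd


# Hypothetical file and covariate names, for illustration only.
no_cov = build_run_stm_command("counts.tsv", 5, 123, "exposures.tsv", "signatures.tsv")
with_cov = build_run_stm_command("counts.tsv", 5, 123, "exposures.tsv", "signatures.tsv",
                                 covariate_file="covariates.tsv",
                                 covariates=["BRCA", "MSI"])

print(shlex.join(no_cov))
print(shlex.join(with_cov))
```

The printed lines can be checked by eye before being passed to a shell or to subprocess.run.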


WuyangFF95 commented on August 29, 2024

Great! I'll try it out with a specific K first.
