somaticsignatures's Introduction

SomaticSignatures

The SomaticSignatures package identifies mutational signatures of single nucleotide variants (SNVs). It provides an infrastructure for the methodology described in Nik-Zainal (2012, Cell), with flexibility in the choice of matrix decomposition algorithm.

For details about usage, installation and methodology, please refer to the SomaticSignatures package page in the Bioconductor repository.

somaticsignatures's People

Contributors

juliangehring

somaticsignatures's Issues

assessNumberSignatures function

Hi,

When I ran

gof_nmf = assessNumberSignatures(sca_mm, n_sigs, nReplicates = 5)

I got this error:

Error in (function (x) :
unused arguments (model = list(model = "NMFstd", rank = 2, target = 0), method = "random")

Would you please help me?

my sessionInfo()

R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Fedora 20 (Heisenbug)

Matrix products: default
BLAS: /home/users/xu/local/lib64/R/lib/libRblas.so
LAPACK: /home/users/xu/local/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] BSgenome.Hsapiens.1000genomes.hs37d5_0.99.1
 [2] BSgenome_1.44.0
 [3] rtracklayer_1.36.2
 [4] ggplot2_2.2.1
 [5] SomaticCancerAlterations_1.12.0
 [6] SomaticSignatures_2.12.0
 [7] VariantAnnotation_1.22.0
 [8] Rsamtools_1.28.0
 [9] Biostrings_2.44.0
[10] XVector_0.16.0
[11] SummarizedExperiment_1.6.1
[12] DelayedArray_0.2.4
[13] matrixStats_0.52.2
[14] Biobase_2.36.2
[15] GenomicRanges_1.28.2
[16] GenomeInfoDb_1.12.0
[17] IRanges_2.10.1
[18] S4Vectors_0.14.1
[19] BiocGenerics_0.22.0

loaded via a namespace (and not attached):
 [1] ProtGenerics_1.8.0            bitops_1.0-6
 [3] doParallel_1.0.10             RColorBrewer_1.1-2
 [5] httr_1.2.1                    tools_3.4.0
 [7] backports_1.0.5               R6_2.2.1
 [9] rpart_4.1-11                  Hmisc_4.0-3
[11] DBI_0.6-1                     lazyeval_0.2.0
[13] colorspace_1.3-2              nnet_7.3-12
[15] gridExtra_2.2.1               GGally_1.3.0
[17] compiler_3.4.0                graph_1.54.0
[19] htmlTable_1.9                 exomeCopy_1.22.0
[21] pkgmaker_0.22                 ggbio_1.24.0
[23] scales_0.4.1                  checkmate_1.8.2
[25] NMF_0.20.6                    proxy_0.4-17
[27] RBGL_1.52.0                   stringr_1.2.0
[29] digest_0.6.12                 foreign_0.8-68
[31] base64enc_0.1-3               dichromat_2.0-0
[33] htmltools_0.3.6               ensembldb_2.0.1
[35] htmlwidgets_0.8               rlang_0.1.1
[37] RSQLite_1.1-2                 BiocInstaller_1.26.0
[39] shiny_1.0.3                   BiocParallel_1.10.1
[41] acepack_1.4.1                 RCurl_1.95-4.8
[43] magrittr_1.5                  GenomeInfoDbData_0.99.0
[45] Formula_1.2-1                 Matrix_1.2-10
[47] Rcpp_0.12.11                  munsell_0.4.3
[49] stringi_1.1.5                 yaml_2.1.14
[51] zlibbioc_1.22.0               plyr_1.8.4
[53] AnnotationHub_2.8.1           grid_3.4.0
[55] lattice_0.20-35               splines_3.4.0
[57] GenomicFeatures_1.28.0        knitr_1.16
[59] rngtools_1.2.4                reshape2_1.4.2
[61] codetools_0.2-15              biomaRt_2.32.0
[63] XML_3.98-1.7                  biovizBase_1.24.0
[65] latticeExtra_0.6-28           pcaMethods_1.68.0
[67] data.table_1.10.4             httpuv_1.3.3
[69] foreach_1.4.3                 gtable_0.2.0
[71] reshape_0.8.6                 gridBase_0.4-7
[73] mime_0.5                      xtable_1.8-2
[75] AnnotationFilter_1.0.0        survival_2.41-3
[77] tibble_1.3.1                  OrganismDbi_1.18.0
[79] iterators_1.0.8               GenomicAlignments_1.12.1
[81] AnnotationDbi_1.38.0          registry_0.3
[83] memoise_1.1.0                 cluster_2.0.6
[85] interactiveDisplayBase_1.14.0

Question regarding Normalization

Hello,

The tool is really interesting. I have a question. I want to normalize my whole-exome samples to the genome level. I used kmerFrequency from your package and followed the example in the vignette, but I am not sure what value to use for n. The default seems to be n = 1e4. Could you please tell me how I should choose the value of "n"?

How to compute matrix H for known signatures?

Hi,

Thanks for a great package!

As I have read in many articles, there are several databases of already identified mutational signatures (for example, the COSMIC database, http://cancer.sanger.ac.uk/cosmic/signatures).
How can I compute the matrix H (the contribution of each signature to the alterations present in each sample) when the matrices M (motif matrix) and W (signatures matrix) are already given?

I think adding such a method to the SomaticSignatures package would be a good improvement.

Thanks for any advice,
Sergey Kazakov, ITMO University.
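
One possible way to approach the question above, as a minimal sketch rather than a feature of SomaticSignatures: assuming the usual NMF convention M ≈ W H (M is motifs × samples, W is motifs × signatures, H is signatures × samples), H can be estimated with W held fixed by the standard multiplicative update for the Frobenius norm. The helper name fit_contributions and the toy matrices are hypothetical and use only base R.

```r
# Hypothetical helper (not part of SomaticSignatures): estimate H >= 0 with
# W held fixed, reducing ||M - W H||_F by multiplicative updates.
fit_contributions <- function(M, W, n_iter = 1000, eps = 1e-12) {
  stopifnot(nrow(M) == nrow(W), all(M >= 0), all(W >= 0))
  # Start from a strictly positive H (signatures x samples)
  H <- matrix(1, nrow = ncol(W), ncol = ncol(M),
              dimnames = list(colnames(W), colnames(M)))
  WtM <- crossprod(W, M)  # t(W) %*% M, constant across iterations
  WtW <- crossprod(W)     # t(W) %*% W, constant across iterations
  for (i in seq_len(n_iter)) {
    # eps guards against division by zero
    H <- H * WtM / (WtW %*% H + eps)
  }
  H
}

# Toy data (made up for illustration): 4 motifs, 2 signatures, 3 samples
W <- matrix(c(0.7, 0.1, 0.1, 0.1,
              0.1, 0.1, 0.1, 0.7),
            nrow = 4,
            dimnames = list(paste0("motif", 1:4), c("S1", "S2")))
H_true <- matrix(c(10, 2, 5, 5, 1, 20),
                 nrow = 2,
                 dimnames = list(c("S1", "S2"), paste0("sample", 1:3)))
M <- W %*% H_true

H_est <- fit_contributions(M, W)
```

With real data, W would hold the published signatures (one column per signature) and its rows would need to be in the same motif order as the rows of M. A dedicated non-negative least squares solver would be an alternative to the iterative update used here.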

Question about normalisation part

Hi,

First of all, thank you for creating stable and useful software for mutational signature analysis. I spent more than a week struggling with the VCF-loading step of other tools, so thank you!

I would like to compare the mutational signatures of mutations near TF binding sites with those in the rest of the genome (genome minus the TF binding proximity regions). To do so, I need to normalise properly for the sequence context.

Can I solve my problem with section 4.8, "Extension: Normalization of Sequence Motif Frequencies"? I am thinking of supplying my TF binding regions as BED input instead of the hg19 gene exons that you describe in the introduction.

Any comment will be very useful.

Best regards,

Tunc.

Simple usage instruction

Would it be too much trouble to add some information here on GitHub about what I should expect your algorithm to do?

What does it use as input? I would like to just feed it VCF files.
What do I get as output?

about minimum number of mutations required for SomaticSignature

Dear Julian,

Thank you for creating SomaticSignatures. It is helping me a lot with my
master's degree.

I have a question about the package. The paper "Deciphering signatures of
mutational processes operative in human cancer" mentions a minimum number of
mutations required for accurate prediction. I do not see a parameter in the
package that sets such a limit. Is there a number below which I should be
cautious?

Right now, I am comparing the mutational signature of mutations in DHS regions
with that of the remaining mutations in prostate cancer. The number of
mutations in one of my samples is not very high, so I am concerned that the
signature I see may be a false positive or noise.

Any idea would be very helpful,

Best regards,

Tunc.

Making initial object for SomaticSignatures

Hi,

This is not a problem with the software but rather my own confusion.

I have a list of .vcf files from WGS containing called SNVs and indels. I have extracted information from them as shown below:

> head(mutations)
  sampleID chr     pos ref mut
1    t_005   1  928006   G   A
2    t_005   1 1649842   G   T
3    t_005   1 2020408   G   A
4    t_005   1 2031677   T   A
5    t_005   1 2901878   G   A
6    t_005   1 3039338   G   A
> 

I want to make a VRanges object from my .vcf files, or from the information I extracted from them, because I need to find signatures in my WGS samples. Could you please help me?

Thanks a lot in advance
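
A minimal sketch of how such a table might be turned into a VRanges object, assuming the VariantAnnotation package is installed and that the columns are named as in the output above. The data frame below simply re-enters the six rows shown, and the single-base filter is an assumption about how indels appear in the table.

```r
library(VariantAnnotation)  # provides VRanges(); loads IRanges as a dependency

# The six rows shown above, re-entered so the example is self-contained
mutations <- data.frame(
  sampleID = "t_005",
  chr = 1,
  pos = c(928006, 1649842, 2020408, 2031677, 2901878, 3039338),
  ref = c("G", "G", "G", "T", "G", "G"),
  mut = c("A", "T", "A", "A", "A", "A"),
  stringsAsFactors = FALSE
)

# Keep single-base substitutions only (assumes indels have longer alleles)
is_snv <- nchar(mutations$ref) == 1 & nchar(mutations$mut) == 1
snvs <- mutations[is_snv, ]

# One width-1 range per SNV; sampleNames records which sample it came from
vr <- VRanges(
  seqnames = as.character(snvs$chr),
  ranges = IRanges(start = as.integer(snvs$pos), width = 1),
  ref = snvs$ref,
  alt = snvs$mut,
  sampleNames = snvs$sampleID
)
```

The chromosome names ("1" here, as opposed to "chr1") would need to match the reference genome used later when extracting sequence contexts, for example with mutationContext() as described in the package vignette. VariantAnnotation also offers readVcfAsVRanges() for reading a VCF file directly, which may avoid the intermediate table.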
