yixuan / RSpectra
R Interface to the Spectra Library for Large Scale Eigenvalue and SVD Problems
Home Page: http://cran.r-project.org/package=RSpectra
I'm getting this when trying to install on Ubuntu 16.04.
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
/usr/lib/R/etc/Makeconf:168: recipe for target 'eigs_gen.o' failed
make: *** [eigs_gen.o] Error 4
ERROR: compilation failed for package ‘RSpectra’
* removing ‘/usr/local/lib/R/site-library/RSpectra’
Error in i.p(...) :
(converted from warning) installation of package ‘/tmp/RtmpWtVm0n/filee627acba27e/RSpectra_0.13-1.tar.gz’ had non-zero exit status
Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p
Execution halted
Is it possible to implement the function interface for svds? Or is there a non-trivial way to use the eigs function interface to calculate the SVD with a function? I believe one would need to provide two functions: A %*% x and t(A) %*% x. At least that is what the PROPACK library has.
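One way to approach this with the symmetric function interface is to run eigs_sym on the Gram operator x -> t(A) %*% (A %*% x), which needs only the two products mentioned above. The following is a minimal sketch, assuming eigs_sym accepts a function of the form function(x, args) together with n and args; gram_op and the random test matrix are illustrative names, not part of the package. Working with t(A) %*% A squares the singular values, so small singular values may lose accuracy.

```r
set.seed(1)
A <- matrix(rnorm(200 * 30), 200, 30)

# Operator for t(A) %*% A, built from the products A %*% x and t(A) %*% y.
gram_op <- function(x, args) {
  as.numeric(crossprod(args$A, args$A %*% x))
}

k <- 3
if (requireNamespace("RSpectra", quietly = TRUE)) {
  e <- RSpectra::eigs_sym(gram_op, k, n = ncol(A), args = list(A = A))
  d <- sqrt(e$values)               # singular values of A
  v <- e$vectors                    # right singular vectors
  u <- sweep(A %*% v, 2, d, "/")    # left singular vectors, u = A v / d
  print(all.equal(d, svd(A)$d[1:k]))
}
```

The left singular vectors are recovered from the right ones, so only one eigenproblem of size ncol(A) is solved.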
The dsCMatrix and dsRMatrix classes are the only useful ones you don’t support yet. They’re like dgCMatrix/dgRMatrix except that only one triangle is stored and a uplo ∈ {'U', 'L'} slot exists to signify which.
The dgRMatrix docs say “These "..RMatrix" classes are currently still mostly unimplemented!”, but it still can’t hurt to implement a 2-line method for them.
PS: Thank you for the amazing package, I wish it had been there from the beginning!
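Until such methods exist in the installed version, a possible workaround is to expand the symmetric storage to general storage before calling the solver. This is only a sketch: the small tridiagonal matrix is made up for illustration, and the coercion target "generalMatrix" is the virtual class from the Matrix package.

```r
library(Matrix)

# 5 x 5 symmetric tridiagonal matrix; with symmetric = TRUE only the upper
# triangle is supplied and the result uses symmetric (ds*) storage.
S <- sparseMatrix(
  i = c(1:5, 1:4),
  j = c(1:5, 2:5),
  x = c(rep(2, 5), rep(-1, 4)),
  symmetric = TRUE
)
print(class(S))
print(S@uplo)  # which triangle is stored

# Expand to general storage holding both triangles.
G <- as(S, "generalMatrix")
print(class(G))

if (requireNamespace("RSpectra", quietly = TRUE)) {
  print(RSpectra::eigs_sym(G, k = 1)$values)
}
```

The cost of this workaround is roughly a doubling of the stored off-diagonal entries.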
Say I know the total variance of the matrix, and want to get 99% of that (let's call that T).
Instead of specifying k, would it be possible to implement a version of e.g. eigs_sym() that stops when sum(eigs$values^2) is larger than T?
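Without such a stopping rule inside the solver, the behaviour can be approximated from the outside by doubling k until the criterion is met. The sketch below is hypothetical (top_k_values and eigs_until are illustrative helpers, not RSpectra functions), it restarts the solver on every round rather than continuing it, and the k it ends with is generally not the smallest one that satisfies the threshold. It uses the criterion sum(values^2) >= T exactly as written in the question and assumes a positive semi-definite matrix.

```r
# Top-k eigenvalues, via RSpectra when available, else via base eigen().
top_k_values <- function(A, k) {
  if (requireNamespace("RSpectra", quietly = TRUE) && k < nrow(A)) {
    RSpectra::eigs_sym(A, k)$values
  } else {
    eigen(A, symmetric = TRUE, only.values = TRUE)$values[seq_len(k)]
  }
}

# Double k until the squared eigenvalues reach the target.
eigs_until <- function(A, target, k = 2) {
  repeat {
    k <- min(k, nrow(A))
    vals <- top_k_values(A, k)
    if (sum(vals^2) >= target || k == nrow(A)) return(vals)
    k <- 2 * k
  }
}

set.seed(1)
X <- matrix(rnorm(100 * 8), 100, 8) %*% diag(c(10, 5, 1, 1, 1, 1, 1, 1))
A <- crossprod(X)
target <- 0.99 * sum(A^2)  # sum of squared eigenvalues of a symmetric matrix
vals <- eigs_until(A, target)
print(length(vals))
```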
It seems the selection of eigenvalues and eigenvectors by 'eigs_sym' is unstable? (note that d = 2 below)
I cross-checked the result with the "eigen" function from the base package. The results there are stable. (so it's probably not a data issue.)
I attached the data for your information. It's in binary format; one can use the "load" function to read the file. Thanks.
big_matrix.zip
Sorry the reprex isn't more informative. Is anyone else having this issue?
devtools::install_github("yixuan/RSpectra")
#> Downloading GitHub repo yixuan/RSpectra@master
#>
#>
v checking for file 'C:\Users\alex\AppData\Local\Temp\RtmpOiJbBt\remotes17ac66a9851\yixuan-RSpectra-1bd1337/DESCRIPTION' (404ms)
#>
- preparing 'RSpectra':
v checking DESCRIPTION meta-information
#> - cleaning src
#>
- checking for LF line-endings in source and make files and shell scripts (507ms)
#>
- checking for empty or unneeded directories
#>
- building 'RSpectra_0.15-0.tar.gz'
#>
#>
#> Installing package into 'C:/Users/alex/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> Error in i.p(...): (converted from warning) installation of package 'C:/Users/alex/AppData/Local/Temp/RtmpOiJbBt/file17ac6ad46b7f/RSpectra_0.15-0.tar.gz' had non-zero exit status
Created on 2019-05-31 by the reprex package (v0.2.1)
Hi, I'm unable to install this software on my server. Error follows:
> install.packages('RSpectra', repos='http://cran.rstudio.com/')
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'http://cran.rstudio.com/src/contrib/RSpectra_0.12-0.tar.gz'
Content type 'application/x-gzip' length 426762 bytes (416 KB)
==================================================
downloaded 416 KB
* installing *source* package ‘RSpectra’ ...
** package ‘RSpectra’ successfully unpacked and MD5 sums checked
** libs
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/lib/R/site-library/RcppEigen/include" -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c eigs_gen.cpp -o eigs_gen.o
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
/usr/lib/R/etc/Makeconf:168: recipe for target 'eigs_gen.o' failed
make: *** [eigs_gen.o] Error 4
ERROR: compilation failed for package ‘RSpectra’
* removing ‘/usr/local/lib/R/site-library/RSpectra’
The downloaded source packages are in
‘/tmp/RtmplxeMJ3/downloaded_packages’
Warning message:
In install.packages("RSpectra", repos = "http://cran.rstudio.com/") :
installation of package ‘RSpectra’ had non-zero exit status
Session Info:
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rtweet_0.6.0 forcats_0.2.0 stringr_1.2.0 dplyr_0.7.4 purrr_0.2.4
[6] readr_1.1.1 tidyr_0.8.0 tibble_1.4.2 ggplot2_2.2.1 tidyverse_1.2.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.15 cellranger_1.1.0 pillar_1.2.0 compiler_3.4.3 plyr_1.8.4
[6] bindr_0.1 tools_3.4.3 lubridate_1.7.2 jsonlite_1.5 nlme_3.1-124
[11] gtable_0.2.0 lattice_0.20-33 pkgconfig_2.0.1 rlang_0.2.0 psych_1.7.5
[16] cli_1.0.0 rstudioapi_0.7 curl_2.6 yaml_2.1.14 parallel_3.4.3
[21] haven_1.1.1 bindrcpp_0.2 xml2_1.1.1 httr_1.3.1 hms_0.3
[26] grid_3.4.3 glue_1.2.0 R6_2.2.2 readxl_1.0.0 foreign_0.8-66
[31] modelr_0.1.1 reshape2_1.4.2 magrittr_1.5 scales_0.4.1 rvest_0.3.2
[36] assertthat_0.2.0 mnormt_1.5-5 colorspace_1.3-2 stringi_1.1.5 openssl_0.9.6
[41] lazyeval_0.2.0 munsell_0.4.3 broom_0.4.2 crayon_1.3.4
Running into the following during some simulations, possibly related to #1.
library(RSpectra)
M <- new(
"dgCMatrix",
i = c(
31L, 33L, 2L, 5L, 23L, 21L, 14L, 34L,
20L, 39L, 11L, 1L, 22L, 14L, 19L, 7L, 6L, 2L, 20L, 24L, 28L,
13L, 27L, 8L, 39L, 9L, 4L, 12L, 16L, 35L, 24L, 32L, 2L, 15L,
3L, 36L, 18L, 2L, 38L, 26L, 29L, 10L, 3L, 2L, 0L, 30L, 37L, 25L,
12L, 17L
), p = c(
0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L,
24L, 25L, 26L, 27L, 30L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L,
40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L
), Dim = c(
40L,
47L
), Dimnames = list(NULL, NULL), x = c(
1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1
), factors = list()
)
dim(M)
#> [1] 40 47
sum(M)
#> [1] 50
image(as.matrix(M))
M
#> 40 x 47 sparse Matrix of class "dgCMatrix"
#>
#> [1,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [2,] . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . .
#> [3,] . . 1 . . . . . . . . . . . . . . 1 . . . . . . . . . . . 1 . . . . 1 . .
#> [4,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . .
#> [5,] . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . .
#> [6,] . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [7,] . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . .
#> [8,] . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . .
#> [9,] . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . .
#> [10,] . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . .
#> [11,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [12,] . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [13,] . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . .
#> [14,] . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . .
#> [15,] . . . . . . 1 . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . .
#> [16,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . .
#> [17,] . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . .
#> [18,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [19,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . .
#> [20,] . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . .
#> [21,] . . . . . . . . 1 . . . . . . . . . 1 . . . . . . . . . . . . . . . . . .
#> [22,] . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [23,] . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . .
#> [24,] . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [25,] . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . 1 . . . . . . . .
#> [26,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [27,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
#> [28,] . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . .
#> [29,] . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . .
#> [30,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [31,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [32,] 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [33,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . .
#> [34,] . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [35,] . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [36,] . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . .
#> [37,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . .
#> [38,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [39,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 .
#> [40,] . . . . . . . . . 1 . . . . . . . . . . . . . . 1 . . . . . . . . . . . .
#>
#> [1,] . . . . 1 . . . . .
#> [2,] . . . . . . . . . .
#> [3,] . . . 1 . . . . . .
#> [4,] . . 1 . . . . . . .
#> [5,] . . . . . . . . . .
#> [6,] . . . . . . . . . .
#> [7,] . . . . . . . . . .
#> [8,] . . . . . . . . . .
#> [9,] . . . . . . . . . .
#> [10,] . . . . . . . . . .
#> [11,] . 1 . . . . . . . .
#> [12,] . . . . . . . . . .
#> [13,] . . . . . . . . 1 .
#> [14,] . . . . . . . . . .
#> [15,] . . . . . . . . . .
#> [16,] . . . . . . . . . .
#> [17,] . . . . . . . . . .
#> [18,] . . . . . . . . . 1
#> [19,] . . . . . . . . . .
#> [20,] . . . . . . . . . .
#> [21,] . . . . . . . . . .
#> [22,] . . . . . . . . . .
#> [23,] . . . . . . . . . .
#> [24,] . . . . . . . . . .
#> [25,] . . . . . . . . . .
#> [26,] . . . . . . . 1 . .
#> [27,] . . . . . . . . . .
#> [28,] . . . . . . . . . .
#> [29,] . . . . . . . . . .
#> [30,] 1 . . . . . . . . .
#> [31,] . . . . . 1 . . . .
#> [32,] . . . . . . . . . .
#> [33,] . . . . . . . . . .
#> [34,] . . . . . . . . . .
#> [35,] . . . . . . . . . .
#> [36,] . . . . . . . . . .
#> [37,] . . . . . . . . . .
#> [38,] . . . . . . 1 . . .
#> [39,] . . . . . . . . . .
#> [40,] . . . . . . . . . .
svds(M, 2)
#> Error in fun(A, k, nu, nv, opts, mattype = "dgCMatrix"): TridiagEigen: eigen decomposition failed
Created on 2021-10-11 by the reprex package (v2.0.1.9000)
I get the above message when running
out <- svds(X,k = 2)
on a very large matrix X. It is hard to tell from that message what the problem is, because I do not get any more details. What are the possible issues? Should I try to increase the number of Lanczos basis vectors, the convergence tolerance and/or the maximum number of iterations?
Thanks,
Peter
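For reference, the three settings mentioned in the question are passed through the opts list. The values below are arbitrary and only show the mechanics on a small random matrix; whether changing them avoids the TridiagEigen failure on a particular input is not established here.

```r
set.seed(42)
X <- matrix(rnorm(500 * 40), 500, 40)

if (requireNamespace("RSpectra", quietly = TRUE)) {
  out <- RSpectra::svds(
    X, k = 2,
    opts = list(
      ncv    = 30,    # number of Lanczos basis vectors, k < ncv <= min(dim(X))
      tol    = 1e-8,  # convergence tolerance
      maxitr = 5000   # maximum number of iterations
    )
  )
  print(out$d)     # computed singular values
  print(out$nops)  # number of matrix-vector products used
}
```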
I stumbled across the following:
eigs and eigs_sym have trouble converging with the option which = "SA" and other values to get the smallest eigenvalues. After reading #2 I made the observation that eigs_sym converges much faster and more reliably if sigma = -1e-5 is set instead of which.
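The observation can be tried on a small test problem. The sketch below assumes a symmetric positive definite matrix (a 1-D discrete Laplacian chosen for illustration), so that a sigma slightly below zero sits just under the smallest eigenvalues. With sigma set, the solver works in shift-and-invert mode and which refers to the eigenvalues of the shifted and inverted operator, so which = "LM" then selects the eigenvalues of A closest to sigma.

```r
# 1-D discrete Laplacian: symmetric, with all eigenvalues in (0, 4).
n <- 200
A <- diag(2, n)
A[cbind(1:(n - 1), 2:n)] <- -1
A[cbind(2:n, 1:(n - 1))] <- -1

k <- 3
# Reference values from a full dense decomposition.
smallest_ref <- sort(eigen(A, symmetric = TRUE, only.values = TRUE)$values)[1:k]

if (requireNamespace("RSpectra", quietly = TRUE)) {
  # Direct request for the smallest algebraic eigenvalues.
  direct <- RSpectra::eigs_sym(A, k, which = "SA")
  # Shift-and-invert around a point just below the spectrum.
  shifted <- RSpectra::eigs_sym(A, k, which = "LM", sigma = -1e-5)
  print(c(direct = direct$niter, shifted = shifted$niter))
  print(all.equal(sort(shifted$values), smallest_ref))
}
```

Shift-and-invert needs a factorization of A - sigma * I, so it trades extra memory and setup time for faster convergence.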
I'm trying to play with some algorithms in the graph embedding literature (e.g. locality preserving projections) which need to solve the generalized eigenvalue problem. The Matlab 'eigs' function has this option, but I notice it's missing in RSpectra.
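In the meantime, a symmetric generalized problem A x = lambda B x with positive definite B can be reduced to a standard symmetric one through a Cholesky factor of B. The sketch below uses base R only and dense random matrices made up for illustration; eigen() could be swapped for eigs_sym() on the transformed matrix C when only a few eigenpairs are wanted. Forming C densely is an assumption that will not suit large sparse problems.

```r
set.seed(7)
n <- 6
A <- crossprod(matrix(rnorm(n * n), n))            # symmetric
B <- crossprod(matrix(rnorm(n * n), n)) + diag(n)  # symmetric positive definite

# B = t(R) %*% R with R upper triangular.
R <- chol(B)
Rinv <- backsolve(R, diag(n))

# With y = R x, the problem becomes C y = lambda y, where C = R^{-T} A R^{-1}.
C <- t(Rinv) %*% A %*% Rinv
C <- (C + t(C)) / 2  # remove rounding asymmetry

e <- eigen(C, symmetric = TRUE)
lambda <- e$values
Xgen <- Rinv %*% e$vectors  # map back: x = R^{-1} y

# Residual of the generalized problem, close to zero.
residual <- max(abs(A %*% Xgen - B %*% Xgen %*% diag(lambda)))
print(residual)
```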
Basically, I have a matrix of thousands of samples and, say, 100,000 variables. Using your package makes it easy and fast to get the first 10 singular values/vectors.
I have an algorithm that iteratively (3 or 4 times) discards 1000 variables, and I have to recompute the SVD each time.
In other words, I compute very similar singular vectors 4 times, and I would like to know whether it would be possible to use "warm starts" for the second, third and fourth iterations to make them much faster.
Thanks for your work.
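To illustrate what a warm start would buy, here is a plain subspace iteration in base R. It is not the algorithm RSpectra uses, and subspace_iter and random_start are hypothetical helpers; the low-rank-plus-noise data is synthetic. After dropping variables, the iteration is started once from a random block and once from the previous right singular vectors restricted to the kept variables.

```r
# Subspace iteration for the top-k right singular vectors of X.
subspace_iter <- function(X, V0, tol = 1e-8, maxit = 1000) {
  V <- qr.Q(qr(V0))
  for (it in seq_len(maxit)) {
    V_new <- qr.Q(qr(crossprod(X, X %*% V)))
    # Part of the new basis that lies outside the previous subspace.
    gap <- norm(V_new - V %*% crossprod(V, V_new), "F")
    V <- V_new
    if (gap < tol) break
  }
  list(v = V, iterations = it)
}

random_start <- function(p, k) matrix(rnorm(p * k), p, k)

set.seed(10)
n <- 300; p <- 500; k <- 5
U <- qr.Q(qr(matrix(rnorm(n * k), n, k)))
W <- qr.Q(qr(matrix(rnorm(p * k), p, k)))
X <- U %*% diag(c(500, 400, 300, 200, 100)) %*% t(W) +
  matrix(rnorm(n * p), n, p)

first <- subspace_iter(X, random_start(p, k))

keep <- 51:p          # discard the first 50 variables
X2 <- X[, keep]
cold <- subspace_iter(X2, random_start(length(keep), k))
warm <- subspace_iter(X2, first$v[keep, , drop = FALSE])
print(c(cold = cold$iterations, warm = warm$iterations))
```

Both runs should end in the same subspace; the printed iteration counts show how much the starting point matters for this particular toy problem.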
Hello,
When trying to install "RSpectra" after moving from R-3.4.1 to R-3.5.1, the following error occurs:
/usr/lib64/R/library/RcppEigen/include/Eigen/src/SparseCore/SparseSelfAdjointView.h:517:6: internal compiler error: in predicate_mem_writes, at tree-if-conv.c:2252
void permute_symm_to_symm(const MatrixType& mat, SparseMatrix<typename MatrixType::Scalar,DstOrder,typename MatrixType::StorageIndex>& _dest, const typename MatrixType::StorageIndex* perm)
I am using gcc-7.3.0 under GNU/Linux, and this is my sessionInfo():
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Gentoo/Linux
Matrix products: default
BLAS: /usr/lib64/libopenblas_openmp_haswellp-r0.2.20.so
LAPACK: /usr/lib64/libreflapack.so.3.7.0
locale:
[1] LC_CTYPE=ca_AD.UTF8 LC_NUMERIC=C LC_TIME=ca_AD.UTF8
[4] LC_COLLATE=C LC_MONETARY=ca_AD.UTF8 LC_MESSAGES=ca_AD.UTF8
[7] LC_PAPER=ca_AD.UTF8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=ca_AD.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] colorout_1.2-0
loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1
Is it the compiler, the R version, or the BLAS/LAPACK libraries? Any clue as to where to go from here?
Thank you.
Hi!
RSpectra seems really cool. However, there doesn’t seem to be a way to make it respect set.seed() (and therefore deterministic): eigs.init() without an initialization vector uses SimpleRandom with seed 0.
RSpectra should either
generate runif(n) - .5 and use it in eigs.init(), or
seed SimpleRandom with sample(.Machine$integer.max, 1L), or
pass the seed (globalenv()$.Random.seed exists as int vector if the RNG has been used) to SimpleRandom.
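The first option can be emulated from the R side by drawing the starting vector with R's RNG and handing it to the solver. This is a sketch under the assumption that the installed RSpectra version honours an initvec entry in opts; make_initvec is an illustrative helper.

```r
n <- 100
set.seed(123)
A <- crossprod(matrix(rnorm(n * n), n))

# Draw the starting vector from R's RNG so that set.seed() controls it.
make_initvec <- function(n) runif(n) - 0.5

set.seed(1); v1 <- make_initvec(n)
set.seed(1); v2 <- make_initvec(n)
print(identical(v1, v2))

if (requireNamespace("RSpectra", quietly = TRUE)) {
  # Assumption: opts$initvec is used as the initial vector of the iteration.
  e <- RSpectra::eigs_sym(A, k = 3, opts = list(initvec = v1))
  print(e$values)
}
```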
Hi! I was trying to install Quanteda and it complained that RSpectra is needed. When trying to install, these errors popped up:
`../inst/include/RMatOp/RealShift_matrix.h:31:60: required from here
/usr/lib/R/site-library/RcppEigen/include/Eigen/src/Core/CoreEvaluators.h:960:8: warning: ignoring attributes on template argument ‘Eigen::internal::packet_traits::type {aka __vector(2) double}’ [-Wignored-attributes]
g++ -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o RSpectra.so eigs_gen.o eigs_sym.o matops.o register_routines.o svds.o -lblas -lgfortran -lm -lquadmath -L/usr/lib/R/lib -lR
/usr/bin/x86_64-linux-gnu-ld: cannot find -lgfortran
collect2: error: ld returned 1 exit status
/usr/share/R/share/make/shlib.mk:6: recipe for target 'RSpectra.so' failed
make: *** [RSpectra.so] Error 1
ERROR: compilation failed for package ‘RSpectra’
The downloaded source packages are in
‘/tmp/RtmpI1Gl2k/downloaded_packages’`
May I know how to resolve this?
hi @yixuan I just installed RSpectra from the source code on CRAN, and I see the following warnings from gcc-7.5.0. I also get these warnings using gcc-8.4.0.
RSpectra-warnings.txt
There are a lot of warnings, here are just a few. They seem to be coming from RcppEigen so you may consider asking the RcppEigen (or Eigen) devs to fix these.
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/SSE/Complex.h:232:1: note: in expansion of macro ‘EIGEN_MAKE_CONJ_HELPER_CPLX_REAL’
EIGEN_MAKE_CONJ_HELPER_CPLX_REAL(Packet2cf,Packet4f)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/Default/ConjHelper.h:22:70: warning: ignoring attributes on template argument ‘Eigen::internal::Packet4f {aka __vector(4) float}’ [-Wignored-attributes]
template<> struct conj_helper<PACKET_CPLX, PACKET_REAL, false,false> { \
^
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/SSE/Complex.h:232:1: note: in expansion of macro ‘EIGEN_MAKE_CONJ_HELPER_CPLX_REAL’
EIGEN_MAKE_CONJ_HELPER_CPLX_REAL(Packet2cf,Packet4f)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/Default/ConjHelper.h:15:70: warning: ignoring attributes on template argument ‘Eigen::internal::Packet2d {aka __vector(2) double}’ [-Wignored-attributes]
template<> struct conj_helper<PACKET_REAL, PACKET_CPLX, false,false> { \
^
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/SSE/Complex.h:417:1: note: in expansion of macro ‘EIGEN_MAKE_CONJ_HELPER_CPLX_REAL’
EIGEN_MAKE_CONJ_HELPER_CPLX_REAL(Packet1cd,Packet2d)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/Default/ConjHelper.h:22:70: warning: ignoring attributes on template argument ‘Eigen::internal::Packet2d {aka __vector(2) double}’ [-Wignored-attributes]
template<> struct conj_helper<PACKET_CPLX, PACKET_REAL, false,false> { \
^
to be clear, RSpectra does successfully install, but typically it is good to edit your code to suppress such warnings.
When we use RSpectra::svds() with opts = list(center = TRUE, scale = TRUE), the singular values do not match the values returned by base::svd() or irlba::irlba().
Please see the full example below for details.
library(Matrix)
library(RSpectra)
library(irlba)
For the sake of reproducibility, let’s use the iris data.
mat <- as.matrix(iris[,1:4])
n_pcs <- 3
Columns are 4 features and rows are 150 flowers.
dim(mat)
#> [1] 150 4
head(mat)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,] 5.1 3.5 1.4 0.2
#> [2,] 4.9 3.0 1.4 0.2
#> [3,] 4.7 3.2 1.3 0.2
#> [4,] 4.6 3.1 1.5 0.2
#> [5,] 5.0 3.6 1.4 0.2
#> [6,] 5.4 3.9 1.7 0.4
Let’s run SVD without scaling the features.
mat_svd <- base::svd(x = mat)
mat_svds <- RSpectra::svds(
A = mat,
k = n_pcs,
opts = list(center = FALSE, scale = FALSE)
)
mat_irlba <- irlba::irlba(
A = mat,
nv = n_pcs,
center = FALSE,
scale = FALSE
)
#> Warning in irlba::irlba(A = mat, nv = n_pcs, center = FALSE, scale = FALSE):
#> You're computing too large a percentage of total singular values, use a standard
#> svd instead.
All methods return the same values for u
mat_svd$u[1:5,1:n_pcs]
#> [,1] [,2] [,3]
#> [1,] -0.06161685 0.1296114 0.002138597
#> [2,] -0.05807094 0.1110198 0.070672387
#> [3,] -0.05676305 0.1179665 0.004342549
#> [4,] -0.05665344 0.1053081 0.005924672
#> [5,] -0.06123020 0.1310898 -0.031881095
mat_svds$u[1:5,]
#> [,1] [,2] [,3]
#> [1,] -0.06161685 0.1296114 0.002138597
#> [2,] -0.05807094 0.1110198 0.070672387
#> [3,] -0.05676305 0.1179665 0.004342549
#> [4,] -0.05665344 0.1053081 0.005924672
#> [5,] -0.06123020 0.1310898 -0.031881095
mat_irlba$u[1:5,]
#> [,1] [,2] [,3]
#> [1,] -0.06161685 0.1296114 0.002138597
#> [2,] -0.05807094 0.1110198 0.070672387
#> [3,] -0.05676305 0.1179665 0.004342549
#> [4,] -0.05665344 0.1053081 0.005924672
#> [5,] -0.06123020 0.1310898 -0.031881095
All methods return the same values for v
mat_svd$v[,1:3]
#> [,1] [,2] [,3]
#> [1,] -0.7511082 0.2841749 0.50215472
#> [2,] -0.3800862 0.5467445 -0.67524332
#> [3,] -0.5130089 -0.7086646 -0.05916621
#> [4,] -0.1679075 -0.3436708 -0.53701625
mat_svds$v
#> [,1] [,2] [,3]
#> [1,] -0.7511082 0.2841749 0.50215472
#> [2,] -0.3800862 0.5467445 -0.67524332
#> [3,] -0.5130089 -0.7086646 -0.05916621
#> [4,] -0.1679075 -0.3436708 -0.53701625
mat_irlba$v
#> [,1] [,2] [,3]
#> [1,] -0.7511082 0.2841749 0.50215472
#> [2,] -0.3800862 0.5467445 -0.67524332
#> [3,] -0.5130089 -0.7086646 -0.05916621
#> [4,] -0.1679075 -0.3436708 -0.53701625
All methods return the same values for d
mat_svd$d[1:n_pcs]
#> [1] 95.959914 17.761034 3.460931
mat_svds$d
#> [1] 95.959914 17.761034 3.460931
mat_irlba$d
#> [1] 95.959914 17.761034 3.460931
Let’s try again, but this time let’s center and scale each feature.
mat_scaled <- scale(mat)
head(mat_scaled)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,] -0.8976739 1.01560199 -1.335752 -1.311052
#> [2,] -1.1392005 -0.13153881 -1.335752 -1.311052
#> [3,] -1.3807271 0.32731751 -1.392399 -1.311052
#> [4,] -1.5014904 0.09788935 -1.279104 -1.311052
#> [5,] -1.0184372 1.24503015 -1.335752 -1.311052
#> [6,] -0.5353840 1.93331463 -1.165809 -1.048667
After scaling, the means should be equal to zero:
colMeans(mat)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 5.843333 3.057333 3.758000 1.199333
colMeans(mat_scaled)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> -4.480675e-16 2.035409e-16 -2.844947e-17 -3.714621e-17
And the standard deviation should be equal to one:
apply(mat, 2, sd)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 0.8280661 0.4358663 1.7652982 0.7622377
apply(mat_scaled, 2, sd)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1 1 1 1
Ok, let’s try SVD again:
mat_svd_scaled <- base::svd(x = mat_scaled)
mat_svds_scaled <- RSpectra::svds(
A = mat,
k = n_pcs,
opts = list(center = TRUE, scale = TRUE)
)
mat_irlba_scaled <- irlba::irlba(
A = mat,
nv = n_pcs,
center = colMeans(mat),
scale = apply(mat, 2, sd)
)
#> Warning in irlba::irlba(A = mat, nv = n_pcs, center = colMeans(mat), scale =
#> apply(mat, : You're computing too large a percentage of total singular values,
#> use a standard svd instead.
All methods return the same values for u
mat_svd_scaled$u[1:5,1:n_pcs]
#> [,1] [,2] [,3]
#> [1,] -0.10823953 -0.04099580 0.027218646
#> [2,] -0.09945776 0.05757315 0.050003401
#> [3,] -0.11299630 0.02920003 -0.009420891
#> [4,] -0.10989710 0.05101939 -0.019457133
#> [5,] -0.11422046 -0.05524180 -0.003354363
mat_svds_scaled$u[1:5,]
#> [,1] [,2] [,3]
#> [1,] 0.10823953 0.04099580 -0.027218646
#> [2,] 0.09945776 -0.05757315 -0.050003401
#> [3,] 0.11299630 -0.02920003 0.009420891
#> [4,] 0.10989710 -0.05101939 0.019457133
#> [5,] 0.11422046 0.05524180 0.003354363
mat_irlba_scaled$u[1:5,]
#> [,1] [,2] [,3]
#> [1,] -0.10823953 -0.04099580 0.027218646
#> [2,] -0.09945776 0.05757315 0.050003401
#> [3,] -0.11299630 0.02920003 -0.009420891
#> [4,] -0.10989710 0.05101939 -0.019457133
#> [5,] -0.11422046 -0.05524180 -0.003354363
All methods return the same values for v
mat_svd_scaled$v[,1:3]
#> [,1] [,2] [,3]
#> [1,] 0.5210659 -0.37741762 0.7195664
#> [2,] -0.2693474 -0.92329566 -0.2443818
#> [3,] 0.5804131 -0.02449161 -0.1421264
#> [4,] 0.5648565 -0.06694199 -0.6342727
mat_svds_scaled$v
#> [,1] [,2] [,3]
#> [1,] -0.5210659 0.37741762 -0.7195664
#> [2,] 0.2693474 0.92329566 0.2443818
#> [3,] -0.5804131 0.02449161 0.1421264
#> [4,] -0.5648565 0.06694199 0.6342727
mat_irlba_scaled$v
#> [,1] [,2] [,3]
#> [1,] 0.5210659 -0.37741762 0.7195664
#> [2,] -0.2693474 -0.92329566 -0.2443818
#> [3,] 0.5804131 -0.02449161 -0.1421264
#> [4,] 0.5648565 -0.06694199 -0.6342727
Uh oh! It looks like svds() is not returning the correct values for d
mat_svd_scaled$d[1:n_pcs]
#> [1] 20.853205 11.670070 4.676192
mat_svds_scaled$d
#> [1] 1.7083611 0.9560494 0.3830886
mat_irlba_scaled$d
#> [1] 20.853205 11.670070 4.676192
Let’s try RSpectra again, but we’ll do the scaling manually.
mat_svds_man_scaled <- RSpectra::svds(
A = mat_scaled,
k = n_pcs,
opts = list(center = FALSE, scale = FALSE)
)
mat_svds_man_scaled$d
#> [1] 20.853205 11.670070 4.676192
Manual scaling eliminates the error in the svds() output. This implies that there might be a bug inside the svds() function.
The wrong values are off by a multiplicative constant factor,
but the value of the factor is dependent on the input data.
For the iris data, the factor seems to be 12.20656, which is close
to the number of matrix multiplication operations (11)
mat_svds_man_scaled$d / mat_svds_scaled$d
#> [1] 12.20656 12.20656 12.20656
mat_svds_scaled$nops
#> [1] 11
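The factor 12.20656 also equals sqrt(150 - 1), the square root of the number of rows minus one. One possible explanation, stated here as an assumption about what scale = TRUE does, is that each centered column is divided by its Euclidean norm rather than by its standard deviation; the two differ by exactly sqrt(n - 1). The check below uses base R only.

```r
mat <- as.matrix(iris[, 1:4])
n <- nrow(mat)

centered <- sweep(mat, 2, colMeans(mat))

# Scale by the standard deviation, as scale() does ...
by_sd <- sweep(centered, 2, apply(mat, 2, sd), "/")
# ... versus scaling each centered column to unit Euclidean norm.
by_norm <- sweep(centered, 2, sqrt(colSums(centered^2)), "/")

ratio <- svd(by_sd)$d / svd(by_norm)$d
print(ratio)
print(sqrt(n - 1))  # 12.20656
print(svd(by_norm)$d[1:3])
```

Under this reading the singular vectors agree up to sign and only d is rescaled, which is consistent with the output shown above.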
Created on 2022-01-25 by the reprex package (v2.0.1)
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.3 (2020-10-10)
#> os macOS Catalina 10.15.7
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2022-01-25
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> backports 1.2.1 2020-12-09 [1] CRAN (R 4.0.2)
#> cli 3.0.1 2021-07-17 [1] CRAN (R 4.0.2)
#> crayon 1.4.1 2021-02-08 [1] CRAN (R 4.0.2)
#> digest 0.6.28 2021-09-23 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.0.2)
#> evaluate 0.14 2019-05-28 [2] CRAN (R 4.0.1)
#> fansi 0.5.0 2021-05-25 [1] CRAN (R 4.0.2)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.0.2)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [2] CRAN (R 4.0.2)
#> highr 0.9 2021-04-16 [1] CRAN (R 4.0.2)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.0.2)
#> irlba * 2.3.3 2019-02-05 [2] CRAN (R 4.0.2)
#> knitr 1.36 2021-09-29 [1] CRAN (R 4.0.2)
#> lattice 0.20-45 2021-09-22 [1] CRAN (R 4.0.2)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.0.2)
#> magrittr 2.0.1.9000 2020-12-15 [1] Github (tidyverse/magrittr@bb1c86a)
#> Matrix * 1.3-4 2021-06-01 [1] CRAN (R 4.0.2)
#> pillar 1.6.3 2021-09-26 [1] CRAN (R 4.0.2)
#> pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.0.2)
#> purrr 0.3.4 2020-04-17 [2] CRAN (R 4.0.2)
#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.0.2)
#> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.0.2)
#> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.0.2)
#> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.0.2)
#> Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.0.2)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.0.2)
#> rlang 0.4.11 2021-04-30 [1] CRAN (R 4.0.2)
#> rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.0.2)
#> RSpectra * 0.16-0 2019-12-01 [2] CRAN (R 4.0.2)
#> rstudioapi 0.13 2020-11-12 [2] CRAN (R 4.0.2)
#> sessioninfo 1.1.1 2018-11-05 [2] CRAN (R 4.0.2)
#> stringi 1.7.5 2021-10-04 [1] CRAN (R 4.0.2)
#> stringr 1.4.0 2019-02-10 [2] CRAN (R 4.0.2)
#> styler 1.6.2 2021-09-23 [1] CRAN (R 4.0.2)
#> tibble 3.1.5 2021-09-30 [1] CRAN (R 4.0.2)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.0.2)
#> vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.0.2)
#> withr 2.4.2 2021-04-18 [1] CRAN (R 4.0.2)
#> xfun 0.26 2021-09-14 [1] CRAN (R 4.0.2)
#> yaml 2.2.1 2020-02-01 [2] CRAN (R 4.0.2)
#>
#> [1] /Users/kamil/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
👋! I'm getting the installation error below for
sysname
"Linux"
release
"4.15.0-45-generic"
version
"#48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019"
Installing package into ‘/home/maelle/R/x86_64-pc-linux-gnu-library/3.4’
(as ‘lib’ is unspecified)
* installing *source* package ‘RSpectra’ ...
** libs
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include -I"/home/maelle/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include" -I"/home/maelle/R/x86_64-pc-linux-gnu-library/3.4/RcppEigen/include" -fpic -g -O2 -fdebug-prefix-map=/build/r-base-AitvI6/r-base-3.4.4=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c eigs_gen.cpp -o eigs_gen.o
eigs_gen.cpp:1:10: fatal error: RcppEigen.h: No such file or directory
#include <RcppEigen.h>
^~~~~~~~~~~~~
compilation terminated.
/usr/lib/R/etc/Makeconf:168: recipe for target 'eigs_gen.o' failed
make: *** [eigs_gen.o] Error 1
ERROR: compilation failed for package ‘RSpectra’
* removing ‘/home/maelle/R/x86_64-pc-linux-gnu-library/3.4/RSpectra’
Error in i.p(...) :
(converted from warning) installation of package ‘/tmp/RtmpHyJuEO/fileab463ae57e/RSpectra_0.13-1.tar.gz’ had non-zero exit status
I have a relatively large symmetric eigenvalue problem (ca. 12000x12000) and want the first couple of eigenvectors and values:
e <- eigs_sym(x, 6, opts = list(retvec = TRUE))
e$values
### [1] 68213.67 31349.46 24366.68 19671.44 17046.18 -102290.45
e <- eigs_sym(x, 5, opts = list(retvec = TRUE))
e$values
### [1] 68213.67 31349.46 24366.68 19671.44 -102290.45
Am I right that the entire procedure is stable, but the last eigenvalue is garbage? e$nconv is 5 and 6 respectively.
Is the correct solution to compute one more eigenvector/value than necessary and throw the last one away?
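A point worth checking first: the default selection rule in eigs_sym is which = "LM", largest magnitude, so a large negative eigenvalue is selected ahead of smaller positive ones and then appears last when the values are listed in decreasing order. The base R sketch below reproduces that pattern on a made-up matrix with known eigenvalues; requesting which = "LA" (largest algebraic) would be the way to ask for the top of the spectrum instead.

```r
# Symmetric matrix with known eigenvalues, one of them large and negative.
set.seed(3)
lambda <- c(68, 31, 24, 19, 17, 5, 4, 3, 2, -102)
Q <- qr.Q(qr(matrix(rnorm(100), 10)))
x <- Q %*% diag(lambda) %*% t(Q)
x <- (x + t(x)) / 2  # remove rounding asymmetry

vals <- eigen(x, symmetric = TRUE, only.values = TRUE)$values
k <- 3

# Largest magnitude: choose by abs(), then list in decreasing order.
idx <- order(abs(vals), decreasing = TRUE)[1:k]
largest_magnitude <- sort(vals[idx], decreasing = TRUE)
# Largest algebraic: the top of the sorted spectrum.
largest_algebraic <- sort(vals, decreasing = TRUE)[1:k]

print(largest_magnitude)
print(largest_algebraic)

if (requireNamespace("RSpectra", quietly = TRUE)) {
  print(RSpectra::eigs_sym(x, k, which = "LM")$values)
  print(RSpectra::eigs_sym(x, k, which = "LA")$values)
}
```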
Hi yixuan,
I noticed that all of the machine's cores are used when I run svds on my large matrix. Under some conditions, we may want to control the number of cores used by a program. How can I do that?
Also, I am writing a program using Spectra (C++). There seems to be no way to run it in parallel, but RSpectra can take advantage of multiple cores. What's the difference?
Looking forward to your reply.
Thanks in advance.
Dear Yixuan Qiu,
In the beginning of 2018, I installed the RSpectra 0.12-0 on my PC for using the SVDS function on mean-centered Mid Infrared spectra (X) to calculate the orthogonal signal (Wold et al., 1998).
e.g.: s <- RSpectra::svds(X,1,1,1)
In the meanwhile, I bought a new laptop and I was running the same code on the same dataset, giving totally different results. After a few hours of debugging, I found that the different results were caused by the version of the RSpectra package (RSpectra 0.13-1 was installed on the new laptop). After removing the RSpectra 0.13-1 and installing the RSpectra 0.12-0 on the laptop, I again obtained the same results as the ones I got before on the PC... But now I am wondering which version to use...
What has changed in the svds function in version 0.13-1 compared to 0.12-0? And which version is most suitable for my application?
BR,
Ben Aernouts
Reference: Wold, S., H. Antti, F. Lindgren, and J. Öhman. 1998. Orthogonal signal correction of near-infrared spectra. Chemom. Intell. Lab. Syst. 44:175–185. doi:10.1016/S0169-7439(98)00109-9.