yixuan / rspectra

R Interface to the Spectra Library for Large Scale Eigenvalue and SVD Problems

Home Page: http://cran.r-project.org/package=RSpectra

R 16.44% C 2.30% C++ 81.26%

eigenvalues spectra svd


rspectra's Issues

Ubuntu 16.04 Installation

I'm getting this error when trying to install on Ubuntu 16.04.

g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
/usr/lib/R/etc/Makeconf:168: recipe for target 'eigs_gen.o' failed
make: *** [eigs_gen.o] Error 4
ERROR: compilation failed for package ‘RSpectra’
* removing ‘/usr/local/lib/R/site-library/RSpectra’
Error in i.p(...) : 
  (converted from warning) installation of package ‘/tmp/RtmpWtVm0n/filee627acba27e/RSpectra_0.13-1.tar.gz’ had non-zero exit status
Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p
Execution halted

Is it possible to implement the function interface for svds? Or is there a non-trivial way to use the eigs function interface to calculate the SVD with a function? I believe one would need to provide two functions: A %*% x and t(A) %*% x. At least that is what the PROPACK library has.
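A minimal sketch of the second idea, in base R only. Given the two products A %*% x and t(A) %*% x, the composed operator x -> t(A) %*% (A %*% x) is symmetric, and its eigenvalues are the squared singular values of A. The names Ax, Atx and AtA_op are hypothetical. The operator is materialised as a matrix here only so the result can be checked against base::svd(); an iterative eigensolver with a function interface would instead call AtA_op one vector at a time.

```r
set.seed(1)
A <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)

# The two user-supplied products (hypothetical names).
Ax  <- function(x) A %*% x
Atx <- function(y) crossprod(A, y)     # same result as t(A) %*% y

# Symmetric operator whose eigenvalues are the squared singular values of A.
AtA_op <- function(x) as.vector(Atx(Ax(x)))

# For checking only: apply the operator to each column of the identity matrix.
n <- ncol(A)
I_n <- diag(n)
AtA <- sapply(seq_len(n), function(j) AtA_op(I_n[, j]))

ev <- eigen(AtA, symmetric = TRUE)
d_from_op <- sqrt(ev$values)                             # singular values, largest first
v_from_op <- ev$vectors                                  # right singular vectors (up to sign)
u_from_op <- sweep(A %*% v_from_op, 2, d_from_op, "/")   # left singular vectors
```

Squaring the problem also squares the condition number, so very small singular values are recovered less accurately this way than with a solver that works on A directly.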

Support dsCMatrix/dsRMatrix

The dsCMatrix and dsRMatrix classes are the only useful ones you don’t support yet. They’re like dgCMatrix/dgRMatrix except that only one triangle is stored and a uplo ∈ {'U', 'L'} slot exists to signify which.

The dgRMatrix docs say “These "..RMatrix" classes are currently still mostly unimplemented!”, but it still can’t hurt to implement a 2-line method for them.
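Until such methods exist, one possible workaround is to expand the symmetric storage to a general sparse matrix before calling the solver. This is a sketch that assumes the Matrix package is installed and that its coercion to the virtual class "generalMatrix" is available; it does not call RSpectra at all, and the small matrix is made up for illustration.

```r
library(Matrix)

# A small symmetric sparse matrix; Matrix() stores only one triangle for it.
S <- Matrix(c(2, 1, 0,
              1, 3, 4,
              0, 4, 5), nrow = 3, sparse = TRUE)
print(class(S))
print(S@uplo)          # which triangle is stored

# Expand to general (both triangles stored) compressed-column format.
G <- as(S, "generalMatrix")
print(class(G))
```

The expanded matrix stores roughly twice as many entries, which is the cost a native dsCMatrix method would avoid.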

PS: Thank you for the amazing package, I wish it had been there from the beginning!

Choose the number of components dynamically

Say I know the total variance of the matrix, and want to get 99% of that (let's call that T).

Instead of specifying k, would it be possible to implement a version of e.g. eigs_sym() that stops when sum(eigs$values^2) is larger than T?
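A user-level approximation of this request can be sketched by wrapping the solver and growing k until the target is reached. The helper choose_k and its solver argument are hypothetical; the default solver uses base::svd() so the sketch runs without RSpectra, and a truncated solver could be substituted. Note that this restarts the decomposition for each k, which is not the same as stopping inside the iteration as the issue asks.

```r
# Smallest k whose leading singular values explain `target` of the total
# variance. `solver(X, k)` must return the k largest singular values.
choose_k <- function(X, target = 0.99,
                     solver = function(X, k) svd(X, nu = 0, nv = 0)$d[seq_len(k)]) {
  total <- sum(X^2)                  # equals the sum of all squared singular values
  k_max <- min(dim(X))
  k <- 1
  repeat {
    d <- solver(X, k)
    if (sum(d^2) >= target * total || k >= k_max) {
      # Trim back to the smallest prefix that reaches the target.
      k_min <- which(cumsum(d^2) >= target * total)[1]
      if (is.na(k_min)) k_min <- k   # rounding kept us just under the target
      return(list(k = k_min, d = d[seq_len(k_min)]))
    }
    k <- min(2 * k, k_max)           # double k and try again
  }
}

# Singular values 10, 3, 1, 0.1: the first two carry 109 / 110.01 of the total.
res <- choose_k(diag(c(10, 3, 1, 0.1)), target = 0.99)
print(res$k)
```

For a symmetric eigenvalue problem the same loop applies with the eigenvalues (or their squares, as in the question) in place of d^2.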

Unstable selection of eigenvalues and eigenvectors?

The eigenvalues and eigenvectors selected by 'eigs_sym' seem to be unstable (note that d = 2 below).

I cross-checked the result with the "eigen" function from the base package. The results there are stable. (so it's probably not a data issue.)

I attached the data for your information. It's in binary format; one can use the "load" function to read the file. Thanks.
big_matrix.zip

Unable to install dev version

Sorry the reprex isn't more informative. Is anyone else having this issue?

devtools::install_github("yixuan/RSpectra")
#> Downloading GitHub repo yixuan/RSpectra@master
#> v  checking for file 'C:\Users\alex\AppData\Local\Temp\RtmpOiJbBt\remotes17ac66a9851\yixuan-RSpectra-1bd1337/DESCRIPTION' (404ms)
#> -  preparing 'RSpectra':
#> v  checking DESCRIPTION meta-information
#> -  cleaning src
#> -  checking for LF line-endings in source and make files and shell scripts (507ms)
#> -  checking for empty or unneeded directories
#> -  building 'RSpectra_0.15-0.tar.gz'
#>
#> Installing package into 'C:/Users/alex/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> Error in i.p(...): (converted from warning) installation of package 'C:/Users/alex/AppData/Local/Temp/RtmpOiJbBt/file17ac6ad46b7f/RSpectra_0.15-0.tar.gz' had non-zero exit status

Created on 2019-05-31 by the reprex package (v0.2.1)

Installation error Ubuntu 16.04

Hi, I'm unable to install this software on my server. Error follows:

> install.packages('RSpectra', repos='http://cran.rstudio.com/')
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'http://cran.rstudio.com/src/contrib/RSpectra_0.12-0.tar.gz'
Content type 'application/x-gzip' length 426762 bytes (416 KB)
==================================================
downloaded 416 KB

* installing *source* package ‘RSpectra’ ...
** package ‘RSpectra’ successfully unpacked and MD5 sums checked
** libs
g++  -I/usr/share/R/include -DNDEBUG -I../inst/include -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/lib/R/site-library/RcppEigen/include"    -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c eigs_gen.cpp -o eigs_gen.o
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
/usr/lib/R/etc/Makeconf:168: recipe for target 'eigs_gen.o' failed
make: *** [eigs_gen.o] Error 4
ERROR: compilation failed for package ‘RSpectra’
* removing ‘/usr/local/lib/R/site-library/RSpectra’

The downloaded source packages are in
        ‘/tmp/RtmplxeMJ3/downloaded_packages’
Warning message:
In install.packages("RSpectra", repos = "http://cran.rstudio.com/") :
  installation of package ‘RSpectra’ had non-zero exit status

Session Info:

R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rtweet_0.6.0    forcats_0.2.0   stringr_1.2.0   dplyr_0.7.4     purrr_0.2.4    
 [6] readr_1.1.1     tidyr_0.8.0     tibble_1.4.2    ggplot2_2.2.1   tidyverse_1.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15     cellranger_1.1.0 pillar_1.2.0     compiler_3.4.3   plyr_1.8.4      
 [6] bindr_0.1        tools_3.4.3      lubridate_1.7.2  jsonlite_1.5     nlme_3.1-124    
[11] gtable_0.2.0     lattice_0.20-33  pkgconfig_2.0.1  rlang_0.2.0      psych_1.7.5     
[16] cli_1.0.0        rstudioapi_0.7   curl_2.6         yaml_2.1.14      parallel_3.4.3  
[21] haven_1.1.1      bindrcpp_0.2     xml2_1.1.1       httr_1.3.1       hms_0.3         
[26] grid_3.4.3       glue_1.2.0       R6_2.2.2         readxl_1.0.0     foreign_0.8-66  
[31] modelr_0.1.1     reshape2_1.4.2   magrittr_1.5     scales_0.4.1     rvest_0.3.2     
[36] assertthat_0.2.0 mnormt_1.5-5     colorspace_1.3-2 stringi_1.1.5    openssl_0.9.6   
[41] lazyeval_0.2.0   munsell_0.4.3    broom_0.4.2      crayon_1.3.4

Error: TridiagEigen: eigen decomposition failed

Running into the following during some simulations, possibly related to #1.

library(RSpectra)

M <- new(
  "dgCMatrix",
  i = c(
    31L, 33L, 2L, 5L, 23L, 21L, 14L, 34L,
    20L, 39L, 11L, 1L, 22L, 14L, 19L, 7L, 6L, 2L, 20L, 24L, 28L,
    13L, 27L, 8L, 39L, 9L, 4L, 12L, 16L, 35L, 24L, 32L, 2L, 15L,
    3L, 36L, 18L, 2L, 38L, 26L, 29L, 10L, 3L, 2L, 0L, 30L, 37L, 25L,
    12L, 17L
  ), p = c(
    0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
    11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L,
    24L, 25L, 26L, 27L, 30L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L,
    40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L
  ), Dim = c(
    40L,
    47L
  ), Dimnames = list(NULL, NULL), x = c(
    1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1
  ), factors = list()
)

dim(M)
#> [1] 40 47
sum(M)
#> [1] 50
image(as.matrix(M))

M
#> 40 x 47 sparse Matrix of class "dgCMatrix"
#>                                                                                
#>  [1,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#>  [2,] . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . .
#>  [3,] . . 1 . . . . . . . . . . . . . . 1 . . . . . . . . . . . 1 . . . . 1 . .
#>  [4,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . .
#>  [5,] . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . .
#>  [6,] . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#>  [7,] . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . .
#>  [8,] . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . .
#>  [9,] . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . .
#> [10,] . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . .
#> [11,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [12,] . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [13,] . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . .
#> [14,] . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . .
#> [15,] . . . . . . 1 . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . .
#> [16,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . .
#> [17,] . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . .
#> [18,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [19,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . .
#> [20,] . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . .
#> [21,] . . . . . . . . 1 . . . . . . . . . 1 . . . . . . . . . . . . . . . . . .
#> [22,] . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [23,] . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . .
#> [24,] . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [25,] . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . 1 . . . . . . . .
#> [26,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [27,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
#> [28,] . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . .
#> [29,] . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . .
#> [30,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [31,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [32,] 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [33,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . .
#> [34,] . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [35,] . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [36,] . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . .
#> [37,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . .
#> [38,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [39,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 .
#> [40,] . . . . . . . . . 1 . . . . . . . . . . . . . . 1 . . . . . . . . . . . .
#>                          
#>  [1,] . . . . 1 . . . . .
#>  [2,] . . . . . . . . . .
#>  [3,] . . . 1 . . . . . .
#>  [4,] . . 1 . . . . . . .
#>  [5,] . . . . . . . . . .
#>  [6,] . . . . . . . . . .
#>  [7,] . . . . . . . . . .
#>  [8,] . . . . . . . . . .
#>  [9,] . . . . . . . . . .
#> [10,] . . . . . . . . . .
#> [11,] . 1 . . . . . . . .
#> [12,] . . . . . . . . . .
#> [13,] . . . . . . . . 1 .
#> [14,] . . . . . . . . . .
#> [15,] . . . . . . . . . .
#> [16,] . . . . . . . . . .
#> [17,] . . . . . . . . . .
#> [18,] . . . . . . . . . 1
#> [19,] . . . . . . . . . .
#> [20,] . . . . . . . . . .
#> [21,] . . . . . . . . . .
#> [22,] . . . . . . . . . .
#> [23,] . . . . . . . . . .
#> [24,] . . . . . . . . . .
#> [25,] . . . . . . . . . .
#> [26,] . . . . . . . 1 . .
#> [27,] . . . . . . . . . .
#> [28,] . . . . . . . . . .
#> [29,] . . . . . . . . . .
#> [30,] 1 . . . . . . . . .
#> [31,] . . . . . 1 . . . .
#> [32,] . . . . . . . . . .
#> [33,] . . . . . . . . . .
#> [34,] . . . . . . . . . .
#> [35,] . . . . . . . . . .
#> [36,] . . . . . . . . . .
#> [37,] . . . . . . . . . .
#> [38,] . . . . . . 1 . . .
#> [39,] . . . . . . . . . .
#> [40,] . . . . . . . . . .
svds(M, 2)
#> Error in fun(A, k, nu, nv, opts, mattype = "dgCMatrix"): TridiagEigen: eigen decomposition failed

Created on 2021-10-11 by the reprex package (v2.0.1.9000)

Error: TridiagEigen: failed to compute all the eigenvalues

I get the above message when running

out <- svds(X,k = 2)

on a very large matrix X. The message gives no further details, so it is hard to tell what the problem is. What are the possible causes? Should I try increasing the number of Lanczos basis vectors, the convergence tolerance and/or the maximum number of iterations?
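The three settings named in the question map onto entries of the opts list. The following is a hedged sketch: it assumes the installed RSpectra version accepts ncv, tol and maxitr in opts, the values are illustrative rather than recommendations, and X is a small random stand-in for the real matrix. The finiteness check is a generic sanity test, not a confirmed cause of this error.

```r
set.seed(42)
X <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)   # stand-in for the real data
k <- 2

# Generic sanity check: non-finite entries can break any eigen-solver.
stopifnot(all(is.finite(X)))

opts <- list(
  ncv    = min(ncol(X), max(2 * k + 1, 20)),  # number of Lanczos basis vectors
  tol    = 1e-8,                              # convergence tolerance
  maxitr = 5000                               # maximum number of iterations
)

# Only call the solver if RSpectra is available in this session.
if (requireNamespace("RSpectra", quietly = TRUE)) {
  out <- RSpectra::svds(X, k = k, opts = opts)
  print(out$d)
  print(out$niter)
}
```

Changing one setting at a time makes it easier to see which of them, if any, affects the failure.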

Thanks,
Peter

eigs and eigs_sym have trouble converging for which = "SA"

I stumbled across the following:
eigs and eigs_sym have trouble converging with the option which = "SA" and with other values of which that request the smallest eigenvalues. After reading #2, I observed that eigs_sym converges much faster and more reliably if, instead of setting which, sigma = -1e-5 is set.
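The observation is consistent with how shift-and-invert works in general: the eigenvalues of (A - sigma I)^{-1} are 1 / (lambda - sigma), so the eigenvalues of A closest to sigma become the largest and best separated ones, which iterative methods find quickly. The base R sketch below illustrates the mapping on a small dense matrix; it is not RSpectra's implementation, and the test matrix is made up.

```r
set.seed(7)
B <- matrix(rnorm(36), 6, 6)
A <- crossprod(B) + diag(6)            # symmetric positive definite test matrix
sigma <- -1e-5                         # shift just below the spectrum

# Eigenvalues of (A - sigma I)^{-1} are 1 / (lambda - sigma).
M_inv <- solve(A - sigma * diag(nrow(A)))
mu <- eigen(M_inv, symmetric = TRUE)$values   # decreasing order

# The two largest mu map back to the two smallest eigenvalues of A.
lambda_small <- sigma + 1 / mu[1:2]
print(lambda_small)
```

The price is a linear solve with A - sigma I in every iteration, so the approach pays off mainly when that factorization is cheap.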

generalized eigenvalues

I'm trying to play with some algorithms from the graph embedding literature (e.g. locality preserving projections) that need to solve the generalized eigenvalue problem. The Matlab 'eigs' function has this option, but I notice it's missing in RSpectra.
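When B is symmetric positive definite, the generalized problem A x = lambda B x can be reduced to a standard symmetric one through a Cholesky factor of B. The sketch below shows the reduction in base R on small made-up matrices; for large sparse problems the explicit inverse used here would be replaced by triangular solves, and the reduced matrix C could then be handed to any symmetric eigensolver.

```r
set.seed(3)
n <- 5
A <- crossprod(matrix(rnorm(n * n), n))            # symmetric
B <- crossprod(matrix(rnorm(n * n), n)) + diag(n)  # symmetric positive definite

R <- chol(B)                        # B = t(R) %*% R, with R upper triangular
R_inv <- backsolve(R, diag(n))      # explicit inverse, fine for a small example

# C = R^{-T} A R^{-1} is symmetric and has the eigenvalues of the pencil (A, B).
C <- t(R_inv) %*% A %*% R_inv
eig <- eigen(C, symmetric = TRUE)

lambda <- eig$values                # generalized eigenvalues
X <- R_inv %*% eig$vectors          # generalized eigenvectors: A x = lambda B x

residual <- max(abs(A %*% X - B %*% X %*% diag(lambda)))
print(residual)
```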

Warm start in svds()

Basically, I have a matrix of thousands of samples and, say, 100,000 variables. Your package makes it easy and fast to get the first 10 singular values/vectors.
I have an algorithm that iteratively (3 or 4 times) discards 1000 variables, and I have to recompute the SVD each time.
In other words, I compute very similar singular vectors 4 times, and I would like to know whether it is possible to use "warm starts" for the second, third and fourth iterations to make them much faster.

Thanks for your work.

Error when installing: "internal compiler error: in predicate_mem_writes"

Hello,

When trying to install "RSpectra" after moving from R-3.4.1 to R-3.5.1, the following error occurs:

/usr/lib64/R/library/RcppEigen/include/Eigen/src/SparseCore/SparseSelfAdjointView.h:517:6: internal compiler error: in predicate_mem_writes, at tree-if-conv.c:2252
 void permute_symm_to_symm(const MatrixType& mat, SparseMatrix<typename MatrixType::Scalar,DstOrder,typename MatrixType::StorageIndex>& _dest, const typename MatrixType::StorageIndex* perm)

I am using gcc-7.3.0 under GNU/Linux, and this is my sessionInfo():

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Gentoo/Linux
Matrix products: default
BLAS: /usr/lib64/libopenblas_openmp_haswellp-r0.2.20.so
LAPACK: /usr/lib64/libreflapack.so.3.7.0
locale:
 [1] LC_CTYPE=ca_AD.UTF8       LC_NUMERIC=C              LC_TIME=ca_AD.UTF8
 [4] LC_COLLATE=C              LC_MONETARY=ca_AD.UTF8    LC_MESSAGES=ca_AD.UTF8
 [7] LC_PAPER=ca_AD.UTF8       LC_NAME=C                 LC_ADDRESS=C
[10] LC_TELEPHONE=C            LC_MEASUREMENT=ca_AD.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
[1] colorout_1.2-0
loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1

Is it the compiler? The R version? The BLAS/LAPACK libraries? Any clue as to where to go from here?

Thank you.

Determinism

Hi!

RSpectra seems really cool. However, there doesn’t seem to be a way to make it respect set.seed() (and therefore deterministic):

The common code to get eigs calls eigs.init() without initialization vector
The parameterless init method initializes SimpleRandom with seed 0

RSpectra should either

generate an init vector from R via runif(n) - .5 and use it in eigs.init() or
initialize SimpleRandom with sample(.Machine$integer.max, 1L) or
somehow pass the state of the RNG (globalenv()$.Random.seed exists as int vector if the RNG has been used) to SimpleRandom
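The first option can be sketched in base R. The helper make_init is hypothetical, and how the vector reaches the solver (for example through an initial-vector entry in opts) is an assumption that depends on the installed RSpectra version; the point shown is only that the vector itself is reproducible under set.seed().

```r
# Build a starting vector from R's RNG so that it respects set.seed().
make_init <- function(n, seed) {
  set.seed(seed)
  runif(n) - 0.5
}

n <- 100
v1 <- make_init(n, seed = 123)
v2 <- make_init(n, seed = 123)
v3 <- make_init(n, seed = 456)

print(identical(v1, v2))   # same seed gives the same starting vector
print(identical(v1, v3))   # a different seed gives a different one
```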

Installation error on Ubuntu 18.04

Hi! I was trying to install Quanteda and it complained that RSpectra is needed. When I tried to install RSpectra, these errors popped up:
`../inst/include/RMatOp/RealShift_matrix.h:31:60: required from here
/usr/lib/R/site-library/RcppEigen/include/Eigen/src/Core/CoreEvaluators.h:960:8: warning: ignoring attributes on template argument ‘Eigen::internal::packet_traits::type {aka __vector(2) double}’ [-Wignored-attributes]
g++ -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o RSpectra.so eigs_gen.o eigs_sym.o matops.o register_routines.o svds.o -lblas -lgfortran -lm -lquadmath -L/usr/lib/R/lib -lR
/usr/bin/x86_64-linux-gnu-ld: cannot find -lgfortran
collect2: error: ld returned 1 exit status
/usr/share/R/share/make/shlib.mk:6: recipe for target 'RSpectra.so' failed
make: *** [RSpectra.so] Error 1
ERROR: compilation failed for package ‘RSpectra’

removing ‘/home/chubuntu/R/x86_64-pc-linux-gnu-library/3.4/RSpectra’
Warning in install.packages :
installation of package ‘RSpectra’ had non-zero exit status

The downloaded source packages are in
‘/tmp/RtmpI1Gl2k/downloaded_packages’`

May I know how to resolve this?

compilation warnings on ubuntu with gcc

hi @yixuan I just installed RSpectra from the source code on CRAN, and I see the following warnings from gcc-7.5.0. I also get these warnings using gcc-8.4.0.
RSpectra-warnings.txt

There are a lot of warnings, here are just a few. They seem to be coming from RcppEigen so you may consider asking the RcppEigen (or Eigen) devs to fix these.

/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/SSE/Complex.h:232:1: note: in expansion of macro ‘EIGEN_MAKE_CONJ_HELPER_CPLX_REAL’
 EIGEN_MAKE_CONJ_HELPER_CPLX_REAL(Packet2cf,Packet4f)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/Default/ConjHelper.h:22:70: warning: ignoring attributes on template argument ‘Eigen::internal::Packet4f {aka __vector(4) float}’ [-Wignored-attributes]
   template<> struct conj_helper<PACKET_CPLX, PACKET_REAL, false,false> {                                          \
                                                                      ^
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/SSE/Complex.h:232:1: note: in expansion of macro ‘EIGEN_MAKE_CONJ_HELPER_CPLX_REAL’
 EIGEN_MAKE_CONJ_HELPER_CPLX_REAL(Packet2cf,Packet4f)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/Default/ConjHelper.h:15:70: warning: ignoring attributes on template argument ‘Eigen::internal::Packet2d {aka __vector(2) double}’ [-Wignored-attributes]
   template<> struct conj_helper<PACKET_REAL, PACKET_CPLX, false,false> {                                          \
                                                                      ^
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/SSE/Complex.h:417:1: note: in expansion of macro ‘EIGEN_MAKE_CONJ_HELPER_CPLX_REAL’
 EIGEN_MAKE_CONJ_HELPER_CPLX_REAL(Packet1cd,Packet2d)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/tdhock/lib/R/library/RcppEigen/include/Eigen/src/Core/arch/Default/ConjHelper.h:22:70: warning: ignoring attributes on template argument ‘Eigen::internal::Packet2d {aka __vector(2) double}’ [-Wignored-attributes]
   template<> struct conj_helper<PACKET_CPLX, PACKET_REAL, false,false> {                                          \
                                                                      ^

To be clear, RSpectra does install successfully, but it is generally good practice to edit the code so that such warnings are suppressed.

eigenvalues are incorrect with opts = list(center=TRUE, scale=TRUE)

Summary

When we use RSpectra::svds() with opts = list(center = TRUE, scale = TRUE), then
the eigenvalues do not match the values returned by base::svd() or irlba::irlba().

Please see the full example below for details.

library(Matrix)
library(RSpectra)
library(irlba)

For the sake of reproducibility, let’s use the iris data.

mat <- as.matrix(iris[,1:4])
n_pcs <- 3

Columns are 4 features and rows are 150 flowers.

dim(mat)
#> [1] 150   4
head(mat)
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,]          5.1         3.5          1.4         0.2
#> [2,]          4.9         3.0          1.4         0.2
#> [3,]          4.7         3.2          1.3         0.2
#> [4,]          4.6         3.1          1.5         0.2
#> [5,]          5.0         3.6          1.4         0.2
#> [6,]          5.4         3.9          1.7         0.4

Example 1

Let’s run SVD without scaling the features.

mat_svd <- base::svd(x = mat)

mat_svds <- RSpectra::svds(
  A    = mat,
  k    = n_pcs,
  opts = list(center = FALSE, scale = FALSE)
)

mat_irlba <- irlba::irlba(
  A      = mat,
  nv     = n_pcs,
  center = FALSE,
  scale  = FALSE
)
#> Warning in irlba::irlba(A = mat, nv = n_pcs, center = FALSE, scale = FALSE):
#> You're computing too large a percentage of total singular values, use a standard
#> svd instead.

All methods return the same values for u

mat_svd$u[1:5,1:n_pcs]
#>             [,1]      [,2]         [,3]
#> [1,] -0.06161685 0.1296114  0.002138597
#> [2,] -0.05807094 0.1110198  0.070672387
#> [3,] -0.05676305 0.1179665  0.004342549
#> [4,] -0.05665344 0.1053081  0.005924672
#> [5,] -0.06123020 0.1310898 -0.031881095
mat_svds$u[1:5,]
#>             [,1]      [,2]         [,3]
#> [1,] -0.06161685 0.1296114  0.002138597
#> [2,] -0.05807094 0.1110198  0.070672387
#> [3,] -0.05676305 0.1179665  0.004342549
#> [4,] -0.05665344 0.1053081  0.005924672
#> [5,] -0.06123020 0.1310898 -0.031881095
mat_irlba$u[1:5,]
#>             [,1]      [,2]         [,3]
#> [1,] -0.06161685 0.1296114  0.002138597
#> [2,] -0.05807094 0.1110198  0.070672387
#> [3,] -0.05676305 0.1179665  0.004342549
#> [4,] -0.05665344 0.1053081  0.005924672
#> [5,] -0.06123020 0.1310898 -0.031881095

All methods return the same values for v

mat_svd$v[,1:3]
#>            [,1]       [,2]        [,3]
#> [1,] -0.7511082  0.2841749  0.50215472
#> [2,] -0.3800862  0.5467445 -0.67524332
#> [3,] -0.5130089 -0.7086646 -0.05916621
#> [4,] -0.1679075 -0.3436708 -0.53701625
mat_svds$v
#>            [,1]       [,2]        [,3]
#> [1,] -0.7511082  0.2841749  0.50215472
#> [2,] -0.3800862  0.5467445 -0.67524332
#> [3,] -0.5130089 -0.7086646 -0.05916621
#> [4,] -0.1679075 -0.3436708 -0.53701625
mat_irlba$v
#>            [,1]       [,2]        [,3]
#> [1,] -0.7511082  0.2841749  0.50215472
#> [2,] -0.3800862  0.5467445 -0.67524332
#> [3,] -0.5130089 -0.7086646 -0.05916621
#> [4,] -0.1679075 -0.3436708 -0.53701625

All methods return the same values for d

mat_svd$d[1:n_pcs]
#> [1] 95.959914 17.761034  3.460931
mat_svds$d
#> [1] 95.959914 17.761034  3.460931
mat_irlba$d
#> [1] 95.959914 17.761034  3.460931

Example 2

Let’s try again, but this time let’s center and scale each feature.

mat_scaled <- scale(mat)
head(mat_scaled)
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,]   -0.8976739  1.01560199    -1.335752   -1.311052
#> [2,]   -1.1392005 -0.13153881    -1.335752   -1.311052
#> [3,]   -1.3807271  0.32731751    -1.392399   -1.311052
#> [4,]   -1.5014904  0.09788935    -1.279104   -1.311052
#> [5,]   -1.0184372  1.24503015    -1.335752   -1.311052
#> [6,]   -0.5353840  1.93331463    -1.165809   -1.048667

After scaling, the means should be equal to zero:

colMeans(mat)
#> Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#>     5.843333     3.057333     3.758000     1.199333
colMeans(mat_scaled)
#>  Sepal.Length   Sepal.Width  Petal.Length   Petal.Width 
#> -4.480675e-16  2.035409e-16 -2.844947e-17 -3.714621e-17

And the standard deviation should be equal to one:

apply(mat, 2, sd)
#> Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#>    0.8280661    0.4358663    1.7652982    0.7622377
apply(mat_scaled, 2, sd)
#> Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#>            1            1            1            1

Ok, let’s try SVD again:

mat_svd_scaled <- base::svd(x = mat_scaled)

mat_svds_scaled <- RSpectra::svds(
  A    = mat,
  k    = n_pcs,
  opts = list(center = TRUE, scale = TRUE)
)

mat_irlba_scaled <- irlba::irlba(
  A      = mat,
  nv     = n_pcs,
  center = colMeans(mat),
  scale  = apply(mat, 2, sd)
)
#> Warning in irlba::irlba(A = mat, nv = n_pcs, center = colMeans(mat), scale =
#> apply(mat, : You're computing too large a percentage of total singular values,
#> use a standard svd instead.

All methods return the same values for u, up to sign (the svds() columns are negated, which is an equivalent decomposition)

mat_svd_scaled$u[1:5,1:n_pcs]
#>             [,1]        [,2]         [,3]
#> [1,] -0.10823953 -0.04099580  0.027218646
#> [2,] -0.09945776  0.05757315  0.050003401
#> [3,] -0.11299630  0.02920003 -0.009420891
#> [4,] -0.10989710  0.05101939 -0.019457133
#> [5,] -0.11422046 -0.05524180 -0.003354363
mat_svds_scaled$u[1:5,]
#>            [,1]        [,2]         [,3]
#> [1,] 0.10823953  0.04099580 -0.027218646
#> [2,] 0.09945776 -0.05757315 -0.050003401
#> [3,] 0.11299630 -0.02920003  0.009420891
#> [4,] 0.10989710 -0.05101939  0.019457133
#> [5,] 0.11422046  0.05524180  0.003354363
mat_irlba_scaled$u[1:5,]
#>             [,1]        [,2]         [,3]
#> [1,] -0.10823953 -0.04099580  0.027218646
#> [2,] -0.09945776  0.05757315  0.050003401
#> [3,] -0.11299630  0.02920003 -0.009420891
#> [4,] -0.10989710  0.05101939 -0.019457133
#> [5,] -0.11422046 -0.05524180 -0.003354363

All methods return the same values for v, again up to sign

mat_svd_scaled$v[,1:3]
#>            [,1]        [,2]       [,3]
#> [1,]  0.5210659 -0.37741762  0.7195664
#> [2,] -0.2693474 -0.92329566 -0.2443818
#> [3,]  0.5804131 -0.02449161 -0.1421264
#> [4,]  0.5648565 -0.06694199 -0.6342727
mat_svds_scaled$v
#>            [,1]       [,2]       [,3]
#> [1,] -0.5210659 0.37741762 -0.7195664
#> [2,]  0.2693474 0.92329566  0.2443818
#> [3,] -0.5804131 0.02449161  0.1421264
#> [4,] -0.5648565 0.06694199  0.6342727
mat_irlba_scaled$v
#>            [,1]        [,2]       [,3]
#> [1,]  0.5210659 -0.37741762  0.7195664
#> [2,] -0.2693474 -0.92329566 -0.2443818
#> [3,]  0.5804131 -0.02449161 -0.1421264
#> [4,]  0.5648565 -0.06694199 -0.6342727
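The negated u and v columns in the svds() output are not an error: negating a matching pair of columns of u and v leaves u diag(d) t(v) unchanged. When outputs from different solvers or versions need to be compared, a sign convention can be imposed first. The helper normalize_signs below is hypothetical, and the convention it uses (largest-magnitude entry of each v column made positive) is one arbitrary choice among several.

```r
normalize_signs <- function(s) {
  # Flip each (u, v) column pair so the largest-magnitude entry of v is positive.
  flip <- apply(s$v, 2, function(col) sign(col[which.max(abs(col))]))
  flip[flip == 0] <- 1
  s$u <- sweep(s$u, 2, flip, "*")
  s$v <- sweep(s$v, 2, flip, "*")
  s
}

mat_scaled <- scale(as.matrix(iris[, 1:4]))
s1 <- svd(mat_scaled)

# A second, equally valid decomposition with every column negated.
s2 <- s1
s2$u <- -s2$u
s2$v <- -s2$v

n1 <- normalize_signs(s1)
n2 <- normalize_signs(s2)
print(all.equal(n1$u, n2$u))
print(all.equal(n1$v, n2$v))
```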

Uh oh! It looks like svds() is not returning the correct values for d

mat_svd_scaled$d[1:n_pcs]
#> [1] 20.853205 11.670070  4.676192
mat_svds_scaled$d
#> [1] 1.7083611 0.9560494 0.3830886
mat_irlba_scaled$d
#> [1] 20.853205 11.670070  4.676192

Example 3

Let’s try RSpectra again, but we’ll do the scaling manually.

mat_svds_man_scaled <- RSpectra::svds(
  A    = mat_scaled,
  k    = n_pcs,
  opts = list(center = FALSE, scale = FALSE)
)
mat_svds_man_scaled$d
#> [1] 20.853205 11.670070  4.676192

Conclusions

Manual scaling eliminates the error in the svds() output. This
implies that there might be a bug inside the svds() function.

The wrong values are off by a multiplicative constant factor,
but the value of the factor is dependent on the input data.

For the iris data, the factor seems to be 12.20656, which is close
to the number of matrix multiplication operations (11)

mat_svds_man_scaled$d / mat_svds_scaled$d
#> [1] 12.20656 12.20656 12.20656
mat_svds_scaled$nops
#> [1] 11
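Another reading of the factor is worth checking: 12.20656 equals sqrt(150 - 1), where 150 is the number of rows. That is exactly the ratio expected if one convention divides each centered column by its sample standard deviation, as scale() does, while the other divides by the column's Euclidean norm. Whether svds() uses the norm convention for scale = TRUE is an assumption here, not something established in this report; the base R sketch below only shows that the two conventions reproduce the two sets of d values.

```r
mat <- as.matrix(iris[, 1:4])
n <- nrow(mat)

centered <- scale(mat, center = TRUE, scale = FALSE)

# Convention 1: divide by the sample standard deviation (what scale() does).
by_sd <- sweep(centered, 2, apply(mat, 2, sd), "/")
# Convention 2 (assumed): divide by the Euclidean norm of each centered column.
by_norm <- sweep(centered, 2, sqrt(colSums(centered^2)), "/")

d_sd   <- svd(by_sd)$d[1:3]
d_norm <- svd(by_norm)$d[1:3]

ratio <- d_sd / d_norm
print(ratio)
print(sqrt(n - 1))
```

Under this reading the factor depends on the data only through the row count, and its closeness to nops = 11 would be a coincidence.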

Created on 2022-01-25 by the reprex package (v2.0.1)

Session info

sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.3 (2020-10-10)
#>  os       macOS Catalina 10.15.7      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2022-01-25                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                             
#>  backports     1.2.1      2020-12-09 [1] CRAN (R 4.0.2)                     
#>  cli           3.0.1      2021-07-17 [1] CRAN (R 4.0.2)                     
#>  crayon        1.4.1      2021-02-08 [1] CRAN (R 4.0.2)                     
#>  digest        0.6.28     2021-09-23 [1] CRAN (R 4.0.2)                     
#>  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.0.2)                     
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 4.0.1)                     
#>  fansi         0.5.0      2021-05-25 [1] CRAN (R 4.0.2)                     
#>  fastmap       1.1.0      2021-01-25 [1] CRAN (R 4.0.2)                     
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 4.0.2)                     
#>  glue          1.4.2      2020-08-27 [2] CRAN (R 4.0.2)                     
#>  highr         0.9        2021-04-16 [1] CRAN (R 4.0.2)                     
#>  htmltools     0.5.2      2021-08-25 [1] CRAN (R 4.0.2)                     
#>  irlba       * 2.3.3      2019-02-05 [2] CRAN (R 4.0.2)                     
#>  knitr         1.36       2021-09-29 [1] CRAN (R 4.0.2)                     
#>  lattice       0.20-45    2021-09-22 [1] CRAN (R 4.0.2)                     
#>  lifecycle     1.0.1      2021-09-24 [1] CRAN (R 4.0.2)                     
#>  magrittr      2.0.1.9000 2020-12-15 [1] Github (tidyverse/magrittr@bb1c86a)
#>  Matrix      * 1.3-4      2021-06-01 [1] CRAN (R 4.0.2)                     
#>  pillar        1.6.3      2021-09-26 [1] CRAN (R 4.0.2)                     
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 4.0.2)                     
#>  purrr         0.3.4      2020-04-17 [2] CRAN (R 4.0.2)                     
#>  R.cache       0.15.0     2021-04-30 [1] CRAN (R 4.0.2)                     
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.0.2)                     
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.0.2)                     
#>  R.utils       2.11.0     2021-09-26 [1] CRAN (R 4.0.2)                     
#>  Rcpp          1.0.7      2021-07-07 [1] CRAN (R 4.0.2)                     
#>  reprex        2.0.1      2021-08-05 [1] CRAN (R 4.0.2)                     
#>  rlang         0.4.11     2021-04-30 [1] CRAN (R 4.0.2)                     
#>  rmarkdown     2.11       2021-09-14 [1] CRAN (R 4.0.2)                     
#>  RSpectra    * 0.16-0     2019-12-01 [2] CRAN (R 4.0.2)                     
#>  rstudioapi    0.13       2020-11-12 [2] CRAN (R 4.0.2)                     
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 4.0.2)                     
#>  stringi       1.7.5      2021-10-04 [1] CRAN (R 4.0.2)                     
#>  stringr       1.4.0      2019-02-10 [2] CRAN (R 4.0.2)                     
#>  styler        1.6.2      2021-09-23 [1] CRAN (R 4.0.2)                     
#>  tibble        3.1.5      2021-09-30 [1] CRAN (R 4.0.2)                     
#>  utf8          1.2.2      2021-07-24 [1] CRAN (R 4.0.2)                     
#>  vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.0.2)                     
#>  withr         2.4.2      2021-04-18 [1] CRAN (R 4.0.2)                     
#>  xfun          0.26       2021-09-14 [1] CRAN (R 4.0.2)                     
#>  yaml          2.2.1      2020-02-01 [2] CRAN (R 4.0.2)                     
#> 
#> [1] /Users/kamil/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

Installation error on Ubuntu

👋! I'm getting the installation error below for

sysname 
 "Linux" 
release 
"4.15.0-45-generic" 
version 
"#48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019"

Installing package into ‘/home/maelle/R/x86_64-pc-linux-gnu-library/3.4’
(as ‘lib’ is unspecified)
* installing *source* package ‘RSpectra’ ...
** libs
g++  -I/usr/share/R/include -DNDEBUG -I../inst/include -I"/home/maelle/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include" -I"/home/maelle/R/x86_64-pc-linux-gnu-library/3.4/RcppEigen/include"    -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-AitvI6/r-base-3.4.4=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c eigs_gen.cpp -o eigs_gen.o
eigs_gen.cpp:1:10: fatal error: RcppEigen.h: No such file or directory
 #include <RcppEigen.h>
          ^~~~~~~~~~~~~
compilation terminated.
/usr/lib/R/etc/Makeconf:168: recipe for target 'eigs_gen.o' failed
make: *** [eigs_gen.o] Error 1
ERROR: compilation failed for package ‘RSpectra’
* removing ‘/home/maelle/R/x86_64-pc-linux-gnu-library/3.4/RSpectra’
Error in i.p(...) : 
  (converted from warning) installation of package ‘/tmp/RtmpHyJuEO/fileab463ae57e/RSpectra_0.13-1.tar.gz’ had non-zero exit status

eigs_sym has negative eigenvalue

I have a relatively large symmetric eigenvalue problem (ca. 12000x12000) and want the first couple of eigenvectors and values:

e <- eigs_sym(x, 6, opts = list(retvec = TRUE))
e$values

### [1]   68213.67   31349.46   24366.68   19671.44   17046.18 -102290.45

e <- eigs_sym(x, 5, opts = list(retvec = TRUE))
e$values

### [1]   68213.67   31349.46   24366.68   19671.44 -102290.45

Am I right that the entire procedure is stable, but the last eigenvalue is garbage? e$nconv is 5 and 6 respectively.

Is the correct solution to compute one more eigenvector/value than necessary and throw the last one away?
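One possible explanation, not confirmed in this thread, is that the solver is selecting eigenvalues by largest magnitude, in which case -102290.45 is a legitimate result because its absolute value exceeds all the others. The base R sketch below shows how the two selection rules differ on a small made-up matrix.

```r
A <- diag(c(5, 3, 2, 1, -9))       # symmetric; -9 has the largest magnitude
vals <- eigen(A, symmetric = TRUE)$values

# Selection by absolute value versus selection by signed value.
largest_magnitude <- vals[order(abs(vals), decreasing = TRUE)][1:3]
largest_algebraic <- sort(vals, decreasing = TRUE)[1:3]

print(largest_magnitude)
print(largest_algebraic)
```

If that is the cause, requesting the largest algebraic eigenvalues (the counterpart of the which = "SA" option mentioned in another issue here, presumably which = "LA") would be the fix, and computing one extra eigenpair to discard would not be needed.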

how to control the cores used by svds

Hi yixuan,

I noticed that all of the machine's cores are used when I run svds on my large matrix. In some situations we may want to limit the number of cores a program uses. How can I do that?

Also, I am writing a program using Spectra (C++). There seems to be no way to run it in parallel, but RSpectra can take advantage of multiple cores. What's the difference?

Looking forward to your reply.
Thanks in advance.

svds in version 0.13-1 vs 0.12-0

Dear Yixuan Qiu,

In the beginning of 2018, I installed the RSpectra 0.12-0 on my PC for using the SVDS function on mean-centered Mid Infrared spectra (X) to calculate the orthogonal signal (Wold et al., 1998).

e.g.: s <- RSpectra::svds(X,1,1,1)

In the meantime, I bought a new laptop and ran the same code on the same dataset, which gave totally different results. After a few hours of debugging, I found that the difference was caused by the version of the RSpectra package (RSpectra 0.13-1 was installed on the new laptop). After removing RSpectra 0.13-1 and installing RSpectra 0.12-0 on the laptop, I again obtained the same results as before on the PC... But now I am wondering which version to use...
What has changed in the svds function in version 0.13-1 compared to 0.12-0? And which version is most suitable for my application?

BR,

Ben Aernouts

Reference: Wold, S., H. Antti, F. Lindgren, and J. Öhman. 1998. Orthogonal signal correction of near-infrared spectra. Chemom. Intell. Lab. Syst. 44:175–185. doi:10.1016/S0169-7439(98)00109-9.