Comments (3)
Great catch! You're correct, the issue you mentioned isn't related to `dask`. The cross-covariance matrix from your data has dimensions 64800 x 10368, which is near the maximum size solvable on a typical laptop using `np.linalg.svd`. For such large matrices, the randomized SVD from `sklearn` is recommended as it's faster. That's why it is the default option in `xeofs`.
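To put the matrix size in perspective, here is a rough back-of-the-envelope sketch of the memory needed just to hold the dense cross-covariance matrix. The helper name `dense_matrix_bytes` is made up for illustration, and the estimate ignores the extra workspace the SVD itself needs.

```python
def dense_matrix_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed to store a dense n_rows x n_cols array (itemsize=8 for float64)."""
    return n_rows * n_cols * itemsize

# Dimensions of the cross-covariance matrix quoted above
full_f64 = dense_matrix_bytes(64800, 10368)              # float64
full_f32 = dense_matrix_bytes(64800, 10368, itemsize=4)  # float32

print(f"float64: {full_f64 / 1e9:.2f} GB")  # float64: 5.37 GB
print(f"float32: {full_f32 / 1e9:.2f} GB")  # float32: 2.69 GB
```

So the matrix alone takes several GB before any decomposition starts, which is why a full SVD is borderline on a laptop.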
However, `xmca` is still able to handle the datasets with `np.linalg.svd`, because it first conducts a standard PCA on each data set prior to computing the cross-covariance. Given that your data sets have 492 time steps, they can be fully represented by at most 492 PCs. Thus, when performing MCA in this reduced PCA space, the SVD only has to deal with a matrix of at most 492 x 492 and should be swift.
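For illustration, here is a minimal NumPy sketch of why the PCA step loses nothing. It is not the actual `xmca` or `xeofs` implementation; the array sizes and variable names are made up, and the data are random. It compares the SVD of the full cross-covariance matrix with the SVD computed in the reduced PC space.

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_x, n_y = 20, 300, 200  # toy sizes; think 492, 64800, 10368

# Two centered data sets with a shared time dimension
X = rng.standard_normal((n_time, n_x))
Y = rng.standard_normal((n_time, n_y))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

# Direct route: SVD of the full (n_x x n_y) cross-covariance matrix
C = X.T @ Y / (n_time - 1)
s_full = np.linalg.svd(C, compute_uv=False)

# PCA route: thin SVD of each data set, keeping all n_time components
Ux, sx, Vxt = np.linalg.svd(X, full_matrices=False)
Uy, sy, Vyt = np.linalg.svd(Y, full_matrices=False)
pcs_x = Ux * sx  # PC time series of X, shape (n_time, n_time)
pcs_y = Uy * sy  # PC time series of Y, shape (n_time, n_time)

# Cross-covariance in PC space is only (n_time x n_time)
C_small = pcs_x.T @ pcs_y / (n_time - 1)
A, s_small, Bt = np.linalg.svd(C_small)

# Map the singular vectors back to the original feature space
patterns_x = Vxt.T @ A
patterns_y = Vyt.T @ Bt.T

# Same singular values, and the patterns rebuild the full matrix
same_values = np.allclose(s_full[:n_time], s_small)
same_matrix = np.allclose(C, (patterns_x * s_small) @ patterns_y.T)
print(same_values, same_matrix)
```

Because all PCs are kept, both routes describe the same cross-covariance matrix; the second one just never has to decompose anything larger than `n_time x n_time`.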
Currently, `xeofs` doesn't offer this PCA preprocessing. But, good news: it's present in the development branch. If you need it urgently, install the development version:
pip install git+https://github.com/nicrie/xeofs.git@develop
The development version then allows you to specify the number of PCA modes. Since in your case the rank of the input matrices is at most 492, you can easily provide all PC modes, so you don't lose any information by using the PCA solutions as input for MCA. In the development version you would specify it like this:
model = MCA(n_modes=5, standardize=False, use_coslat=True, n_pca_modes=492)
model.fit(data_input1, data_input2, dim='time')
Otherwise, if you don't mind waiting a few more days, I'll be releasing a new version with this feature soon.
Yeah, just what I wanted, thank you. I found this trick; looking forward to the new release.
This feature is now available in the new release v1.1.0 (see #80).