Comments (9)
Non-contributor here:
Which algorithms do you need? Do you have too much data to fit in memory, even with sparse matrices?
I think the following algorithms would need only minimal changes:
- SGDClassifier, SGDRegressor (already available in scikit-learn with partial_fit; only slightly different)
- AdaGradClassifier, AdaGradRegressor (slightly more work depending on internals)
- SAGClassifier, SAGRegressor (slightly more work depending on internals)
Impossible algorithm-wise (these are batch methods that need the full gradient):
- FistaClassifier, FistaRegressor
- SVRGClassifier, SVRGRegressor
These might work, but I'm unsure about the theory (there may be constraints on partial_fit, e.g. how to call it and with which data):
- CDClassifier, CDRegressor
- SDCAClassifier, SDCARegressor
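For the SGD case, the streaming pattern already works out of the box in scikit-learn. A minimal sketch, with synthetic data and illustrative hyperparameters:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.RandomState(0)
model = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=0)

# Feed the data in mini-batches instead of all at once; partial_fit
# updates the coefficients without resetting them between calls.
for _ in range(100):
    X = rng.randn(32, 5)                          # one batch of 32 samples
    y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])  # noiseless linear target
    model.partial_fit(X, y)

print(model.coef_)  # should approach [1, -2, 0.5, 0, 3]
```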
from lightning.
Thanks for the detailed overview!
I'm in a reinforcement learning setup where the whole dataset is never available up front, and I want a regression model that learns from the data seen so far without retraining from scratch. I'd like to try an optimisation algorithm with an adaptive learning rate or momentum, and lightning has a good AdaGradRegressor implementation.
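The AdaGrad update being asked for is small; the only extra state to carry between calls is the per-coordinate squared-gradient accumulator. A toy NumPy sketch, with illustrative names and step size (not lightning's internals):

```python
import numpy as np

def adagrad_step(w, grad, G, eta=0.1, eps=1e-8):
    """One AdaGrad update: the effective learning rate shrinks
    per coordinate as squared gradients accumulate in G."""
    G += grad ** 2
    w -= eta * grad / (np.sqrt(G) + eps)
    return w, G

# Toy usage: fit a single linear equation with squared loss.
w, G = np.zeros(3), np.zeros(3)
x, y = np.array([1.0, 2.0, 0.0]), 1.0
for _ in range(200):
    grad = (w @ x - y) * x        # gradient of 0.5 * (w.x - y)^2
    w, G = adagrad_step(w, grad, G)
```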
Let's see what the developers think.
Just two random remarks:
- Did you try carefully tuned vanilla SGD (the scikit-learn version with partial_fit) for your use case? I'm sceptical that AdaGrad is much better, but this may depend on your data, and I'm not an expert.
- There is a warm_start option in CDClassifier and SDCAClassifier... Maybe there is a clever way to incorporate these possibilities into your setup.
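To illustrate the warm_start idea: refit on the accumulated data, but let each fit start from the previous solution instead of from scratch. A minimal sketch with scikit-learn's LogisticRegression, which has the same option (the pattern should carry over to lightning's CDClassifier, assuming its warm_start behaves similarly):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(500, 4)
y = (X @ np.array([2.0, -1.0, 0.0, 1.0]) > 0).astype(int)

# warm_start=True: each fit() resumes from the previous coefficients,
# so refitting on a grown dataset converges in fewer iterations.
clf = LogisticRegression(warm_start=True, max_iter=50)
for n in (100, 250, 500):        # data "arrives" in growing chunks
    clf.fit(X[:n], y[:n])
```

Unlike partial_fit, each call still passes over all the accumulated data, so this trades compute for not having to touch the solver internals.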
Yeah, I'm using vanilla SGD now; it works OK. The problem is that the component should work across many tasks, and it'd be nice to have fewer parameters to tune.
I was just about to start an issue on this. I'm training models on a really big file, so the data won't fit in memory at once. Streaming and parallelization are the only way to use the data. Vanilla SGD from scikit-learn requires tuning and doesn't improve with multiple iterations. The FTRL from Kaggler.py works better, but can't be pickled.
I had a look at modifying scikit-lightning for this. The outputs_2d_ initialization in fit() should be moved to __init__(), but the Cython part should also be modified so that it doesn't reset the model parameters when partial_fit is called. Would it be possible to get these changes?
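The contract being proposed can be shown with a toy estimator: allocate the model state lazily on the first partial_fit call and never reset it on later calls (illustrative code, not lightning's actual internals):

```python
import numpy as np

class TinyOnlineRegressor:
    """Skeleton of the partial_fit contract: state is created once,
    on the first call, and only updated afterwards."""

    def __init__(self, lr=0.5):
        self.lr = lr
        self.coef_ = None                  # allocated lazily, never reset

    def partial_fit(self, X, y):
        X = np.asarray(X, dtype=float)
        if self.coef_ is None:             # first call: allocate only
            self.coef_ = np.zeros(X.shape[1])
        # later calls keep refining the same coefficients
        grad = X.T @ (X @ self.coef_ - y) / len(y)
        self.coef_ -= self.lr * grad
        return self
```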
Hi,
A patch that implements partial_fit would definitely be a nice addition!
Please submit a patch with the modifications you propose; I'll allocate time to review them.
I didn't get a patch written; I hacked the code first to see how easily it could be done. I think I got it working for the AdaGradRegressor case, but the results were not good, so I probably missed something. The results from AdaGrad without my hack weren't much better than SGD on my data, and FTRL from Kaggler was vastly better; this seems to be a general result for SGD vs. FTRL on high-dimensional data. Anyway, I got a partial_fit FTRL working by adding model pickling to Kaggler instead. I could look at contributing to lightning later.
Attached is the hack I wrote, in case someone wants to continue from that.
adagrad.py.txt
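For reference, the FTRL-proximal update that Kaggler implements (McMahan et al., 2013) is compact enough to sketch per coordinate in NumPy; the hyperparameter names follow the paper and the values are illustrative:

```python
import numpy as np

def ftrl_weights(z, n, alpha=0.5, beta=1.0, l1=0.1, l2=0.1):
    """Closed-form lazy weights: a coordinate stays exactly zero until
    |z_i| exceeds the L1 threshold, which is where FTRL's sparsity comes from."""
    w = -(z - np.sign(z) * l1) / ((beta + np.sqrt(n)) / alpha + l2)
    return np.where(np.abs(z) > l1, w, 0.0)

def ftrl_step(z, n, x, y, alpha=0.5, beta=1.0, l1=0.1, l2=0.1):
    """One online FTRL-proximal update for logistic loss on sample (x, y)."""
    w = ftrl_weights(z, n, alpha, beta, l1, l2)
    p = 1.0 / (1.0 + np.exp(-(x @ w)))           # predict before updating
    g = (p - y) * x                              # logistic-loss gradient
    sigma = (np.sqrt(n + g * g) - np.sqrt(n)) / alpha
    z += g - sigma * w
    n += g * g
    return z, n
```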
partial_fit is already supported in scikit-learn's SGD so I think we should focus on AdaGrad first.
@anttttti If you start a PR, we can help you track down the problem. Also make sure to write a unit test that checks that calling partial_fit multiple times is equivalent to fit.
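The suggested check could look like the sketch below, written against scikit-learn's SGDRegressor since lightning's AdaGrad estimators don't expose partial_fit yet. Exact equivalence only holds when both paths perform the identical sequence of per-sample updates, hence shuffle=False, a constant learning rate, and a single epoch:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

def make_model():
    # shuffle=False + constant eta: fit() over one epoch performs the
    # same per-sample updates as partial_fit() over consecutive batches
    return SGDRegressor(learning_rate="constant", eta0=0.01,
                        shuffle=False, max_iter=1, tol=None, random_state=0)

rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = X @ np.array([1.0, -1.0, 2.0])

full = make_model().fit(X, y)

batched = make_model()
for start in range(0, 100, 25):                    # four batches of 25
    batched.partial_fit(X[start:start + 25], y[start:start + 25])

assert np.allclose(full.coef_, batched.coef_)
```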
I've made a version of FTRL available as part of a package I released:
https://github.com/anttttti/Wordbatch/blob/master/wordbatch/models/ftrl.pyx
It supports partial_fit and online learning, weighted features, a link function for classification/regression, and does instance-level parallelization with OpenMP prange.
This script probably won't fit the scope of current sklearn-contrib-lightning, so I've released it independently for now.