The kaggle_pbr from emanuele

blend.py failing on different datasets

Dear Emanuele,
Thank you for generously making your blending code public. It is a great help.

I need your help with blend.py. While testing your code, I could run it successfully on the "bioresponse" data. However, with two different datasets I get almost the same error:

Dataset 1:

Creating train and test sets for blending.
0 RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
Fold 0
Traceback (most recent call last):
File "blend.py", line 77, in <module>
clf.fit(X_train, y_train)
File "/usr/lib64/python2.7/site-packages/sklearn/ensemble/forest.py", line 211, in fit
X = check_array(X, dtype=DTYPE, accept_sparse="csc")
File "/usr/lib64/python2.7/site-packages/sklearn/utils/validation.py", line 392, in check_array
% (n_samples, shape_repr, ensure_min_samples))

ValueError: Found array with 0 sample(s) (shape=(0, 0)) while a minimum of 1 is required.

The error with another dataset was:

Dataset 2:

Loading data...
/usr/lib64/python2.7/site-packages/sklearn/cross_validation.py:525: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of labels for any class cannot be less than n_folds=10.
% (min_labels, self.n_folds)), Warning)
Creating train and test sets for blending.
0 RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
Fold 0
Traceback (most recent call last):
File "blend.py", line 77, in <module>
clf.fit(X_train, y_train)
File "/usr/lib64/python2.7/site-packages/sklearn/ensemble/forest.py", line 211, in fit
X = check_array(X, dtype=DTYPE, accept_sparse="csc")
File "/usr/lib64/python2.7/site-packages/sklearn/utils/validation.py", line 392, in check_array
% (n_samples, shape_repr, ensure_min_samples))

ValueError: Found array with 0 sample(s) (shape=(0, 45)) while a minimum of 1 is required.

I am using scikit-learn version 0.17.dev

In both cases, the error is raised from forest.py and validation.py. When I use random forest and GBM individually from R, I can make predictions, but running them through your code fails.

Can you please suggest where the problem is?
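
A quick sanity check along these lines, run on the loaded data before the blending loop, may help narrow this down. It is only a sketch: the data-loading code in blend.py is not shown here, so the function name `check_blend_inputs` and its arguments are hypothetical, and `n_folds=10` is taken from the warning above. It looks for the two things the messages point at: an empty feature matrix, and classes with fewer samples than folds.

```python
from collections import Counter


def check_blend_inputs(X, y, n_folds=10):
    """Return a list of human-readable problems found in X and y.

    X is a sequence of rows (list of lists or a 2-D array),
    y is a sequence of class labels. Hypothetical helper, not part of blend.py.
    """
    problems = []
    n_rows = len(X)

    # An empty X reproduces "Found array with 0 sample(s)".
    if n_rows == 0:
        problems.append("X has no rows; check the data-loading step")

    # Features and labels must line up row by row.
    if len(y) != n_rows:
        problems.append("X has %d rows but y has %d labels" % (n_rows, len(y)))

    # Stratified folds need at least n_folds samples in every class.
    for label, count in sorted(Counter(y).items()):
        if count < n_folds:
            problems.append(
                "class %r has %d sample(s), fewer than n_folds=%d"
                % (label, count, n_folds)
            )
    return problems


if __name__ == "__main__":
    # Toy data: class 1 has a single sample, so 2-fold stratification is flagged.
    X = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    y = [0, 0, 1]
    for problem in check_blend_inputs(X, y, n_folds=2):
        print(problem)
```

If the list comes back empty, the inputs pass these two checks and the cause is probably elsewhere, for example in how the folds index into the arrays.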

Error while executing blend.py

Hi, while executing blend.py, I am getting the following error:

Loading data...
Traceback (most recent call last):
File "blend.py", line 61, in <module>
GradientBoostingClassifier(learn_rate=0.05, subsample=0.5, max_depth=6, n_estimators=50)]

TypeError: __init__() got an unexpected keyword argument 'learn_rate'

I am using scikit-learn version "0.17.dev0"

Thanks
mradul
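
For what it is worth, recent scikit-learn releases spell this GradientBoostingClassifier parameter `learning_rate`, and `learn_rate` appears to be an older name that is no longer accepted. Below is a minimal sketch of the changed call, assuming the other arguments from line 61 of blend.py stay as they are. `GBM_PARAMS` and `make_gbm` are hypothetical names used only for illustration.

```python
# Same settings as in the traceback, with learn_rate renamed to learning_rate.
GBM_PARAMS = dict(learning_rate=0.05, subsample=0.5, max_depth=6, n_estimators=50)


def make_gbm():
    """Build the gradient boosting classifier with the renamed keyword."""
    # Imported lazily so the parameter dict can be inspected without scikit-learn.
    from sklearn.ensemble import GradientBoostingClassifier

    return GradientBoostingClassifier(**GBM_PARAMS)
```

In blend.py this would amount to replacing `learn_rate=0.05` with `learning_rate=0.05` in the list of classifiers.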

Can you please explain why not do CV on LR?

Can you please explain why cross-validation (CV) is not done on the LR step?
Is it possible for it to overfit without CV?
