kaggle_pbr's Introduction

kaggle_pbr

My best submission to the Kaggle competition "Predicting a Biological Response", ranked 17th out of 711 teams.

kaggle_pbr's People

Contributors

emanuele

kaggle_pbr's Issues

Error while executing blend.py

Hi, while executing blend.py, I am getting the following error:

Loading data...
Traceback (most recent call last):
File "blend.py", line 61, in
GradientBoostingClassifier(learn_rate=0.05, subsample=0.5, max_depth=6, n_estimators=50)]

TypeError: __init__() got an unexpected keyword argument 'learn_rate'

I am using scikit-learn version "0.17.dev0"

Thanks
mradul
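
A note on the likely cause: scikit-learn renamed the gradient boosting parameter `learn_rate` to `learning_rate`, and newer releases no longer accept the old name. The sketch below is not part of blend.py; it assumes that renaming the keyword is the only change needed, and the helper `rename_legacy_kwargs` and the mapping `LEGACY_TO_CURRENT` are hypothetical names introduced for illustration.

```python
# Hypothetical mapping from the keyword used in blend.py to the name
# accepted by newer scikit-learn releases.
LEGACY_TO_CURRENT = {"learn_rate": "learning_rate"}


def rename_legacy_kwargs(params):
    """Return a copy of params with legacy keyword names replaced."""
    return {LEGACY_TO_CURRENT.get(name, name): value
            for name, value in params.items()}


# The arguments from the failing line in the traceback.
gbm_params = rename_legacy_kwargs(
    {"learn_rate": 0.05, "subsample": 0.5, "max_depth": 6, "n_estimators": 50}
)
print(gbm_params)

# Build the classifier only if scikit-learn is installed.
try:
    from sklearn.ensemble import GradientBoostingClassifier
except ImportError:
    GradientBoostingClassifier = None

if GradientBoostingClassifier is not None:
    clf = GradientBoostingClassifier(**gbm_params)
```

In practice it is probably simpler to edit the call in blend.py directly and write `learning_rate=0.05` in place of `learn_rate=0.05`.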

blend.py failing on different datasets

Dear Emanuele,
Thank you for generously making your blending code public. It is a great help.

I need your help with blend.py. I could run it successfully on the "bioresponse" data, but with two other datasets I get almost the same error:

Dataset 1:

Loading data...
/usr/lib64/python2.7/site-packages/sklearn/cross_validation.py:525: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of labels for any class cannot be less than n_folds=10.
% (min_labels, self.n_folds)), Warning)

Creating train and test sets for blending.
0 RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
Fold 0
Traceback (most recent call last):
File "blend.py", line 77, in
clf.fit(X_train, y_train)
File "/usr/lib64/python2.7/site-packages/sklearn/ensemble/forest.py", line 211, in fit
X = check_array(X, dtype=DTYPE, accept_sparse="csc")
File "/usr/lib64/python2.7/site-packages/sklearn/utils/validation.py", line 392, in check_array
% (n_samples, shape_repr, ensure_min_samples))

ValueError: Found array with 0 sample(s) (shape=(0, 0)) while a minimum of 1 is required.

The error with another dataset was:

Dataset 2:

Loading data...
/usr/lib64/python2.7/site-packages/sklearn/cross_validation.py:525: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of labels for any class cannot be less than n_folds=10.
% (min_labels, self.n_folds)), Warning)
Creating train and test sets for blending.
0 RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
Fold 0
Traceback (most recent call last):
File "blend.py", line 77, in
clf.fit(X_train, y_train)
File "/usr/lib64/python2.7/site-packages/sklearn/ensemble/forest.py", line 211, in fit
X = check_array(X, dtype=DTYPE, accept_sparse="csc")
File "/usr/lib64/python2.7/site-packages/sklearn/utils/validation.py", line 392, in check_array
% (n_samples, shape_repr, ensure_min_samples))

ValueError: Found array with 0 sample(s) (shape=(0, 45)) while a minimum of 1 is required.

I am using scikit-learn version 0.17.dev.

In both cases, the problem appears in forest.py and validation.py. When I use random forest and GBM individually from R, I can make predictions, but it fails when I run them through your code.

Can you please suggest where the problem is?
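
Both logs begin with the same warning: the least populated class in y has only 1 member, while n_folds=10. The sketch below shows one check that could be run before the stratified split. It assumes the labels are available as a plain Python sequence; the function name `find_rare_classes` and the example labels are hypothetical and do not come from blend.py or from either dataset.

```python
from collections import Counter


def find_rare_classes(y, n_folds):
    """Return {label: count} for every class with fewer than n_folds members."""
    counts = Counter(y)
    return {label: n for label, n in counts.items() if n < n_folds}


# Made-up labels: classes 0 and 1 are well populated, class 2 appears once.
labels = [0] * 12 + [1] * 11 + [2]
rare = find_rare_classes(labels, n_folds=10)
print(rare)  # {2: 1}
```

If many labels turn up as rare, it may be worth checking that the label column is the one the loading code expects and that the target is categorical rather than continuous. This is a guess based on the warning text alone, not a confirmed diagnosis of the ValueError.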
