Comments (10)
Hey there, could you provide me with a sample of your training data?
I am unable to reproduce this issue.
from hyperopt-sklearn.
I just got the same error during a HyperoptEstimator search; I hope the following log helps:
2|hpsklearnBlack | job exception: Input contains NaN.
98%|█████████▊| 48/49 [00:02<?, ?trial/s, best loss=?]
2|hpsklearnBlack | Traceback (most recent call last):
2|hpsklearnBlack | File "/home/ubuntu/python/painter/Test/playground_gbm.py", line 48, in <module>
2|hpsklearnBlack | find_best_model(X_train, y_train, X_test, y_test)
2|hpsklearnBlack | File "/home/ubuntu/python/painter/Test/playground_gbm.py", line 29, in find_best_model
2|hpsklearnBlack | estimator.fit(x, y)
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hpsklearn/estimator/estimator.py", line 464, in fit
2|hpsklearnBlack | fit_iter.send(increment)
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hpsklearn/estimator/estimator.py", line 339, in fit_iter
2|hpsklearnBlack | hyperopt.fmin(_fn_with_timeout,
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hyperopt/fmin.py", line 540, in fmin
2|hpsklearnBlack | return trials.fmin(
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hyperopt/base.py", line 671, in fmin
2|hpsklearnBlack | return fmin(
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hyperopt/fmin.py", line 586, in fmin
2|hpsklearnBlack | rval.exhaust()
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hyperopt/fmin.py", line 364, in exhaust
2|hpsklearnBlack | self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hyperopt/fmin.py", line 300, in run
2|hpsklearnBlack | self.serial_evaluate()
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hyperopt/fmin.py", line 178, in serial_evaluate
2|hpsklearnBlack | result = self.domain.evaluate(spec, ctrl)
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hyperopt/base.py", line 892, in evaluate
2|hpsklearnBlack | rval = self.fn(pyll_rval)
2|hpsklearnBlack | File "/home/ubuntu/python/painter/venv/lib/python3.10/site-packages/hpsklearn/estimator/estimator.py", line 311, in _fn_with_timeout
2|hpsklearnBlack | raise fn_rval[1]
2|hpsklearnBlack | ValueError: Input contains NaN.
According to the result of df.isnull().sum() there is no NaN in the data; hence, the NaN error must be introduced during parameter injection.
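One thing worth checking: df.isnull().sum() only detects NaN, while scikit-learn's input validation also rejects infinite values. A quick sketch (toy DataFrame with made-up column names) showing how an inf can slip past an isnull() check:

```python
import numpy as np
import pandas as pd

# Toy frame for illustration: isnull() reports zero missing values,
# yet the inf would still fail scikit-learn's input validation.
df = pd.DataFrame({"a": [1.0, 2.0, np.inf], "b": [0.1, 0.2, 0.3]})

print(df.isnull().sum().sum())           # 0 -> no NaN detected
print(np.isfinite(df.to_numpy()).all())  # False -> inf is present
```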
@RaistlinTAO Could I get a snippet of your code please?
I understand that your data does not contain NaN values, but I need to see exactly which code is causing the issue.
Of course, happy to help:

def find_best_model(x, y, test_x, test_y):
    estimator = HyperoptEstimator(
        regressor=gradient_boosting_regressor("T"),
        algo=tpe.suggest,
        max_evals=800,
        trial_timeout=300)
    estimator.fit(x, y)
    print('HyperoptEstimator Score: ')
    print(estimator.score(test_x, test_y))
    print('Best Model: ')
    print(estimator.best_model())

find_best_model(X_train, y_train, X_test, y_test)
Update:
With the same code and the same settings, it bypasses the error after another 3-5 tries.
I think it is related to the hyperparameter combination; sometimes the search simply skips the bad combination or preprocessing step.
Thanks a lot @RaistlinTAO, that is what I was thinking as well. I've noticed this behaviour before.
Unfortunately, I cannot reproduce the error in our testing environment, so it is a difficult issue to fix.
Your code snippet will help me pin down the problem and focus my testing.
I will work on a fix for this. For the time being, retrying a few times should bypass the error.
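If retrying by hand gets tedious, a small wrapper can automate the workaround (fit_with_retries is a hypothetical helper, not part of hyperopt-sklearn; it assumes the failure depends on the random hyperparameter draw, as described above):

```python
def fit_with_retries(estimator, X, y, max_retries=5):
    """Retry estimator.fit() when a bad hyperparameter draw raises
    the 'Input contains NaN' ValueError. Stopgap only; any other
    ValueError is re-raised immediately."""
    last_exc = None
    for _ in range(max_retries):
        try:
            estimator.fit(X, y)
            return estimator
        except ValueError as exc:
            if "NaN" not in str(exc):
                raise  # unrelated error: surface it right away
            last_exc = exc
    raise last_exc
```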
After several attempts at debugging in a local environment, I have located the root cause.
In sklearn/ensemble/_gb_losses.py, the relevant statements are:

def _update_terminal_region(...):
    ...
    diff_minus_median = diff - median
    ...

# and

def update_terminal_regions(...):
    raw_predictions[:, k] += learning_rate * tree.predict(X).ravel()
Sample output:
E:\Projects\hyperopt-sklearn\venv\lib\site-packages\sklearn\ensemble\_gb_losses.py:231: RuntimeWarning: overflow encountered in square
* np.sum(sample_weight * ((y - raw_predictions.ravel()) ** 2))
E:\Projects\hyperopt-sklearn\venv\lib\site-packages\sklearn\ensemble\_gb_losses.py:231: RuntimeWarning: overflow encountered in square
* np.sum(sample_weight * ((y - raw_predictions.ravel()) ** 2))
E:\Projects\hyperopt-sklearn\venv\lib\site-packages\sklearn\ensemble\_gb_losses.py:288: RuntimeWarning: overflow encountered in multiply
raw_predictions[:, k] += learning_rate * tree.predict(X).ravel()
E:\Projects\hyperopt-sklearn\venv\lib\site-packages\sklearn\ensemble\_gb_losses.py:288: RuntimeWarning: invalid value encountered in add
raw_predictions[:, k] += learning_rate * tree.predict(X).ravel()
97%|█████████▋| 32/33 [00:06<?, ?trial/s, best loss=?]
job exception: Input contains NaN.
The two different warnings indicate that diff/median/learning_rate becomes NaN under certain circumstances.
I'll just leave this here since my hands are tied at the moment.
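The warning cascade above can be reproduced in miniature: squaring a huge residual overflows float64 to inf, and subsequent inf arithmetic yields NaN (the value here is made up purely for illustration):

```python
import numpy as np

with np.errstate(over="ignore", invalid="ignore"):
    residual = np.float64(1e200)  # an extreme residual, for illustration
    squared = residual ** 2       # overflows float64 -> inf
    diff = squared - squared      # inf - inf -> nan

print(np.isinf(squared), np.isnan(diff))  # True True
```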
Hello,
I get the same issue. Is there a fix for this one?
self.estim = HyperoptEstimator(
    regressor=any_regressor("my_clf"),
    preprocessing=[],
    algo=tpe.suggest,
    max_evals=65,
    trial_timeout=120,
)

Stack trace:
100%|██████████| 1/1 [00:00<00:00, 5.23trial/s, best loss: 0.05266970464190446]
100%|██████████| 2/2 [00:00<00:00, 2.18trial/s, best loss: 0.033195277678000346]
100%|██████████| 3/3 [00:00<00:00, 3.74trial/s, best loss: 0.033195277678000346]
100%|██████████| 4/4 [00:02<00:00, 2.30s/trial, best loss: 0.03245292544638456]
100%|██████████| 5/5 [00:00<00:00, 16.75trial/s, best loss: 0.03245292544638456]
100%|██████████| 6/6 [00:00<00:00, 13.88trial/s, best loss: 0.03245292544638456]
100%|██████████| 7/7 [00:00<00:00, 1.38trial/s, best loss: 0.03245292544638456]
100%|██████████| 8/8 [00:01<00:00, 1.57s/trial, best loss: 0.03245292544638456]
100%|██████████| 9/9 [00:00<00:00, 13.97trial/s, best loss: 0.03245292544638456]
100%|██████████| 10/10 [00:00<00:00, 16.76trial/s, best loss: 0.03245292544638456]
100%|██████████| 11/11 [00:00<00:00, 7.96trial/s, best loss: 0.03245292544638456]
100%|██████████| 12/12 [00:00<00:00, 1.48trial/s, best loss: 0.03245292544638456]
100%|██████████| 13/13 [00:00<00:00, 2.03trial/s, best loss: 0.030216599846460745]
100%|██████████| 14/14 [00:00<00:00, 16.23trial/s, best loss: 0.030216599846460745]
100%|██████████| 15/15 [00:00<00:00, 2.10trial/s, best loss: 0.030216599846460745]
100%|██████████| 16/16 [00:00<00:00, 17.83trial/s, best loss: 0.030216599846460745]
100%|██████████| 17/17 [00:00<00:00, 1.30trial/s, best loss: 0.030216599846460745]
100%|██████████| 18/18 [00:00<00:00, 15.57trial/s, best loss: 0.030216599846460745]
100%|██████████| 19/19 [00:00<00:00, 1.66trial/s, best loss: 0.030216599846460745]
100%|██████████| 20/20 [00:00<00:00, 17.59trial/s, best loss: 0.030216599846460745]
100%|██████████| 21/21 [00:00<00:00, 10.58trial/s, best loss: 0.030216599846460745]
100%|██████████| 22/22 [00:00<00:00, 7.55trial/s, best loss: 0.030216599846460745]
100%|██████████| 23/23 [00:02<00:00, 2.01s/trial, best loss: 0.030216599846460745]
100%|██████████| 24/24 [02:00<00:00, 120.18s/trial, best loss: 0.030216599846460745]
100%|██████████| 25/25 [00:02<00:00, 2.64s/trial, best loss: 0.030216599846460745]
100%|██████████| 26/26 [00:00<00:00, 7.15trial/s, best loss: 0.030216599846460745]
100%|██████████| 27/27 [00:01<00:00, 1.15s/trial, best loss: 0.030216599846460745]
100%|██████████| 28/28 [00:00<00:00, 1.77trial/s, best loss: 0.030216599846460745]
100%|██████████| 29/29 [00:00<00:00, 10.25trial/s, best loss: 0.030216599846460745]
100%|██████████| 30/30 [00:09<00:00, 9.18s/trial, best loss: 0.030216599846460745]
100%|██████████| 31/31 [00:00<00:00, 10.90trial/s, best loss: 0.030216599846460745]
100%|██████████| 32/32 [00:01<00:00, 1.49s/trial, best loss: 0.027541941347137833]
97%|█████████▋| 32/33 [00:00<?, ?trial/s, best loss=?]
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Well, my suggestions are:
- Use Optuna (https://github.com/optuna/optuna); it has its own issues, but they are rare.
- Use GridSearchCV, keeping the time/resource consumption in mind.
Pick whichever solution fits your needs, and have a good one.
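For the GridSearchCV route, a minimal sketch with toy data and a deliberately tiny grid (a realistic search space would be much larger and correspondingly slower, which is exactly the trade-off mentioned above):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Toy data standing in for the real training set.
X, y = make_regression(n_samples=100, n_features=5, random_state=0)

grid = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"learning_rate": [0.05, 0.1], "n_estimators": [50, 100]},
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_)
```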
I think this has something to do with the gradient. Try gradient clipping to avoid the problem when an extreme hyperparameter combination comes up; clipping prevents exploding gradients, which can produce the NaN values.
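scikit-learn's GradientBoostingRegressor does not expose a gradient-clipping option, so the following is only a conceptual sketch of the idea (clipped_update is a made-up function, not a library API). In practice, switching to loss="huber", which scikit-learn does support, similarly bounds the influence of extreme residuals:

```python
import numpy as np

def clipped_update(raw_predictions, tree_predictions, learning_rate, clip=1e3):
    """Conceptual sketch: cap each boosting update so an extreme
    learning_rate cannot push the running predictions toward inf."""
    update = np.clip(learning_rate * tree_predictions, -clip, clip)
    return raw_predictions + update

# Even with a wild tree output, the running predictions stay finite.
preds = clipped_update(np.zeros(3), np.array([1e300, -1e300, 2.0]), 10.0)
print(np.isfinite(preds).all())  # True
```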