Comments (4)
I used Anaconda Prompt to run the script, and I suspected the error occurred because I wasn't on an Ubuntu system.
I just installed Ubuntu and ran the command both through Neptune and locally, but the script got stuck generating installments_payments_handcrafted in both cases.
The messages are like this:
2018-08-23 16:14:57 steppy >>> Step installment_payments_hand_crafted, adapting inputs...
2018-08-23 16:14:57 steppy >>> Step installment_payments_hand_crafted, fitting and transforming...
0%| | 0/4.0 [00:00<?, ?it/s]
/home/mrrolling/home-credit/lib/python3.5/site-packages/numpy/lib/function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile
  interpolation=interpolation)
/home/mrrolling/home-credit/lib/python3.5/site-packages/numpy/lib/nanfunctions.py:1018: RuntimeWarning: Mean of empty slice
  return np.nanmean(a, axis, out=out, keepdims=keepdims)
From this point on no more messages appear, and no more transformer files are generated in the experiment directory. I'm wondering why the script got stuck. I can't find a solution to this problem on Google.
from open-solution-home-credit.
The installments_payments_handcrafted files take a very long time to generate. How long have you been waiting for this process? How many cores did you use?
I've been waiting for about an hour. When I ran in dev_mode (using only 1000 samples), the whole process finished without issue.
Step installment_payments_hand_crafted, fitting and transforming... 0%| | 0/4.0 [00:00<?, ?it/s]
This progress bar filled up and the next step, pos_cash_balance, started.
However, when I ran the script on the whole sample, the CLI got stuck at the progress bar, and soon these warning messages appeared:
/home/mrrolling/home-credit/lib/python3.5/site-packages/numpy/lib/function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile
  interpolation=interpolation)
/home/mrrolling/home-credit/lib/python3.5/site-packages/numpy/lib/nanfunctions.py:1018: RuntimeWarning: Mean of empty slice
  return np.nanmean(a, axis, out=out, keepdims=keepdims)
From this point on, no more messages came out of the CLI. I'm using a 4-core CPU; I don't know whether all cores were used during training, but Task Manager showed 50 percent CPU usage. I also had 8 GB of free physical memory, and I don't know if that's the issue: 90 percent of the memory was in use when it reached the installments_payments_handcrafted step.
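For context on the second message in the log: "Mean of empty slice" is a RuntimeWarning that NumPy emits when np.nanmean is asked for the mean of values that are all NaN. It returns nan and execution continues, so the warning on its own does not stop the script. A minimal sketch that reproduces it (the helper name nanmean_with_warnings is made up for illustration and is not part of the project):

```python
import warnings

import numpy as np


def nanmean_with_warnings(values):
    """Return (mean, warning messages) for np.nanmean over `values`.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = np.nanmean(np.asarray(values, dtype=float))
    return result, [str(w.message) for w in caught]


# A group whose values are all missing: NumPy warns and returns nan,
# but the program keeps running.
mean, messages = nanmean_with_warnings([np.nan, np.nan])
print(mean, messages)
```

If the warnings are the only output before the silence, the step is more likely still computing (or short on memory) than crashed.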
Hi there.
It takes quite a while to generate all the features.
You can comment things out in the feature extractor in pipeline_blocks.py. The slowest parts are installments and POS cash.
You can also run it in the cloud on a stronger machine. Something like 12 cores and 32 GB of memory should get you there in a few hours.
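To illustrate why the core count matters here, below is a minimal sketch of per-group feature aggregation spread across worker processes. It is not the project's code: aggregate_group, aggregate_all, and the sample data are hypothetical, and the real pipeline may split the work differently.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from statistics import mean


def aggregate_group(item):
    """Compute a few per-customer statistics (hypothetical features)."""
    key, payments = item
    return key, {"count": len(payments), "mean": mean(payments), "max": max(payments)}


def aggregate_all(groups, workers=None):
    """Aggregate every group, using up to `workers` processes.

    workers=None means "use every core the OS reports".
    """
    workers = workers or os.cpu_count() or 1
    items = list(groups.items())
    if workers == 1:
        # Serial path: no process start-up cost, handy for small samples.
        return dict(map(aggregate_group, items))
    chunksize = max(1, len(items) // workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(aggregate_group, items, chunksize=chunksize))


if __name__ == "__main__":
    # Made-up sample data standing in for installment payments per customer.
    payments_by_customer = {"a": [100.0, 300.0], "b": [50.0]}
    print("cores reported by the OS:", os.cpu_count())
    print(aggregate_all(payments_by_customer, workers=1))
```

Leaving out workers lets the function use all reported cores. Each worker process holds its own copy of the data it is handed, so memory use grows with the number of workers, which is worth watching on an 8 GB machine.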