
Comments (4)

MRrollingJerry avatar MRrollingJerry commented on May 27, 2024

I first ran the script from the Anaconda Prompt and suspected the error occurred because I wasn't on Ubuntu.
I have now installed Ubuntu and run the command both through Neptune and locally, but in both cases the script gets stuck while generating installments_payments_handcrafted.

The messages are like this:
2018-08-23 16:14:57 steppy >>> Step installment_payments_hand_crafted, adapting inputs...
2018-08-23 16:14:57 steppy >>> Step installment_payments_hand_crafted, fitting and transforming...
0%| | 0/4.0 [00:00<?, ?it/s]
/home/mrrolling/home-credit/lib/python3.5/site-packages/numpy/lib/function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile
  interpolation=interpolation)
/home/mrrolling/home-credit/lib/python3.5/site-packages/numpy/lib/nanfunctions.py:1018: RuntimeWarning: Mean of empty slice
  return np.nanmean(a, axis, out=out, keepdims=keepdims)

After this point no more messages appear, and no more transformer files are generated in the experiment directory. Why is the script stuck? I couldn't find a solution to this problem on Google.
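For what it's worth, both messages above are Python RuntimeWarnings rather than exceptions, so on their own they should not stop the script. Here is a minimal sketch of that behaviour using only the standard `warnings` module. The function `mean_ignoring_missing` is a hypothetical stand-in for the NumPy call in the log, not code from the project.

```python
import warnings


def mean_ignoring_missing(values):
    """Hypothetical stand-in for a NaN-aware mean: warns when nothing is left."""
    present = [v for v in values if v is not None]
    if not present:
        # Same category and message as in the log above, emitted by hand here.
        warnings.warn("Mean of empty slice", RuntimeWarning)
        return float("nan")
    return sum(present) / len(present)


# Default behaviour: the warning is reported and execution simply continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = mean_ignoring_missing([None, None])

print("warnings seen:", [str(w.message) for w in caught])
print("result:", result)

# Debugging behaviour: promote RuntimeWarning to an exception so the
# traceback shows which call produced it.
with warnings.catch_warnings():
    warnings.simplefilter("error", RuntimeWarning)
    try:
        mean_ignoring_missing([None, None])
        raised = False
    except RuntimeWarning:
        raised = True

print("raised as error:", raised)
```

The same promotion can be applied to a whole run with Python's `-W error::RuntimeWarning` command-line option, which should make the first such warning fail with a traceback.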

from open-solution-home-credit.

Ninoko avatar Ninoko commented on May 27, 2024

The installments_payments_handcrafted features take a very long time to generate. How long have you been waiting for this process? How many cores are you using?


MRrollingJerry avatar MRrollingJerry commented on May 27, 2024

I've been waiting for about an hour. When I ran in dev_mode (using only 1000 samples), the whole process finished without issue.
Step installment_payments_hand_crafted, fitting and transforming... 0%| | 0/4.0 [00:00<?, ?it/s]

This progress bar filled up and the next step, pos_cash_balance, started.

However, when I ran the script on the full dataset, the CLI got stuck at the progress bar, and soon these warnings appeared:

/home/mrrolling/home-credit/lib/python3.5/site-packages/numpy/lib/function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile
  interpolation=interpolation)
/home/mrrolling/home-credit/lib/python3.5/site-packages/numpy/lib/nanfunctions.py:1018: RuntimeWarning: Mean of empty slice
  return np.nanmean(a, axis, out=out, keepdims=keepdims)

After this point no more messages came from the CLI. I'm using a 4-core CPU. I don't know whether all cores were used during training, but the task manager showed 50 percent CPU usage. I also had 8 GB of free physical memory, and I don't know whether that is the issue. About 90 percent of the memory was in use when the run reached the installments_payments_handcrafted step.
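To state the machine's resources precisely, a small standard-library sketch like the one below can report the logical core count and total physical memory. It reports what is available, not what the training script actually uses. The helper name `machine_summary` is made up for illustration, and the `os.sysconf` keys are assumed to exist only on Linux-like systems such as Ubuntu.

```python
import os


def machine_summary():
    """Return (logical core count, total RAM in GiB or None if unavailable)."""
    cores = os.cpu_count()  # may be None if it cannot be determined
    total_gib = None
    try:
        # These sysconf names are available on Linux; elsewhere we fall through.
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
        total_gib = page_size * page_count / 1024 ** 3
    except (AttributeError, ValueError, OSError):
        pass
    return cores, total_gib


cores, total_gib = machine_summary()
print("logical cores:", cores)
if total_gib is None:
    print("total memory: unavailable on this platform")
else:
    print("total memory: {:.1f} GiB".format(total_gib))
```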


jakubczakon avatar jakubczakon commented on May 27, 2024

Hi there.

It takes quite a while to generate all the features.

You can comment things out in the feature extractor in pipeline_blocks.py. The slowest bits are installments and pos cash.

You can also run it in the cloud on a stronger machine. Something like 12 cores and 32 GB of memory should get you there in a few hours.

