kaggle-homecreditcomp's Introduction

Kaggle-HomeCreditComp

Repo to save Kaggle notebooks and discussion

kaggle-homecreditcomp's People

Contributors

baobach

kaggle-homecreditcomp's Issues

pmtnum_8L

Number of payments made for the previous application.

tenor_203L

Number of instalments in the previous application. Ordinary statistical aggregates (for example mean, min, max) can be computed here.
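
The statistical aggregates mentioned above could be sketched as follows. This is a minimal illustration, not the repo's actual code: the `case_id` key and the toy values are assumptions, and the choice of aggregates is only one option.

```python
import pandas as pd

# Toy data: one row per previous application.
# `case_id` is an assumed applicant key; the values are made up.
prev = pd.DataFrame({
    "case_id": [1, 1, 2, 2, 2],
    "tenor_203L": [12.0, 24.0, 6.0, None, 18.0],
})

# Collapse to one row per applicant; pandas skips NaN in these reductions.
tenor_stats = (
    prev.groupby("case_id")["tenor_203L"]
    .agg(["mean", "min", "max", "count"])
    .add_prefix("tenor_203L_")
    .reset_index()
)
print(tenor_stats)
```

The same pattern would apply to other numeric columns such as annuity_853A or downpmt_134A.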

isbidproduct_390L

Flag indicating whether the product in the previous application was a cross-sell. False is the majority value, so missing values can be filled with False.
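
A minimal sketch of the fill described above, assuming the column is loaded as a nullable boolean; the sample values are made up.

```python
import pandas as pd

# Made-up sample of the flag with gaps, stored as pandas' nullable boolean dtype.
flag = pd.Series([True, None, False, None], dtype="boolean", name="isbidproduct_390L")

# Fill gaps with the majority value (False), then convert to a plain bool column.
filled = flag.fillna(False).astype(bool)
print(filled.tolist())
```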

postype_4733339M

Type of point of sale. Here, we can consider one-hot encoding.
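
One-hot encoding comes up for several columns in these notes. A small sketch with `pandas.get_dummies`, using made-up category codes and keeping missing values as their own column (that last choice is an assumption, not something the notes specify):

```python
import pandas as pd

# Made-up category codes; the real values are anonymised strings.
prev = pd.DataFrame({"postype_4733339M": ["P1", "P2", None, "P1"]})

# One indicator column per category, plus one for missing values.
dummies = pd.get_dummies(
    prev["postype_4733339M"], prefix="postype", dummy_na=True, dtype="int8"
)
print(dummies)
```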

education_1138M

Applicant's education level from their previous application. The values appear to have been encrypted, so one-hot encoding is the only practical way to encode them.

dateactivated_425D

Contract activation date of the applicant's previous application. This may not be of much use?

annuity_853A

Monthly annuity for previous applications. This variable can be handled with conventional statistical methods.

credacc_cards_status_52L

The status of the previous credit card. A missing value may mean there was no card, but this cannot be confirmed, so consider filling it with 'UNCONFIRMED'.

childnum_21L

Number of children in the previous application. Since first-time credit card users may be young or have little money, fill missing values with 0.
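
The two fill rules above (credacc_cards_status_52L and childnum_21L) could be applied in one call. A sketch with made-up status labels:

```python
import pandas as pd

prev = pd.DataFrame({
    "credacc_cards_status_52L": ["ACTIVE", None, "CANCELLED"],  # made-up labels
    "childnum_21L": [2.0, None, 0.0],
})

# Per-column fill values: a sentinel category for the status, 0 for the child count.
filled = prev.fillna({
    "credacc_cards_status_52L": "UNCONFIRMED",
    "childnum_21L": 0,
})
print(filled)
```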

downpmt_134A

Previous application down payment amount. Numerical variable.

credacc_minhisbal_90A

Minimum historical balance of previous credit accounts. If the balance is negative, there is a possibility of default.

profession_152M

Profession of the client during their previous loan application. Here, 'a55475b1' accounts for the majority.

cancelreason_3545846M

Application cancellation reason. There are many categories here; apart from 'a55475b1', they are all strings of the form 'P94_109_143'. One-hot encoding the full values would consume a lot of memory, so consider one-hot encoding only the first letter.
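
A sketch of the first-letter idea, assuming the codes look as described above ('P11_56_131' is a made-up value in the same shape):

```python
import pandas as pd

reason = pd.Series(
    ["a55475b1", "P94_109_143", "P11_56_131", None],
    name="cancelreason_3545846M",
)

# Keep only the first character of each code; missing values stay missing.
first_letter = reason.str[0]

# One-hot encode the reduced categories instead of the full strings.
dummies = pd.get_dummies(first_letter, prefix="cancelreason", dtype="int8")
print(dummies)
```

With codes of this shape the encoding collapses to very few columns, which saves memory but also discards most of the detail in the original categories.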

conts_type_509L

Type of contact information the applicant left in a previous credit card application. Missing values may mean that no contact information was left; this can be treated as a new state.

dtlastpmt_581D

Date of last payment made by the applicant. This may not be of much use?

credacc_transactions_402L

Number of transactions made with the previous credit account of the applicant. We need each applicant's most recent record here.
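
Picking the most recent record per applicant might look like this. The `case_id` key and the use of creationdate_885D as the ordering column are assumptions made for illustration.

```python
import pandas as pd

prev = pd.DataFrame({
    "case_id": [1, 1, 2],  # assumed applicant key
    "creationdate_885D": pd.to_datetime(["2019-01-05", "2019-06-01", "2018-03-10"]),
    "credacc_transactions_402L": [3.0, 7.0, 1.0],
})

# Sort oldest to newest within each applicant, then keep the last row of each.
latest = (
    prev.sort_values(["case_id", "creationdate_885D"])
    .drop_duplicates("case_id", keep="last")
    .reset_index(drop=True)
)
print(latest)
```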

employedfrom_700D

Employment start date from the previous application. Here, we can consider the number of people in each state or perform one-hot encoding.

isdebitcard_527L

Flag indicating whether the product applied for in the previous application was a debit card. False is the majority value, so missing values can be filled with False.

credtype_587L

Credit type of previous application. Here we use regular one-hot encoding.

byoccupationinc_3656910L

Applicant's income from previous applications. Income level is related to whether there is a default, and missing values here can be filled with the median.
Important!
This needs further investigation: another statistical property, function, or ML model may be a better way to fill the gaps. The median is not optimal, since this is likely a critical feature.
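
As a baseline for that investigation, the median fill can be paired with a missing-value indicator so a model can still see which rows were imputed. The indicator is a common complement and an assumption here, not something these notes prescribe; the income values are made up.

```python
import pandas as pd

income = pd.Series(
    [30000.0, None, 50000.0, 40000.0, None],
    name="byoccupationinc_3656910L",
)

features = pd.DataFrame({
    # 1 where the original value was missing, 0 otherwise.
    "byoccupationinc_missing": income.isna().astype("int8"),
    # Baseline imputation: replace gaps with the median of the observed values.
    "byoccupationinc_filled": income.fillna(income.median()),
})
print(features)
```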

district_544M

District of the address used in the previous loan application. Could we consider the wealth gap between districts?

inittransactioncode_279L

Type of the initial transaction made in the previous application of the client. Here we can use one-hot-encoding.

status_219L

Previous application status. Here, we can consider one-hot encoding.

actualdpd_943P

Number of days overdue on the previous contract; the more days, the more severe the delinquency. Rows with 0 days overdue may need to be classified separately as Class 1, or the days can be bucketed into different levels. For filling in missing values, "ffill" can be considered.
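
Bucketing the overdue days into levels could be done with `pandas.cut`. The bin edges and labels below are hypothetical and chosen only to show the mechanics, with 0 days kept as its own class.

```python
import pandas as pd

dpd = pd.Series([0, 3, 45, 120, None], name="actualdpd_943P")

# Hypothetical edges: (-1, 0], (0, 30], (30, 90], (90, inf).
levels = pd.cut(
    dpd,
    bins=[-1, 0, 30, 90, float("inf")],
    labels=["none", "1-30", "31-90", "90+"],
)
print(levels)
```

Missing values stay missing after the cut, so any fill strategy can still be applied before or after bucketing.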

revolvingaccount_394A

Revolving account that was present in the applicant's previous application. This does not seem very useful.

outstandingdebt_522A

Amount of outstanding debt on the client's previous application. It can be divided into two categories based on whether it is 0 or not.
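
A sketch of the zero versus non-zero split with made-up amounts. How to treat missing values is an open choice; here they land in the "no debt" class because a comparison with NaN evaluates to False.

```python
import pandas as pd

debt = pd.Series([0.0, 1500.0, None, 0.0], name="outstandingdebt_522A")

# 1 if there is any outstanding debt, 0 otherwise (NaN > 0 is False, so NaN -> 0).
has_debt = debt.gt(0).astype("int8")
print(has_debt.tolist())
```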

dtlastpmtallstes_3545839D

Date of the applicant's last payment. Is there any difference between this and the earlier feature dtlastpmt_581D? If neither column is missing and the values still differ for some rows, why is that?
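
One way to look into that question is to isolate the rows where both dates are present but disagree. The dates below are made up.

```python
import pandas as pd

a, b = "dtlastpmt_581D", "dtlastpmtallstes_3545839D"
prev = pd.DataFrame({
    a: pd.to_datetime(["2019-01-10", "2019-02-01", None, "2019-03-15"]),
    b: pd.to_datetime(["2019-01-10", "2019-02-20", "2019-02-01", None]),
})

# Keep rows where neither date is missing, then the subset where they differ.
both = prev.dropna(subset=[a, b])
differs = both[both[a] != both[b]]

# Size of the disagreement in days, to see whether it is systematic.
gap_days = (differs[b] - differs[a]).dt.days
print(len(both), len(differs), gap_days.tolist())
```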

currdebt_94A

Previous application's current debt. People without debt are less likely to default, and we also need each applicant's latest record here.

creationdate_885D

Date when the previous application was created. This can be compared to the approval date of the previous application. If approval took unusually long, could the applicant have a record of breaching a contract?
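
The comparison with the approval date (approvaldate_319D) can be turned into a delay feature. A sketch with made-up dates; the `approval_delay_days` column name is hypothetical.

```python
import pandas as pd

prev = pd.DataFrame({
    "creationdate_885D": pd.to_datetime(["2019-01-01", "2019-01-05", "2019-02-01"]),
    "approvaldate_319D": pd.to_datetime(["2019-01-03", None, "2019-02-01"]),
})

# Days from creation to approval; stays NaN where there is no approval date.
prev["approval_delay_days"] = (
    prev["approvaldate_319D"] - prev["creationdate_885D"]
).dt.days
print(prev)
```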

familystate_726L

Family state in the applicant's previous application. We can consider a living-alone flag or simple one-hot encoding here.

credacc_maxhisbal_375A

Maximal historical balance of previous credit account. The higher the balance, the less likely a default?

cacccardblochreas_147M

Card blocking reason. A credit card is usually blocked when the bank freezes it, for example after missed repayments or improper use of the card. The string values are not explained anywhere online and are probably internal codes. Could a missing value mean that the card was never frozen, or that there was no credit card?

rejectreason_755M

Reason for previous application rejection. Here, 'a55475b1' accounts for the majority.

approvaldate_319D

Approval date of the previous application. A missing value may mean there was no earlier application, so the date column can be split into two categories: present or missing.

credacc_status_367L

Account status of previous credit applications. Here, we can consider one-hot encoding.
