Repo to save Kaggle notebooks and discussion
baobach / kaggle-homecreditcomp
Number of payments made for the previous application.
Credit card credit limit provided for previous applications.
Number of instalments in the previous application. Ordinary summary statistics can be computed as features here.
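As a minimal sketch of such statistical features, a group-by aggregation could produce one row per applicant. The column names `case_id` and `num_instalments` and all values are placeholders for illustration, not the competition's actual columns.

```python
import pandas as pd

# Hypothetical previous-application rows: several per applicant (case_id).
prev = pd.DataFrame({
    "case_id": [1, 1, 2, 2, 2],
    "num_instalments": [12, 24, 6, 6, 18],
})

# Collapse to one row per applicant with simple summary statistics.
agg = (
    prev.groupby("case_id")["num_instalments"]
    .agg(["mean", "max", "min"])
    .add_prefix("num_instalments_")
    .reset_index()
)
```

The same pattern would apply to the other numerical columns in these notes, such as the annuity or down payment amounts.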
Flag for determining if the product is a cross-sell in previous applications. False accounts for the majority, and missing values can be filled with False.
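Filling a missing flag with False might look like the sketch below. It assumes the flag is held in pandas' nullable `boolean` dtype; the column name `is_cross_sell` is hypothetical.

```python
import pandas as pd

# Hypothetical flag column with missing entries (nullable boolean dtype).
flag = pd.Series([True, None, False, None], dtype="boolean", name="is_cross_sell")

# Treat a missing flag as the majority value (False), then convert to plain bool.
flag_filled = flag.fillna(False).astype(bool)
```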
Type of point of sale. Here, we can consider one-hot encoding.
Applicant's education level from their previous application. The values appear to be hashed codes, so one-hot encoding is the only practical way to encode them.
Actual balance on credit account. This balance needs to be updated.
Contract activation date of the applicant's previous application. This may not be of much use?
Monthly annuity for previous applications. Conventional statistical features can be derived from this variable.
The status of the previous credit card. A missing value here may mean that there is no card, but we cannot confirm this, so we can consider filling it with 'UNCONFIRMED'.
Client's main income amount in their previous application.
Number of children in the previous application. Considering that first-time credit card users may be young or have little money, fill missing values with 0.
Previous application down payment amount. Numerical variable.
Minimum historical balance of previous credit accounts. If the balance is negative, there is a possibility of default.
Profession of the client during their previous loan application. Here, 'a55475b1' accounts for the majority.
Reason for rejection of the client's previous application. Here, 'a55475b1' accounts for the majority.
Application cancellation reason. There are many categories here; apart from 'a55475b1', they are all strings of the form 'P94_109_143'. One-hot encoding every category would consume a lot of memory, so consider one-hot encoding only the first letter.
Contact information the applicant provided in a previous credit card application. Missing values may mean that no contact information was provided; this can be treated as a separate category.
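A minimal sketch of the first-letter encoding for the cancellation reason follows. The column name `cancel_reason` is hypothetical, and the value 'P11_20_30' is made up to match the shape described above.

```python
import pandas as pd

# Hypothetical cancellation-reason column with high-cardinality codes.
reason = pd.Series(
    ["a55475b1", "P94_109_143", "P11_20_30", "a55475b1"],
    name="cancel_reason",
)

# Keep only the first character, then one-hot encode that reduced category.
first_letter = reason.str[0]
dummies = pd.get_dummies(first_letter, prefix="cancel_reason", dtype=int)
```

This yields one column per distinct first letter instead of one column per distinct code, at the cost of losing the detail within each letter group.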
Date of last payment made by the applicant. This may not be of much use?
Date of first instalment in the previous application.
Number of transactions made with the applicant's previous credit account. We need each applicant's most recent record here.
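One way to keep only the most recent record per applicant is sketched below. It assumes a date column is available to order the records; `case_id`, `record_date`, and `num_transactions` are placeholder names.

```python
import pandas as pd

# Hypothetical history: several records per applicant, in arbitrary order.
prev = pd.DataFrame({
    "case_id": [1, 1, 2, 2],
    "record_date": pd.to_datetime(
        ["2019-01-05", "2019-03-01", "2018-07-10", "2018-02-20"]
    ),
    "num_transactions": [3, 8, 1, 5],
})

# Sort by date, then keep the last (newest) row for each applicant.
latest = (
    prev.sort_values("record_date")
    .drop_duplicates("case_id", keep="last")
    .sort_values("case_id")
    .reset_index(drop=True)
)
```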
Employment start date from the previous application. Here, we can consider the number of people in each state or perform one hot encoding.
Previous application flag indicating if product being applied for is a debit card. False accounts for the majority, and missing values can be filled with False.
Credit type of previous application. Here we use regular one hot encoding.
Applicant's income from previous applications. Income level is related to default, and missing values here can be filled with the median.
Important!
This needs further investigation: another statistical property, a function, or an ML model may be a better way to fill the missing values. The median is not optimal, since this is likely a critical feature.
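As a baseline to compare better imputation methods against, the median fill could be paired with a missing-value indicator so that a model can still see which rows were imputed. This is only a sketch of the baseline, not the improved approach the note asks for; the column name `income` and the values are made up.

```python
import pandas as pd

# Hypothetical income column with missing entries.
income = pd.Series([30000.0, None, 52000.0, 41000.0, None], name="income")

# Record which rows were missing before imputing.
income_missing = income.isna().astype(int)

# Baseline imputation: fill with the median of the observed values.
income_filled = income.fillna(income.median())
```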
District of the address used in the previous loan application. Could we account for the wealth gap between regions?
Type of the initial transaction made in the previous application of the client. Here we can use one-hot-encoding.
Previous application status. Here, we can consider one-hot encoding.
Number of days overdue on the previous contract; the more days overdue, the more severe the delinquency. Rows with 0 overdue days may need to be classified separately as Class 1, or the days can be binned into different levels. For filling in missing values, "ffill" can be considered.
Revolving account that was present in the applicant's previous application. This feature does not seem very useful.
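Grouping overdue days into levels could be sketched with `pd.cut`. The bin edges and labels below are illustrative assumptions, not thresholds taken from the competition, and `days_past_due` is a placeholder name.

```python
import pandas as pd

# Hypothetical days-past-due values.
dpd = pd.Series([0, 3, 45, 120, 0], name="days_past_due")

# Assumed bin edges: zero gets its own level, the rest are split by severity.
bins = [-1, 0, 30, 90, float("inf")]
labels = ["none", "1-30", "31-90", "90+"]

# Intervals are right-inclusive, so 0 falls in (-1, 0] and 30 in (0, 30].
dpd_level = pd.cut(dpd, bins=bins, labels=labels)
```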
Amount of outstanding debt on the client's previous application. It can be divided into two categories based on whether it is 0 or not.
Date of the applicant's last payment. Is there any difference between this and the earlier feature dtlastpmt_581D? If both columns are non-missing and some values still differ, why is that?
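A quick check of how often the two last-payment dates disagree could look like this sketch. `dtlastpmt_581D` is the column named in these notes; `last_payment_other` is a placeholder for the second date column, and the dates are made up.

```python
import pandas as pd

# Hypothetical pair of last-payment date columns.
df = pd.DataFrame({
    "dtlastpmt_581D": pd.to_datetime(["2019-01-05", "2019-02-01", None]),
    "last_payment_other": pd.to_datetime(["2019-01-05", "2019-02-10", "2019-03-01"]),
})

# Only compare rows where both dates are present.
both_present = df["dtlastpmt_581D"].notna() & df["last_payment_other"].notna()
differs = both_present & (df["dtlastpmt_581D"] != df["last_payment_other"])

# Number of rows where both dates exist but disagree.
n_differs = int(differs.sum())
```

Inspecting the rows selected by `differs` may help explain what distinguishes the two columns.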
Previous application's current debt. People without debt are even less likely to default, and we also need to obtain the latest records here.
Date when the previous application was created. This can be compared to the approval date of the previous application. If approval took too long, might the applicant have a record of breach of contract?
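The comparison between creation and approval dates could be turned into a numeric feature, as in this sketch. The column names `creation_date` and `approval_date` and the dates are placeholders.

```python
import pandas as pd

# Hypothetical creation and approval dates; a missing approval stays missing.
df = pd.DataFrame({
    "creation_date": pd.to_datetime(["2019-01-01", "2019-01-10", "2019-02-01"]),
    "approval_date": pd.to_datetime(["2019-01-03", "2019-02-20", None]),
})

# Days elapsed between creating the application and its approval.
df["approval_delay_days"] = (df["approval_date"] - df["creation_date"]).dt.days
```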
Maximum DPD with tolerance (on previous application/s).
Family state in the applicant's previous application. We can consider a living-alone indicator or simple one-hot encoding here.
Maximal historical balance of previous credit account. The more balance there is, the less likely it is to default?
Loan amount or card limit of previous applications.
Card blocking reason. A credit card is usually blocked when the bank freezes it, which may be due to failure to repay or improper use of the card. No explanation of these strings is available online; they may be internal codes. Could a missing value mean that the card was never frozen, or that the applicant had no credit card?
Reason for previous application rejection. Here, 'a55475b1' accounts for the majority.
Approval date of the previous application. A missing value may mean that the applicant had not applied before, so the date column can be divided into two categories.
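Dividing the date column into two categories might be as simple as a present-or-missing flag. The column name `approval_date` is a placeholder in this sketch.

```python
import pandas as pd

# Hypothetical approval dates; None marks a missing date.
approval = pd.to_datetime(
    pd.Series(["2019-01-03", None, "2019-02-20"], name="approval_date")
)

# 1 if an approval date exists, 0 if it is missing.
has_approval = approval.notna().astype(int)
```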
Account status of previous credit applications. Here, we can consider one-hot encoding.