
kaggle-titanic-top3-percent's Introduction

Kaggle-Titanic-Survival-prediction

E-mail : [email protected]


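Conclusion: Titanic is an entry-level machine learning project, and Kaggle hosts many solutions and approaches to it. However, none of them achieves good results with a small number of features; most rely on heavy hyperparameter tuning instead. A small set of high-quality features gives stable prediction quality along with fast training and prediction.
You can find the Kernel that uses forward selection for feature selection here: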


The sections below describe the missing-value imputation approach, the feature engineering, and the encoding methods involved.

Missing Value Imputation

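Age is an important feature. Because of how this dataset arose (the elderly, women, and children were given priority when leaving the ship), both the youngest and the oldest passengers had a higher chance of survival. For missing Age values, we impute the median age of passengers who share the same title, extracted from the Name field.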

| Title | Median age |
| --- | --- |
| Mr | 29 |
| Rare | 47 |
| Master | 4 |
| Miss | 22 |
| Mrs | 36 |
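
The title-based median imputation described above could be sketched in pandas roughly as follows. This is a minimal sketch rather than the author's actual code: the toy rows, the `impute_age_by_title` helper, and the rule that pools every title other than Mr, Mrs, Miss, and Master into `Rare` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy rows; real input would be the Kaggle Titanic CSV files.
df = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Cumings, Mrs. John Bradley",
        "Heikkinen, Miss. Laina",
        "Allen, Mr. William Henry",
        "Palsson, Master. Gosta Leonard",
        "Moran, Mr. James",
        "Uruchurtu, Don. Manuel E",
    ],
    "Age": [22.0, 38.0, 26.0, 35.0, 2.0, None, 40.0],
})

COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}


def impute_age_by_title(frame: pd.DataFrame) -> pd.DataFrame:
    """Fill missing Age with the median Age of passengers sharing a title."""
    out = frame.copy()
    # The title sits between the comma and the first period in Name.
    title = out["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    # Assumption: every title outside the four common ones is pooled as "Rare".
    out["Title"] = title.where(title.isin(COMMON_TITLES), "Rare")
    # Per-title median, broadcast back to one value per row.
    title_median = out.groupby("Title")["Age"].transform("median")
    out["Age"] = out["Age"].fillna(title_median)
    return out


imputed = impute_age_by_title(df)
print(imputed[["Name", "Title", "Age"]])
```

In this toy frame the passenger with the missing age carries the title Mr, so the gap is filled with the median of the two known Mr ages. On the full dataset the medians would be the ones listed in the table.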

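Based on the EDA results, we found that passengers younger than 16 in first and second class had a higher survival rate, so we use 16 as the split value.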

| Condition | Feature value |
| --- | --- |
| age < 16 | 1 |
| age >= 16 | 0 |
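
The threshold split can be expressed as a single comparison. The snippet below is an illustrative sketch: the sample ages and the `is_minor` name are hypothetical, and it assumes a passenger aged exactly 16 falls into the 0 group.

```python
import pandas as pd

AGE_THRESHOLD = 16  # split value taken from the EDA finding above

# Hypothetical sample ages, already imputed so that no value is missing.
ages = pd.Series([4.0, 15.9, 16.0, 29.0, 47.0], name="Age")

# 1 for passengers younger than the threshold, 0 otherwise.
# Assumption: a passenger aged exactly 16 is assigned 0.
is_minor = (ages < AGE_THRESHOLD).astype(int)
print(is_minor.tolist())
```

A missing age compares as False and would silently land in the 0 group, so the imputation step should run before this one.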

Feature Engineering

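| Feature | Processing | Motivation |
| --- | --- | --- |
| Age | Binning with 16 as the threshold | The EDA shows that passengers younger than 16 in first and second class had a higher survival rate; binning also helps reduce overfitting |
| Sex | Used as-is | Highly predictive feature; men mostly helped women escape, which matches intuition |
| Pclass | Used as-is | Reflects the passenger's social status; the higher the class, the higher the survival rate, which matches intuition |
| Fare | Binning by quantiles into 4, 5, and 6 bins, then comparing the results | A higher ticket price indicates higher social status; binning also helps reduce overfitting |
| SibSp, Parch, Name | Build a Family_size feature and a Connected-Survival feature | The EDA shows that family members who are connected mostly either survived together or died together; this is also a form of target encoding |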
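
The Fare, Family_size, and Connected-Survival rows of the table could be sketched as below. Several details are assumptions not stated in the text: counting the passenger in the family size (the `+ 1`), grouping connected passengers by surname only, the neutral default of 0.5, and the helper name `connected_survival`. The toy rows and their `Survived` values are illustrative; rows with a missing `Survived` stand in for test-set passengers.

```python
import pandas as pd

# Hypothetical toy rows; NaN in Survived marks a passenger from the test set.
df = pd.DataFrame({
    "Name": [
        "Palsson, Master. Gosta Leonard", "Palsson, Miss. Torborg Danira",
        "Becker, Miss. Marion Louise", "Becker, Master. Richard F",
        "Allen, Mr. William Henry", "Braund, Mr. Owen Harris",
        "Heikkinen, Miss. Laina", "Cumings, Mrs. John Bradley",
    ],
    "Fare": [21.08, 21.50, 39.00, 39.50, 8.05, 7.25, 7.92, 71.28],
    "SibSp": [3, 3, 2, 2, 0, 1, 0, 1],
    "Parch": [1, 1, 1, 1, 0, 0, 0, 0],
    "Survived": [0, None, 1, None, 0, 0, 1, 1],
})

# Assumption: family size counts siblings/spouses, parents/children, and self.
df["Family_size"] = df["SibSp"] + df["Parch"] + 1

# Quantile binning of Fare into 4, 5, and 6 bins so the variants can be compared.
for n_bins in (4, 5, 6):
    df[f"Fare_bin_{n_bins}"] = pd.qcut(df["Fare"], q=n_bins, labels=False)


def connected_survival(frame: pd.DataFrame, default: float = 0.5) -> pd.Series:
    """Leave-one-out target encoding over groups of passengers sharing a surname."""
    result = pd.Series(default, index=frame.index)
    surname = frame["Name"].str.split(",").str[0]
    for _, group in frame.groupby(surname):
        if len(group) < 2:
            continue  # nobody to be connected to
        for idx in group.index:
            # Only the *other* members with a known outcome are consulted,
            # so a passenger's own label never leaks into the feature.
            others = group.drop(idx)["Survived"].dropna()
            if not others.empty:
                result.loc[idx] = 1.0 if others.max() == 1 else 0.0
    return result


df["Connected_Survival"] = connected_survival(df)
print(df[["Name", "Family_size", "Fare_bin_4", "Connected_Survival"]])
```

Excluding the passenger's own label keeps the encoding from memorizing the target, which is the usual concern with target encoding. A fuller version might also link passengers through shared tickets, which is not attempted here.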

Comparison of 3 Missing-Value Imputation Strategies

3 Strategies analyzing Age and Their Impact
