
nani757 / outlier-detection-and-removal


An outlier is a data point that is noticeably different from the rest. Outliers may represent errors in measurement, bad data collection, or variables that were not considered when collecting the data.

Jupyter Notebook 100.00%
cappings feature-engineering outlier-detection trimming z-score iqr-method percentile-method winsorization-technique

outlier-detection-and-removal's Introduction

Outlier-Detection-and-Removal

Types of outlier handling techniques

Z-score technique

credits: bit.ly/393HjIP (what is an outlier, and more)

credits: bit.ly/3KTs9Dp (normally distributed data)

credits: bit.ly/3KVyDll (skewed data)

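This technique checks whether the data contains any outliers using the z-score formula: z = (x - μ) / σ, where μ is the mean and σ is the standard deviation of the data. A point whose z-score is far from zero is a candidate outlier.

***Use this only when the data is approximately normally distributed.***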
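The z-score check can be sketched with the Python standard library alone. This is a minimal illustration, not the notebook's code: the function name `zscore_outliers`, the sample data, and the cutoff of 3 standard deviations are all assumptions (3 is a common convention, not something the text above specifies).

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds `threshold`.

    `threshold=3.0` is an assumed, conventional cutoff.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [x for x in values if abs((x - mu) / sigma) > threshold]

# Hypothetical sample: twelve values near 50 and one extreme value.
data = [50, 52, 49, 51, 50, 48, 53, 50, 51, 49, 52, 50, 120]
outliers = zscore_outliers(data)
print(outliers)
```

With very small samples a single extreme value inflates the standard deviation, so its z-score may never reach the cutoff; the check is more reliable on larger datasets.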

IQR

The interquartile range (IQR) is calculated in much the same way as the range: subtract the first quartile from the third quartile, IQR = Q3 - Q1.

***Use this only when the data follows a skewed distribution.***
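The IQR can be turned into outlier limits as in the sketch below. The text above only defines IQR = Q3 - Q1; the 1.5 × IQR fences used here are a widely used convention and an assumption of this example, as are the function name `iqr_bounds`, the sample data, and the choice of the "inclusive" quantile method.

```python
from statistics import quantiles

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) limits: Q1 - k*IQR and Q3 + k*IQR.

    `k=1.5` is an assumed, conventional multiplier.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical right-skewed sample with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 40]
lower, upper = iqr_bounds(data)
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)
```

Different quantile methods give slightly different quartiles on small samples, so the limits may not match those produced by other libraries exactly.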

skewed distribution

After finding the outliers, there are two approaches to handling them: capping or trimming.

Capping

Capping sets two caps, a minimum and a maximum value. Any value that falls outside these limits is replaced: values above the maximum are set to the maximum, and values below the minimum are set to the minimum.
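A minimal sketch of capping, assuming the lower and upper limits have already been chosen by one of the detection techniques. The function name `cap`, the sample data, and the limits -3 and 13 are hypothetical values used only for illustration.

```python
def cap(values, lower, upper):
    """Replace values below `lower` with `lower` and above `upper` with `upper`."""
    return [min(max(x, lower), upper) for x in values]

# Hypothetical data and limits.
data = [1, 2, 3, 4, 5, 6, 7, 8, 40]
capped = cap(data, lower=-3, upper=13)
print(capped)
```

Capping keeps the number of rows unchanged, which can matter when the other columns of those rows are still useful.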

Trimming

Trimming removes the outliers: any data point that falls outside the lower and upper limits is dropped from the dataset.
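A minimal sketch of trimming, again assuming the limits are already known. The function name `trim`, the sample data, and the limits -3 and 13 are hypothetical.

```python
def trim(values, lower, upper):
    """Keep only the values that lie within [lower, upper]."""
    return [x for x in values if lower <= x <= upper]

# Hypothetical data and limits.
data = [1, 2, 3, 4, 5, 6, 7, 8, 40]
trimmed = trim(data, lower=-3, upper=13)
print(trimmed)
```

Unlike capping, trimming shrinks the dataset, so it may be less suitable when outliers make up a large share of the rows.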
