My-Python-Scripts

This repository collects Python scripts that help me when building machine learning projects.

Requirements :

  • Python 3.6 or later
  • scikit-learn
  • numpy
  • pandas
  • SciPy
  • matplotlib
  • seaborn
  • TensorFlow

Scripts:

  • EDA
  • Test
  • Preprocessing
  • MachineLearning
  • ImageDataProcessor
  • NeuralNet (This file is not complete yet)


EDA :

Contains three classes for now: ColorPalette, DataVisualizer, and DataExplorer.

ColorPalette Class

A class for creating color palettes.

  • create_sequential_palette(num_colors): Creates a sequential color palette with the specified number of colors, using a base color with the specified hue, saturation, and value_start.
  • create_diverging_palette(num_colors, value): Creates a diverging color palette with the specified number of colors, ranging from start_hue to end_hue.
  • get_color(): Given a list of colors, returns the last color in the list.
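
A minimal sketch of how a sequential palette could be built, assuming colors are defined in HSV and returned as hex strings. The function name mirrors the method above, but the `hue`, `saturation`, `value_start`, and `value_end` parameters and their defaults are assumptions, not the repository's actual signature.

```python
import colorsys


def create_sequential_palette(num_colors, hue=0.6, saturation=0.8,
                              value_start=0.3, value_end=1.0):
    """Return `num_colors` hex colors sharing one hue, from dark to light."""
    if num_colors < 1:
        raise ValueError("num_colors must be at least 1")
    palette = []
    for i in range(num_colors):
        # Interpolate the HSV value linearly between value_start and value_end.
        t = i / (num_colors - 1) if num_colors > 1 else 0.0
        value = value_start + t * (value_end - value_start)
        r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = create_sequential_palette(5)
print(palette)
```

Hex strings like these can be passed directly to matplotlib or seaborn wherever a color or a list of colors is accepted.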

DataVisualizer

A class for visualizing data.

  • plot_distribution: Generates distribution plots (boxplot and KDE or countplot) of selected columns of a data frame based on the number of unique values in the selected column.

  • plot_feature_by_target: Generates scatterplot or barplot for selected columns of a data frame with respect to a target variable.

  • plot_bar: Generates barplot and countplot for selected categorical columns of a data frame with respect to a target variable.

  • plot_correlation: Generates a correlation matrix heatmap plot for selected numeric columns of a data frame.

  • plot_missing: Generates a heatmap plot of the missing-value sums.

  • plot_skewness: Generates a heatmap of skewness values for selected columns of a data frame.

  • plot_pie: Generates pie chart plots for categorical features with up to 6 unique values.

  • plot_time_series: Generates a time series plot of all features.
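
The way plot_distribution picks a plot from the number of unique values can be illustrated roughly as follows. The helper name `choose_distribution_plot` and the `max_categories` threshold of 10 are hypothetical; the script's real cutoff is not stated here.

```python
def choose_distribution_plot(values, max_categories=10):
    """Pick a plot kind from the number of distinct non-missing values."""
    distinct = {v for v in values if v is not None}
    # Few distinct values: treat the column as categorical and count them.
    if len(distinct) <= max_categories:
        return "countplot"
    # Many distinct values: treat the column as continuous.
    return "boxplot+kde"


print(choose_distribution_plot([0, 1, 1, 0]))
print(choose_distribution_plot(range(100)))
```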

DataExplorer

A class for exploring data.

  • explore_unique_number: Prints the number of unique values in the specified columns of the data.


Test :

Contains one class for now: DataChecker.

DataChecker

A class to check the shapes of the training and test sets, and to check for negative values and NaN values in the target and feature variables.

  • check_shapes(): Checks if the shapes of the training and test sets match and if the number of features and targets match.

  • check_negative_values_y(): Checks if there are negative values in the target variables of the training and test sets.

  • check_nan_values_X(): Checks if there are NaN values in the feature variables of the training and test sets.

  • check_nan_values_y(): Checks if there are NaN values in the target variable of the training and test sets.
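
The checks above can be sketched with NumPy as follows. This is an illustration only: the constructor arguments and the boolean return values are assumptions, and the actual class may print its findings instead of returning them.

```python
import numpy as np


class DataChecker:
    """Sanity checks for train/test splits (illustrative sketch)."""

    def __init__(self, X_train, y_train, X_test, y_test):
        self.X_train = np.asarray(X_train, dtype=float)
        self.y_train = np.asarray(y_train, dtype=float)
        self.X_test = np.asarray(X_test, dtype=float)
        self.y_test = np.asarray(y_test, dtype=float)

    def check_shapes(self):
        # Train and test need the same number of feature columns,
        # and each feature matrix needs exactly one target per row.
        same_features = self.X_train.shape[1] == self.X_test.shape[1]
        rows_match = (self.X_train.shape[0] == self.y_train.shape[0]
                      and self.X_test.shape[0] == self.y_test.shape[0])
        return same_features and rows_match

    def check_negative_values_y(self):
        return bool((self.y_train < 0).any() or (self.y_test < 0).any())

    def check_nan_values_X(self):
        return bool(np.isnan(self.X_train).any() or np.isnan(self.X_test).any())

    def check_nan_values_y(self):
        return bool(np.isnan(self.y_train).any() or np.isnan(self.y_test).any())


checker = DataChecker(
    X_train=[[1, 2], [3, 4], [5, np.nan]], y_train=[1, 0, 1],
    X_test=[[7, 8]], y_test=[-1],
)
print(checker.check_shapes(), checker.check_negative_values_y(),
      checker.check_nan_values_X(), checker.check_nan_values_y())
```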


Preprocessing :

This script is useful for building sklearn pipelines and makes them much easier to write. You pick the classes and the columns you need, put those classes into a single sklearn pipeline, and in one step the raw data is turned into data that is ready for a machine learning model.

The script contains a collection of Python classes designed to simplify the creation of data preprocessing pipelines in scikit-learn. These transformers can be used to perform common data preprocessing tasks such as selecting columns, imputing missing values, encoding categorical variables, and scaling numeric features.
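
A rough sketch of the idea, assuming the transformers follow scikit-learn's usual `fit`/`transform` convention. The two classes below are simplified stand-ins written for this example: they reuse the names ColumnSelector and DataFrameImputer, but their constructor arguments and the median strategy are assumptions, not the repository's code.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class ColumnSelector(BaseEstimator, TransformerMixin):
    """Keep only the listed columns (hypothetical stand-in)."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        return X[self.columns].copy()


class DataFrameImputer(BaseEstimator, TransformerMixin):
    """Fill missing numeric values with the medians seen during fit."""

    def fit(self, X, y=None):
        self.medians_ = X.median(numeric_only=True)
        return self

    def transform(self, X):
        return X.fillna(self.medians_)


df = pd.DataFrame({
    "age": [20.0, None, 40.0],
    "income": [10.0, 20.0, None],
    "id": [1, 2, 3],
})

# All preprocessing lives in one pipeline and runs in one call.
pipeline = Pipeline([
    ("select", ColumnSelector(["age", "income"])),
    ("impute", DataFrameImputer()),
])
clean = pipeline.fit_transform(df)
print(clean)
```

Because each step is a regular scikit-learn transformer, an estimator can be appended as the final step so that preprocessing and model fitting share one `fit` call.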

Transformer Classes

ColumnSelector

A transformer class to select specified columns from a Pandas DataFrame.

ArithmeticColumnsTransformer

A transformer class to perform arithmetic operations on specified columns in a Pandas DataFrame.

DataFrameImputer

A transformer class to impute missing values in a Pandas DataFrame.

DropColumnsTransformer

A transformer class to drop specified columns from a Pandas DataFrame.

WinsorizationImpute

A transformer class to impute missing values in a Pandas DataFrame using Winsorization.
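
For reference, Winsorization in its usual sense caps extreme values at chosen percentiles. The sketch below shows that standard technique with NumPy; the `winsorize` helper and its 5th/95th percentile defaults are assumptions, and the class in this repository may work differently.

```python
import numpy as np


def winsorize(values, lower_pct=5, upper_pct=95):
    """Clip values to the [lower_pct, upper_pct] percentile range."""
    values = np.asarray(values, dtype=float)
    # nanpercentile ignores missing entries when computing the bounds.
    low, high = np.nanpercentile(values, [lower_pct, upper_pct])
    return np.clip(values, low, high)


capped = winsorize([1, 2, 3, 4, 100], lower_pct=0, upper_pct=75)
print(capped)
```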

LogTransform

A transformer class to apply a log transform to a specified column in a Pandas DataFrame.

BoxCoxTransform

A transformer class to apply a Box-Cox transform to a specified column in a Pandas DataFrame.

YeoJohnsonTransform

A transformer class to apply a Yeo-Johnson transform to a specified column in a Pandas DataFrame.
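
The three transforms can be compared on a small array with NumPy and SciPy. This is only a sketch of the underlying operations, not the transformer classes themselves; the sample data and the use of `log1p` for the log transform are assumptions.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 5.0, 10.0, 50.0, 200.0])

# log(1 + x): defined for every x greater than -1.
log_x = np.log1p(x)

# Box-Cox requires strictly positive data; lambda is estimated from the data.
boxcox_x, boxcox_lambda = stats.boxcox(x)

# Yeo-Johnson also accepts zero and negative values.
yeo_x, yeo_lambda = stats.yeojohnson(x - 10.0)

print(log_x)
print(boxcox_lambda, yeo_lambda)
```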

LabelEncodeColumns

A transformer class to label encode specified columns in a Pandas DataFrame.

OneHotEncodeColumns

A transformer class to one-hot encode specified columns in a Pandas DataFrame.

OrdinalEncodeColumns

A transformer class to ordinal encode specified columns in a Pandas DataFrame.

BinaryEncodeColumns

A transformer class to binary encode specified columns in a Pandas DataFrame.
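
Binary encoding is the least familiar of these encoders, so here is a small pure-Python sketch of the general technique: each category gets an integer code, and the bits of that code become the output columns. The `binary_encode` helper is hypothetical, and the repository's class may assign codes differently.

```python
def binary_encode(values):
    """Map each category to the bits of its integer code (illustrative)."""
    categories = sorted(set(values))
    codes = {cat: i for i, cat in enumerate(categories)}
    # Number of bits needed to represent the largest code (at least 1).
    width = max(1, (len(categories) - 1).bit_length())
    return [[int(bit) for bit in format(codes[v], "0{}b".format(width))]
            for v in values]


encoded = binary_encode(["red", "green", "blue", "green", "yellow"])
print(encoded)  # [[1, 0], [0, 1], [0, 0], [0, 1], [1, 1]]
```

Four categories fit into two columns here, whereas one-hot encoding would need four.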

StandardScaleTransform

A transformer class to standardize numeric features in a Pandas DataFrame.

MinMaxScaleTransform

A transformer class to scale numeric features to a specified range in a Pandas DataFrame.

DateTimeTranformer

A transformer class for extracting days, months, and years from a time series.
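
The extraction itself can be done with the pandas `.dt` accessor, as in this sketch. The column names `date`, `day`, `month`, and `year` are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2023-01-15", "2023-06-30", "2024-12-01"]})

# Parse the strings once, then read the calendar parts from the result.
dates = pd.to_datetime(df["date"])
df["day"] = dates.dt.day
df["month"] = dates.dt.month
df["year"] = dates.dt.year
print(df)
```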


ImageDataProcessor :

I made this script for my computer vision tasks. It helps me process images and generate the data I need, and it contains:

ImageDataHandler :

This class helps me handle images. Its methods include:

  • download_data: downloads images from a URL
  • split_data: splits the image folders and makes a new copy for training and testing
  • delete_folder: removes the old folders once the new training and testing folders exist, to save disk space
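
A minimal sketch of the split step using only the standard library, assuming one subfolder per class inside the source folder. The signature, the `train`/`test` folder names, and the 80/20 ratio are assumptions rather than the script's actual behavior.

```python
import random
import shutil
from pathlib import Path


def split_data(source_dir, target_dir, train_ratio=0.8, seed=0):
    """Copy each class folder's files into train/ and test/ subfolders."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    for class_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_train = int(len(files) * train_ratio)
        for split, subset in (("train", files[:n_train]),
                              ("test", files[n_train:])):
            out_dir = target_dir / split / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in subset:
                shutil.copy2(f, out_dir / f.name)
```

After the copy has been verified, the original folder can be removed with `shutil.rmtree(source_dir)`, which is the role delete_folder plays above.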

ImageDataGenerator :

This class generates images for training the model.

ImagePlotter :

This class plots some of the generated images.


NeuralNet

This script is for methods that help when building neural networks (it is not ready yet).

Contributors

ahmed-hereiz, halemogpa
