This project develops a predictive analytics solution that detects outlier sales using the Isolation Forest algorithm. It aims to identify unusual sales patterns and provide insights for inventory management and business decision-making.
- Developed a predictive analytical solution using the Isolation Forest algorithm for outlier sales detection.
- Conducted univariate and bivariate analysis resulting in approximately 25% outlier detection.
- Collaborated with teams and business units to understand requirements, including key performance indicators (KPIs) such as contribution gap and sales growth.
- Communicated results to stakeholders, providing valuable insights for informed business decisions.
- Data Loading and Preprocessing
  - Loaded training and test datasets (`Train.csv` and `Test.csv`).
  - Handled duplicate rows and transformed data to align with sales-related parameters.
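
The loading and de-duplication step could be sketched roughly as follows. The helper name `load_and_dedupe` and the small in-memory sample (including its column names) are illustrative assumptions, not the notebook's actual code.

```python
import io

import pandas as pd


def load_and_dedupe(source):
    """Read a CSV (path or file-like object) and drop exact duplicate rows."""
    df = pd.read_csv(source)
    # drop_duplicates keeps the first occurrence of each repeated row
    return df.drop_duplicates().reset_index(drop=True)


# Hypothetical sample standing in for Train.csv; real column names may differ.
sample_csv = io.StringIO(
    "item_id,sales\n"
    "A1,100.0\n"
    "A1,100.0\n"
    "B2,250.5\n"
)
train_sample = load_and_dedupe(sample_csv)
print(len(train_sample))  # 2 rows remain after removing the duplicate
```

In the project itself, the same helper would presumably be called as `load_and_dedupe("Train.csv")` and `load_and_dedupe("Test.csv")`.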
- Outlier Detection
  - Used the Isolation Forest algorithm to predict outlier sales.
  - Identified and visualized potential outliers using boxplots and scatter plots.
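
As a rough illustration of what a boxplot flags, the sketch below applies the 1.5 × IQR rule that a standard boxplot uses to mark points beyond its whiskers. The function name `iqr_outlier_mask` and the sales figures are hypothetical; the project may have used different thresholds.

```python
import pandas as pd


def iqr_outlier_mask(values, whisker=1.5):
    """Return a boolean Series, True where a value lies beyond the IQR fences."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return (values < lower) | (values > upper)


# Hypothetical sales figures; 900 is far above the rest.
sales = pd.Series([100, 110, 95, 105, 98, 102, 900], name="sales")
mask = iqr_outlier_mask(sales)
print(sales[mask].tolist())  # [900]
```

Passing the same column to `seaborn.boxplot(x=sales)` would draw the flagged value as an individual point beyond the whisker.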
- Data Exploration and Visualization
  - Explored sales trends, product categories, and geographical sales distributions.
  - Visualized correlations and relationships between variables.
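
One way to inspect relationships between variables is a pairwise correlation matrix over the numeric columns. This is a minimal sketch; the column names and values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical columns; the real dataset's names may differ.
df = pd.DataFrame(
    {
        "price": [10.0, 12.0, 15.0, 20.0, 25.0],
        "units_sold": [50, 45, 40, 30, 20],
        "category": ["a", "b", "a", "b", "a"],
    }
)

# Keep numeric columns only, then compute pairwise Pearson correlations.
corr = df.select_dtypes(include="number").corr()
print(corr.round(2))
```

The resulting matrix can be handed to `seaborn.heatmap(corr, annot=True)` for a visual summary.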
- Data Processing and Scaling
  - Extracted relevant features and target variables.
  - Applied MinMaxScaler to scale feature variables.
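
The scaling step might look like the sketch below. The feature matrices are hypothetical, and fitting the scaler on the training data only (then reusing it on the test data) is a common practice assumed here rather than something the project states.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical feature matrices standing in for the train and test features.
X_train = np.array([[10.0, 1.0], [20.0, 3.0], [30.0, 5.0]])
X_test = np.array([[25.0, 2.0]])

scaler = MinMaxScaler()  # default feature_range is (0, 1)
X_train_scaled = scaler.fit_transform(X_train)  # learn min/max from training data
X_test_scaled = scaler.transform(X_test)        # reuse the training min/max

print(X_train_scaled.min(axis=0), X_train_scaled.max(axis=0))
print(X_test_scaled)
```

Each training column is mapped to the range 0 to 1; test values are scaled with the same minimum and maximum, so they can fall outside that range if they exceed what was seen in training.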
- Isolation Forest for Outlier Detection
  - Fitted an Isolation Forest model to the scaled training data.
  - Predicted outliers in the training dataset.
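
Fitting and predicting with scikit-learn's `IsolationForest` could be sketched as follows. The synthetic data, `contamination=0.05`, and `random_state=42` are illustrative assumptions, not the project's actual settings.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Hypothetical scaled features: 200 typical points plus 5 extreme ones.
typical = rng.normal(loc=0.5, scale=0.05, size=(200, 2))
extreme = np.array(
    [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [1.0, 1.0], [0.95, 0.05]]
)
X_scaled = np.vstack([typical, extreme])

# contamination and random_state are illustrative choices only.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
labels = model.fit_predict(X_scaled)         # -1 marks an outlier, 1 an inlier
scores = model.decision_function(X_scaled)   # lower scores are more anomalous

outlier_idx = np.where(labels == -1)[0]
print(len(outlier_idx), "points flagged as outliers")
```

The `contamination` parameter sets the share of points the model labels as outliers, so it directly affects how many rows are flagged and is worth tuning against domain knowledge.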
- Tableau Dashboard (Optional)
  - Created a Tableau dashboard to visually present analysis results.
  - Incorporated interactive visualizations and filters for user exploration.
- Results and Communication
  - Shared insights with stakeholders, emphasizing contribution gap, sales growth, and outliers.
  - Facilitated informed business decisions through clear communication of findings.
- Python 3.x
- Required Python libraries (NumPy, Pandas, Seaborn, Scikit-learn, Matplotlib)
- Clone this repository.
- Install the required Python libraries using `pip install -r requirements.txt`.
- Run the provided Jupyter Notebook to perform data analysis and outlier detection.
- Optionally, create and explore the Tableau dashboard for visualizing project results.