Coffee-Bean-Sales-Analysis

In this project, I analyzed coffee bean sales data in Python using pandas, matplotlib, and seaborn. The analysis identifies the best-selling coffee type and roast type, plots the top 7 most profitable cities, finds the most profitable country, compares loyalty card owners with non-owners, identifies the year with the highest sales, and explores whether unit price and quantity are related. Throughout, profitability is measured by total sales.

Preprocessing

  1. Dealing with Null data
  2. Dealing with Duplicate values
  3. Removing unnecessary data
  4. Generating new features from existing features

Goals

  1. Which Coffee Type is sold more?
  2. Which Roast Type is sold more?
  3. Plot the top 7 most profitable cities.
  4. Which country is more profitable?
  5. Which group is more profitable, loyalty card owners, or non-owners?
  6. Which year had the highest sales?
  7. Is there any relation between unit price and quantity? Are customers more inclined towards cheaper or more expensive products?

Data

The data comes from the Kaggle dataset at https://www.kaggle.com/datasets/saadharoon27/coffee-bean-sales-raw-dataset/data?select=Raw+Data.xlsx.
As shown below, the original dataset is an Excel workbook with three sheets named orders, customers, and products. The orders sheet has 1000 rows and 13 columns, the customers sheet has 1000 rows and 9 columns, and the products sheet has 48 rows and 7 columns.
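A minimal sketch of how the three sheets could be loaded with pandas. The file name `Raw Data.xlsx` is taken from the Kaggle URL and the sheet names from the description above; the helper name `load_sheets` is hypothetical, and reading `.xlsx` files assumes the `openpyxl` package is installed.

```python
import pandas as pd


def load_sheets(path="Raw Data.xlsx"):
    """Read the three sheets of the workbook into separate dataframes.

    The default path and the sheet names are assumptions based on the
    dataset description; adjust them if the workbook differs.
    """
    # Passing a list to sheet_name returns a dict keyed by sheet name.
    sheets = pd.read_excel(path, sheet_name=["orders", "customers", "products"])
    return sheets["orders"], sheets["customers"], sheets["products"]
```

Calling `orders, customers, products = load_sheets()` and then checking each dataframe's `.shape` is a quick way to confirm the row and column counts listed above.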

Orders Dataframe

Customers Dataframe

Products Dataframe

Auditing data

To make sure the data met quality standards and was fit for my intended purpose, I removed the all-null columns from the orders dataframe. I also removed unnecessary columns from the customers and products dataframes, such as Customer Name, Email, Phone Number, Address Line, Size, and Postcode.
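The cleanup described here can be sketched as follows. This is an illustration on made-up toy rows, not the project's actual code, and it assumes the column names match those mentioned in the text.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the orders sheet; all values are made up.
orders = pd.DataFrame({
    "Order ID": ["A-1", "A-2", "A-2"],
    "Product ID": ["P-1", "P-2", "P-2"],
    "Quantity": [2, 5, 5],
    "Email": [np.nan, np.nan, np.nan],  # a column with no values at all
})

# Toy stand-in for the customers sheet.
customers = pd.DataFrame({
    "Customer ID": ["C-1", "C-2"],
    "Customer Name": ["Ann", "Bob"],
    "Email": ["ann@example.com", None],
    "City": ["Washington", "Philadelphia"],
    "Postcode": ["20001", "19104"],
})

# Drop columns in which every value is null, then drop exact duplicate rows.
orders_clean = orders.dropna(axis=1, how="all").drop_duplicates()

# Drop columns that are not needed for the analysis.
# errors="ignore" skips any listed column that is not present.
unneeded = ["Customer Name", "Email", "Phone Number", "Address Line", "Postcode"]
customers_clean = customers.drop(columns=unneeded, errors="ignore")
```

Using `how="all"` removes only columns that are entirely empty, so partially filled columns survive and can be handled separately.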

Generating new feature

As the final step of data preprocessing, I created a new feature called 'Sales'. I merged the 'products' dataframe into the 'orders' dataframe so that each order row carries its 'Unit Price', then multiplied 'Unit Price' by 'Quantity'.
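A small sketch of this step on made-up rows. It assumes the two dataframes share a 'Product ID' key column, which is not stated in the text.

```python
import pandas as pd

# Toy rows; the values and the 'Product ID' key are assumptions.
orders = pd.DataFrame({
    "Order ID": ["A-1", "A-2", "A-3"],
    "Product ID": ["P-1", "P-2", "P-1"],
    "Quantity": [2, 1, 3],
})
products = pd.DataFrame({
    "Product ID": ["P-1", "P-2"],
    "Coffee Type": ["Arabica", "Robusta"],
    "Unit Price": [10.0, 4.5],
})

# A left merge keeps every order and attaches the matching product columns.
products_orders = orders.merge(products, on="Product ID", how="left")

# New feature: revenue per order line.
products_orders["Sales"] = products_orders["Unit Price"] * products_orders["Quantity"]
```

A left merge is used so that an order with an unknown product would show up with a null 'Sales' value instead of silently disappearing.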

Products_orders Dataframe before generating new feature:

Products_orders Dataframe after generating new feature:

Finally, I merged this dataframe with the customers dataframe so that sales could be analyzed by city and country. The final dataframe is:
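One way this second merge might look, again on made-up rows and assuming a shared 'Customer ID' column. The `validate` argument is an optional safety check and is not something the text says was used.

```python
import pandas as pd

# Toy rows; the values and the 'Customer ID' key are assumptions.
products_orders = pd.DataFrame({
    "Order ID": ["A-1", "A-2", "A-3"],
    "Customer ID": ["C-1", "C-2", "C-1"],
    "Sales": [20.0, 4.5, 30.0],
})
customers = pd.DataFrame({
    "Customer ID": ["C-1", "C-2"],
    "City": ["Washington", "Philadelphia"],
    "Country": ["United States", "United States"],
    "Loyalty Card": ["Yes", "No"],
})

# validate="many_to_one" raises an error if a Customer ID appears more than
# once in customers, which would otherwise duplicate order rows.
final = products_orders.merge(
    customers, on="Customer ID", how="left", validate="many_to_one"
)
```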

Finding patterns and insights

Which Coffee Type is sold more?

"Espresso" is the most popular coffee type with sales of $12,306.37, while "Robusta" is the least popular with sales of $9,005.16.

Which Roast Type is sold more?

Light roast is the preferred option with sales totaling $17,354.34, while dark roast is less favored with sales of $13,179.22.
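Rankings like the two above come down to summing 'Sales' per category. A hedged sketch with made-up rows follows; the helper name `total_sales_by` is hypothetical.

```python
import pandas as pd


def total_sales_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Sum 'Sales' for each category in `column`, largest total first."""
    return df.groupby(column)["Sales"].sum().sort_values(ascending=False)


# Toy rows; values are illustrative only.
df = pd.DataFrame({
    "Coffee Type": ["Arabica", "Robusta", "Arabica", "Liberica"],
    "Roast Type": ["Light", "Dark", "Medium", "Light"],
    "Sales": [20.0, 4.5, 30.0, 12.0],
})

sales_by_type = total_sales_by(df, "Coffee Type")
top_type = sales_by_type.index[0]      # category with the highest total
bottom_type = sales_by_type.index[-1]  # category with the lowest total

sales_by_roast = total_sales_by(df, "Roast Type")
```

The same helper could be pointed at any other categorical column, such as 'Country'.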

Plot the top 7 most profitable cities.

"Washington" leads with the highest sales, totaling $1,066.91, while "Philadelphia" ranks last among the top 7 most profitable cities with sales of $511.23.

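A top-7 city ranking like this can be sketched with `nlargest`. The city labels and values below are placeholders, not figures from the dataset.

```python
import pandas as pd

# Placeholder cities "A" to "H"; city "A" has two orders.
df = pd.DataFrame({
    "City": ["A", "B", "C", "D", "E", "F", "G", "H", "A"],
    "Sales": [5.0, 3.0, 9.0, 1.0, 7.0, 2.0, 8.0, 4.0, 6.0],
})

# Total sales per city, then keep the seven largest totals in descending order.
city_sales = df.groupby("City")["Sales"].sum()
top7 = city_sales.nlargest(7)
```

With matplotlib installed, `top7.plot(kind="bar")` would draw the result as a bar chart.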
Which country is more profitable?

The United States is the most profitable country with sales amounting to $35,638.60, while the United Kingdom ranks as the least profitable with sales of $2,798.50.

Which group is more profitable, loyalty card owners, or non-owners?

Although customers without loyalty cards tend to spend more, the total sales of the two groups are nearly on par.
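A comparison like this needs both the total and the per-order average for each group, since the two can point in different directions. The sketch below uses made-up numbers chosen only to show that effect.

```python
import pandas as pd

# Toy rows: equal totals per group, but different averages per order.
df = pd.DataFrame({
    "Loyalty Card": ["Yes", "No", "Yes", "No", "Yes"],
    "Sales": [10.0, 30.0, 20.0, 28.0, 28.0],
})

# One row per group with total sales, mean sales per order, and order count.
summary = df.groupby("Loyalty Card")["Sales"].agg(["sum", "mean", "count"])
```

Here both groups total 58.0, while the "No" group averages 29.0 per order against roughly 19.3 for "Yes".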

Which year had the highest sales?

In 2021, the highest sales were recorded at $13,766.04, while 2022 had the lowest sales at $7,063.33.
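Yearly totals can be sketched by extracting the year from the order date. This assumes a date column named 'Order Date', and the rows are made up. The month count is an extra check of my own, useful because a year with only a few months of data will naturally show lower totals.

```python
import pandas as pd

# Toy rows; dates and values are illustrative only.
df = pd.DataFrame({
    "Order Date": ["2019-09-05", "2021-03-14", "2021-11-02", "2022-01-20"],
    "Sales": [10.0, 25.0, 15.0, 8.0],
})

# Parse the dates and derive the year of each order.
df["Order Date"] = pd.to_datetime(df["Order Date"])
df["Year"] = df["Order Date"].dt.year

# Total sales per year and the year with the largest total.
sales_by_year = df.groupby("Year")["Sales"].sum()
best_year = int(sales_by_year.idxmax())

# Number of distinct months with at least one order in each year.
months_covered = df.groupby("Year")["Order Date"].apply(
    lambda s: s.dt.month.nunique()
)
```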

Is there any relation between unit price and quantity? Are customers more inclined towards cheaper or more expensive products?

There appears to be a modest negative relationship: as unit price increases, quantity tends to decrease. However, because the data is limited, this relationship is not particularly significant.
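One simple way to quantify such a relationship is the Pearson correlation coefficient, which ranges from -1 to 1. The text does not say which measure the project used, so this is an assumption, and the rows below are made up.

```python
import pandas as pd

# Toy rows; values are illustrative only.
df = pd.DataFrame({
    "Unit Price": [3.0, 5.0, 8.0, 12.0, 20.0],
    "Quantity": [6, 5, 5, 3, 2],
})

# Pearson correlation (the pandas default): a negative value means quantity
# tends to fall as unit price rises.
r = df["Unit Price"].corr(df["Quantity"])
```

A value near 0 would indicate little linear relationship, and with few data points even a sizeable coefficient should be read cautiously.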

Conclusion

In conclusion, my analysis of coffee bean sales provided insights into several aspects of the data. "Espresso" is the most popular coffee type, while "Robusta" is the least favored. Light roast is the preferred option, while dark roast is less favored. The top 7 most profitable cities are led by "Washington," with "Philadelphia" in seventh place.

From a country perspective, the United States is the most profitable, while the United Kingdom is the least profitable. Surprisingly, customers without loyalty cards tend to spend more, but the total sales of loyalty card owners and non-owners are quite balanced.

Sales peaked in 2021, while 2022 had the lowest sales figures. The exploration of unit price and quantity suggested a modest correlation, with higher prices generally associated with lower quantities. However, this relationship may not be highly significant because the data is limited. These findings can inform strategy and decision-making in coffee sales.
