Code Monkey home page Code Monkey logo

mysql-demo's Introduction

Areas of Interest

I am trying to instantiate the code and concept written in this article
Borrowing from the concepts, I assume these five areas would be most important as the outcome of e-commerce analysis.

  • GROWTH
  • RETENTION
  • STOCK MANAGEMENT
  • ENGAGEMENT
  • PRICING

While lacking experience in the field, I opt to demonstrate some skills in Python and SQL that may make me stand out a little.
I will only be doing a little code that will show customer retention rate.

Data Generation

For the purpose of demo-ing the SQL codes I will be generating fake user activity data using Python.

It is assumed that each user will initially sign-up for goods / services at a certain date, before the subsequent sign-ins and purchase activities.

Data between 2017-01-01 and 2017-12-31 will be generated.

Encoding

Code Activity
0 sign-up
1 sign-in
2 purchase

Action

see the detailed python code and result here.

Loading Data Into MySQL

CREATE TABLE daily_activity (
	`index` INT,
	`CustomerId` INT,
	`ActivityDate` DATE,
	`ActivityType` INT
)
LOAD DATA INFILE "C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/data.csv" INTO TABLE daily_activity
	FIELDS TERMINATED BY ','
	IGNORE 1 LINES;

For some reason there are a few missing CustomerID, but we will let that go for the sake of time.

>> SELECT count(distinct CustomerID) FROM daily_activity
9889

Retention Studies

suppose we are using a table named dim_customer to keep track of each customer's acquisition date.

suppose we are continuously upsert-ing into dim_customer (which contains the AcquisitionDate), from the daily_activity table
we could use the following code:

CREATE TABLE dim_customer (
	`CustomerId` INT KEY,
	`AcquisitionDate` DATE
)
delimiter //

CREATE DEFINER=`root`@`%` PROCEDURE `update_dim_customer`(`date` DATE)
BEGIN
	INSERT IGNORE INTO dim_customer
	SELECT
	 COALESCE(dc.CustomerId, ac.CustomerId) AS CustomerId, 
	 ac.ActivityDate AS AcquisitionDate
	FROM dim_customer dc
	RIGHT JOIN (
	 SELECT 
	  CustomerId,
	  ActivityDate
	 FROM daily_activity 
	 WHERE 
	  ActivityType = 0
	  AND ActivityDate = `date`
	 GROUP BY 1, 2
	) ac 
	 ON dc.CustomerId = ac.CustomerId
	GROUP BY 1, 2;
END
delimiter //

CREATE PROCEDURE doiterate()
BEGIN
  SET @running_date = '2017-01-01';
  SET @num = 0;
  label1: LOOP
    SET @num = @num + 1;
	SET @running_date = DATE_ADD(@running_date, INTERVAL 1 DAY);
	CALL update_dim_customer(@running_date);
    IF @num < 365 THEN
      ITERATE label1;
    END IF;
    LEAVE label1;
  END LOOP label1;
END;

Finally, calling the doiterate function

CALL doiterate()
SELECT * FROM demo.dim_customer ORDER BY AcquisitionDate;
CustomerId AcquisitionDate
3418 2017-01-01
1641 2017-01-01
5642 2017-01-01
4307 2017-01-01
7302 2017-01-01
... ...

We have exactly one entry for each distinct CustomerID in daily_activity

>> SELECT COUNT(*) FROM demo.dim_customer;
9889

Let's run the below code for deducing the customer activity level since sign-up
by replacing <MIN_DATE> with '2017-01-01'

SELECT
	DATEDIFF(ac.ActivityDate, dc.AcquisitionDate) AS DaySinceAcquisition,
	COUNT(DISTINCT ac.CustomerId) AS D1ActiveCustomers
FROM dim_customer dc
LEFT OUTER JOIN (
	SELECT
		CustomerId, 
		ActivityDate
	FROM daily_activity
	WHERE 
		ActivityType IN (0, 1)
		AND ActivityDate >= '<MIN_DATE>'
	GROUP BY 1,2
) ac 
	ON dc.CustomerId = ac.CustomerId 
	AND dc.AcquisitionDate <= ac.ActivityDate
WHERE 
	dc.AcquisitionDate >= '<MIN_DATE>'
GROUP BY 1
DaySinceAcquisition D1ActiveCustomers
0 9889
1 665
2 673
3 708
4 700
5 739

Now let's see the decay pattern, which we expect to be constant (uniformly distributed and not really decaying)
And it looks ugly but is exactly what we expected. In real life, it should display an actual decaying pattern.

mysql-demo's People

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.