pymc-marketing's Introduction


PyMC-Marketing: Bayesian Marketing Mix Modeling (MMM) & Customer Lifetime Value (CLV)

Marketing Analytics Tools from PyMC Labs

Unlock the power of Marketing Mix Modeling (MMM) and Customer Lifetime Value (CLV) analytics with PyMC-Marketing. This open-source marketing analytics tool empowers businesses to make smarter, data-driven decisions for maximizing ROI in marketing campaigns.


This repository is supported by PyMC Labs.

For businesses looking to integrate PyMC-Marketing into their operational framework, PyMC Labs offers expert consulting and training. Our team is proficient in state-of-the-art Bayesian modeling techniques, with a focus on Marketing Mix Models (MMMs) and Customer Lifetime Value (CLV). For more information see here.

Explore these topics further by watching our video on Bayesian Marketing Mix Models: State of the Art.

Community Resources

Quick Installation Guide for Marketing Mix Modeling (MMM) & CLV

To dive into MMM and CLV analytics, set up a specialized environment, marketing_env, via conda-forge:

conda create -c conda-forge -n marketing_env pymc-marketing
conda activate marketing_env

For a comprehensive installation guide, refer to the official PyMC installation documentation.

Docker

We provide a Dockerfile to build a Docker image for PyMC-Marketing so that it is accessible from a Jupyter Notebook. See here for more details.

In-depth Bayesian Marketing Mix Modeling (MMM) in PyMC

Leverage our Bayesian MMM API to tailor your marketing strategies effectively. Based on the research Jin, Yuxue, et al. “Bayesian methods for media mix modeling with carryover and shape effects.” (2017), and integrating the expertise from core PyMC developers, our API provides:

  • Custom Priors and Likelihoods: Tailor your model to your specific business needs by including domain knowledge via prior distributions.
  • Adstock Transformation: Optimize the carry-over effects in your marketing channels.
  • Saturation Effects: Understand the diminishing returns in media investments.
  • Time-varying Intercept: Capture time-varying baseline contributions in your model (using modern and efficient Gaussian processes approximation methods).
  • Visualization and Model Diagnostics: Get a comprehensive view of your model's performance and insights.
  • Out-of-sample Predictions: Forecast future marketing performance with credible intervals. Use this for simulations and scenario planning.
  • Budget Optimization: Allocate your marketing spend efficiently across various channels for maximum ROI.
  • Experiment Calibration: Fine-tune your model based on empirical experiments for a more unified view of marketing.
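To make the adstock and saturation bullets above concrete, here is a minimal NumPy sketch of a geometric carry-over followed by a logistic saturation curve. It illustrates the two ideas only and is not the library's implementation; the function names, the parameters `alpha`, `l_max` and `lam`, and the toy spend series are assumptions made for this example.

```python
import numpy as np


def geometric_adstock(x, alpha, l_max):
    """Carry a decaying fraction of each period's spend into later periods."""
    weights = alpha ** np.arange(l_max)  # weights for lags 0 .. l_max - 1
    # Full convolution, truncated back to the original series length.
    return np.convolve(x, weights)[: len(x)]


def logistic_saturation(x, lam):
    """Map non-negative spend onto [0, 1) with diminishing returns."""
    return (1 - np.exp(-lam * x)) / (1 + np.exp(-lam * x))


# Toy weekly spend for one channel (illustrative values only).
spend = np.array([100.0, 0.0, 0.0, 50.0, 0.0])
adstocked = geometric_adstock(spend, alpha=0.5, l_max=3)
saturated = logistic_saturation(adstocked / adstocked.max(), lam=2.0)
```

With `alpha=0.5`, half of each week's effect survives into the next week, and `l_max=3` cuts the carry-over off after three weeks.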

MMM Quickstart

import pandas as pd
from pymc_marketing.mmm import DelayedSaturatedMMM

data_url = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/data/mmm_example.csv"
data = pd.read_csv(data_url, parse_dates=['date_week'])

mmm = DelayedSaturatedMMM(
    date_column="date_week",
    channel_columns=["x1", "x2"],
    control_columns=[
        "event_1",
        "event_2",
        "t",
    ],
    adstock_max_lag=8,
    yearly_seasonality=2,
)

Initiate fitting and get a visualization of some of the outputs with:

X = data.drop("y", axis=1)
y = data["y"]
mmm.fit(X, y)
mmm.plot_components_contributions()

Once the model is fitted, we can further optimize our budget allocation as we are including diminishing returns and carry-over effects in our model.

Explore a hands-on simulated example for more insights into MMM with PyMC-Marketing.

Essential Reading for Marketing Mix Modeling (MMM)

Unlock Customer Lifetime Value (CLV) with PyMC

Understand and optimize your customer's value with our CLV models. Our API supports various types of CLV models, catering to both contractual and non-contractual settings, as well as continuous and discrete transaction modes.

Explore our detailed CLV examples using data from the lifetimes package:

Examples

              Non-contractual     Contractual
Continuous    Buying groceries    Audible
Discrete      Cinema ticket       Monthly or yearly subscriptions

CLV Quickstart

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from pymc_marketing import clv

data_url = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/data/clv_quickstart.csv"
data = pd.read_csv(data_url)
data["customer_id"] = data.index

beta_geo_model = clv.BetaGeoModel(data=data)

beta_geo_model.fit()

Once fitted, we can use the model to predict the number of future purchases for known customers, the probability that they are still alive, and get various visualizations plotted.
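As a rough illustration of what "probability that they are still alive" means for a BG/NBD model, the sketch below evaluates the closed-form expression from the BG/NBD literature for given parameter values. It does not call the PyMC-Marketing API; the function name and the parameter values in the usage lines are assumptions chosen for the example.

```python
import numpy as np


def bgnbd_prob_alive(frequency, recency, T, r, alpha, a, b):
    """P(alive | x, t_x, T) for the BG/NBD model at fixed parameters.

    frequency: number of repeat purchases x
    recency:   time of the last purchase t_x
    T:         length of the observation window
    """
    frequency = np.asarray(frequency, dtype=float)
    recency = np.asarray(recency, dtype=float)
    T = np.asarray(T, dtype=float)
    # Customers with no repeat purchase are alive with probability 1 under
    # BG/NBD; use a placeholder frequency of 1 so the expression stays finite.
    safe_freq = np.where(frequency > 0, frequency, 1.0)
    ratio = (a / (b + safe_freq - 1)) * ((alpha + T) / (alpha + recency)) ** (r + safe_freq)
    return np.where(frequency > 0, 1.0 / (1.0 + ratio), 1.0)


# Hypothetical parameter values, for illustration only.
params = dict(r=0.5, alpha=5.0, a=1.0, b=1.0)
recent_buyer = bgnbd_prob_alive(frequency=2, recency=10.0, T=10.0, **params)
lapsed_buyer = bgnbd_prob_alive(frequency=2, recency=2.0, T=10.0, **params)
```

A customer whose last purchase was long ago relative to the observation window gets a lower probability than one who bought recently.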

See the Examples section for more on this.

Why PyMC-Marketing vs other solutions?

PyMC-Marketing is and will always be free for commercial use, licensed under Apache 2.0. Developed by core developers behind the popular PyMC package and marketing experts, it provides state-of-the-art measurements and analytics for marketing teams.

Due to its open-source nature and active contributor base, new features are constantly added. Are you missing a feature or want to contribute? Fork our repository and submit a pull request. If you have any questions, feel free to open an issue.

Thanks to our contributors!

https://github.com/pymc-devs/pymc/graphs/contributors

Marketing AI Assistant: MMM-GPT with PyMC-Marketing

Not sure how to start or have questions? MMM-GPT is an AI that answers questions and provides expert advice on marketing analytics using PyMC-Marketing.

Try MMM-GPT here.

📞 Schedule a Free Consultation for MMM & CLV Strategy

Maximize your marketing ROI with a free 30-minute strategy session with our PyMC-Marketing experts. Learn how Bayesian Marketing Mix Modeling and Customer Lifetime Value analytics can boost your organization by making smarter, data-driven decisions.

We provide the following professional services:

  • Custom Models: We tailor niche marketing analytics models to fit your organization's unique needs.
  • Build Within PyMC-Marketing: Our team members are experts leveraging the capabilities of PyMC-Marketing to create robust marketing models for precise insights.
  • SLA & Coaching: Get guaranteed support levels and personalized coaching to ensure your team is well-equipped and confident in using our tools and approaches.
  • SaaS Solutions: Harness the power of our state-of-the-art software solutions to streamline your data-driven marketing initiatives.

pymc-marketing's People

Contributors

abdalazizrashid, ameynen, cetagostini, cetagostini-wise, cluhmann, coltallen, drbenvincent, ferrine, garve, juanitorduz, konkinit, larryshamalama, louismagowan, lucianopaz, maresb, markussagen, michaelraczycki, mustaphau, nialloulton, oriolabril, pre-commit-ci[bot], ricardov94, sangamswadik, takechanman1228, tomicapretto, twiecki, ulfaslak, vincent-grosbois, wd60622, xhulianothe1

pymc-marketing's Issues

Add `setup.py`

Add setup.py so that we can install the python package via python -m pip install -e .

Add delayed adstock function

Add delayed adstock function from Jin, Yuxue, et al. "Bayesian methods for media mix modeling with carryover and shape effects." (2017).
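For orientation, a NumPy sketch of the delayed adstock from Jin et al. (2017): the lag weights are `alpha ** ((lag - theta) ** 2)`, normalized to sum to one, so the effect peaks `theta` periods after the spend. This is a plain-NumPy illustration, not the function that was eventually added to the package; the function and argument names are assumptions.

```python
import numpy as np


def delayed_adstock(x, alpha, theta, l_max):
    """Delayed adstock: weights peak at lag `theta` and decay at rate `alpha`."""
    lags = np.arange(l_max)
    weights = alpha ** ((lags - theta) ** 2)
    weights = weights / weights.sum()  # normalize as in Jin et al. (2017)
    # Zero-padded convolution, truncated to the length of the input.
    return np.convolve(x, weights)[: len(x)]


# Response to a single unit of spend at t = 0.
pulse = np.zeros(8)
pulse[0] = 1.0
response = delayed_adstock(pulse, alpha=0.5, theta=2, l_max=6)
```

Here the response to the pulse is largest two periods after the spend, which is the delayed peak that geometric adstock cannot express.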

add black[jupyter] to the environment

Now we have notebooks, it would be useful to add black[jupyter] to the environment. I have no understanding of how to play with commit hooks, but maybe it's relevant to add in there too?

Should unit tests live outside of the package?

We are shipping the tests folder inside of the pymmmc package. I don't think this is necessary or desirable. I propose that we move the tests folder to the root directory instead.

Current organization:
root/
└─> pymmmc/
    └─> tests/

Proposed:
root/
├─> pymmmc/
└─> tests/

Add study case with MMM + CLV

Create a fake dataset with a continuous non-contractual process and build a story around:

  1. Using MMM to infer cost of acquisition across different channels
  2. Using CLV to infer differential lifetime value of customers coming from different channels
  3. Making a business decision that takes the two sources of information into account
    a. Preferring to invest in a channel with higher CAC because of higher CLV
    b. Binary decision to not further invest in channel if CLV is lower than CAC

Requires: #24, #19, #39

We might use this case-study for the initial announcement of the package

MMM `adstock_max_lag` and `control_data` are model specific

If we intend our MMM class to be a suitable generic base for many media mix models, we should have arguments to __init__ that are as generic as possible. adstock_max_lag is completely specific to a particular way to apply a convolution operation on a vector, and control_data might be handled differently by different MMM subclasses (e.g. it could handle continuous controls differently than categorical controls). We should remove as many model-specific arguments as possible from the __init__ signature, and leave the rest as kwargs that are forwarded into _build_model.

Decide how to handle `date_column` dtype and transformations

At the moment, the date_column is left untouched in the data. This might be ok, but it would be great to support datetime dtypes as inputs. Those can be easily handled in plots of time series, and they are more natural to reason about as opposed to floating point arrays referenced to some onset event and using some unknown resolution in some unknown time zone. The difficulty of supporting datetime dtypes is that we need to:

  1. Ensure that the date_column is a date time compatible dtype
  2. Convert it to a datetime if it isn't using some string format?
  3. Add a transformation to go from datetimes to floating point arrays and back. This is necessary if we want to have time series with seasonal components or other mathematical dependencies on time.
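The three steps above could be sketched with pandas as follows. This is one possible approach under stated assumptions, not the behaviour the package settled on: the helper names, the choice of days as the time unit, and the explicit `origin` argument are all hypothetical.

```python
import pandas as pd


def dates_to_float(dates, origin, unit="D"):
    """Steps 1-3: coerce to datetime, then express as `unit`s elapsed since `origin`."""
    dates = pd.to_datetime(dates)  # raises if the values cannot be parsed
    return (dates - origin) / pd.Timedelta(1, unit=unit)


def float_to_dates(values, origin, unit="D"):
    """Inverse transformation: elapsed time back to datetimes."""
    return origin + pd.to_timedelta(values, unit=unit)


# Weekly dates supplied as strings (illustrative values).
dates = pd.Series(["2021-01-04", "2021-01-11", "2021-01-18"])
origin = pd.Timestamp("2021-01-04")
t = dates_to_float(dates, origin)
round_trip = float_to_dates(t, origin)
```

Keeping `origin` and `unit` explicit addresses the concern raised above about floats "referenced to some onset event and using some unknown resolution".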

MMM Example Notebook: Time Varying Coefficients

To prove the flexibility and potential of the framework (namely, Bayesian stats and PyMC), we would like to have an example notebook to illustrate how to extend the base model (based on simulated data) introduced in #41 by allowing time-varying coefficients via Gaussian processes.

Marginalization over dichotomous variable

I know that this is ongoing work raised in issue 21 of AePPL, but is there a quick way to marginalize over a discrete parameter? Akin to the example provided in the issue linked above:

$$p(Y=y | X=0) * p(X=0) + p(Y=y | X=1) * p(X=1)$$

for some continuous $Y$ and dichotomous $X$. In this case, $X$ would be the variable indicating a customer churning or not and the difference between a contractual and non-contractual likelihood would be that one is the marginalized version of the other. I have not revisited the math in several weeks, but I'm fairly certain of this and the ability to marginalize likelihoods as such may provide better model building blocks rather than define a distribution class for each quadrant. Just an idea so far...
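Outside of any automatic machinery, the sum above can be written by hand in log space with `logsumexp`. The sketch below does this in NumPy/SciPy for two illustrative normal components; the function name, the component distributions, and the value 0.3 for $p(X=1)$ are assumptions for the example, and this is not the AePPL feature being discussed.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm


def marginal_logp(y, p_x1, logp_given_x0, logp_given_x1):
    """log[ p(y | X=0) p(X=0) + p(y | X=1) p(X=1) ], computed stably."""
    terms = np.stack(
        [
            np.log1p(-p_x1) + logp_given_x0(y),  # X = 0 branch
            np.log(p_x1) + logp_given_x1(y),     # X = 1 branch
        ]
    )
    return logsumexp(terms, axis=0)


y = np.array([-1.0, 0.5, 2.0, 4.0])
logp = marginal_logp(
    y,
    p_x1=0.3,
    logp_given_x0=lambda v: norm.logpdf(v, loc=0.0, scale=1.0),
    logp_given_x1=lambda v: norm.logpdf(v, loc=3.0, scale=1.0),
)
```

The same expression can be used inside a custom log-likelihood, with the discrete indicator never sampled.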

Add study case with alternative to gamma-gamma model

If we don't summarize individual transaction values, there should be much more flexibility in how to model a user's latent "spend", e.g., with a time-series component, GLM predictors, and so on.

Would be nice to add a study case of such, perhaps motivating new summary/plotting/prediction functionality of the library.

Implement pre-built `BG/NBD`/`BetaGeoFitter` model

It would be good to add a BetaGeoFitter function that returns a ContNonContract with some default priors. A signature that resembles what is provided in the lifetimes package would be a good idea. Something along the lines of the following snippet.

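import pymc as pm

# ContNonContract is this package's continuous non-contractual distribution.
def BetaGeoFitter(name, a, b, r, alpha, T, T0, *, observed, size=None, shape=None, **kwargs):
    # Prior on the per-customer dropout probability.
    p = pm.Beta(f"{name}_beta", a, b, size=size, shape=shape)
    # Prior on the per-customer purchase rate.
    lam = pm.Gamma(f"{name}_gamma", r, 1/alpha, size=size, shape=shape)
    return ContNonContract(name, lam, p, T, T0, observed=observed, size=size, shape=shape, **kwargs)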

We should also add some useful summary stats / plots. If they are not specific to the BG/NBD, all the better!

Explore variants of the shifted Beta Geometric model

From: Fader, P. S., & Hardie, B. G. (2007). How to project customer retention. Journal of Interactive Marketing, 21(1), 76-90. pdf

They mention this other model derived from the beta-binomial, which is conceptually equivalent:

Their model is based on assumptions similar to those behind the sBG model: (a) Each person responds to a direct-mail solicitation with constant probability p, and (b) p varies across the population according to a beta distribution. While BM base their framework on the beta-binomial model, it could have been derived as an sBG model (e.g., the mailing on which the prospect responds to the offer is characterized by the shifted-geometric distribution). As such, it is possible to identify clear relationships between some of the results in this article [e.g., r_t and S(t)] and some quantities of interest in a list-falloff setting.

Then extensions with cohort covariates:

The BM framework was extended by Rao and Steckel (1995) to incorporate (time-invariant) descriptor variables such as age, income, and sex. This is accomplished using the beta-logistic model (Heckman & Willis, 1977),

Incorporating the effects of time-varying covariates (e.g., marketing-mix effects, seasonality) is more complicated. The key is to bring in all of these factors at the right level; that is, at the level of the latent parameter of interest (in this case, θ) instead of just “jamming” different covariate effects into a regression-like model (see Schweidel, Fader, & Bradlow, 2006, for a discussion of how to do this in a continuous-time contractual setting.)

And extensions with time effects:

Both the sBG model and its continuous-time analog (i.e., the EG model) are based on the assumption that the commonly observed phenomenon of increasing retention rates is due entirely to heterogeneity; individual-customer-level retention rates are assumed to be constant. If we wish to allow for the possibility of time dynamics at the level of the individual customer, we can no longer characterize the duration of an individual’s relationship with the firm using either the shifted-geometric or exponential distributions, both of which have the “memoryless” property (i.e., the probability of survival to s + t, given survival to t, is the same as the initial probability of survival to s). In a continuous-time setting, we can accommodate this effect by assuming that individual lifetimes can be characterized by the Weibull distribution, which allows for an individual’s risk of canceling a contract to increase or decrease as the length of the relationship with the firm increases. In a discrete-time contractual setting, this leads to the beta-discrete-Weibull (BdW) model (Fader & Hardie, 2006), which is a generalization of the sBG model, while in a continuous-time contractual setting, this leads to a generalization of the EG model, the Weibull-gamma (WG) model (Hardie et al., 1998; Morrison & Schmittlein, 1980).
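As a baseline for comparing these variants, the survival function S(t) and retention rate r_t of the plain sBG model have closed forms in Fader & Hardie (2007): S(t) = B(α, β + t) / B(α, β) and r_t = (β + t − 1) / (α + β + t − 1). The sketch below evaluates them with SciPy; the function names and the parameter values are assumptions for illustration.

```python
import numpy as np
from scipy.special import betaln


def sbg_survival(t, alpha, beta):
    """S(t): probability that a customer is still active after t renewal opportunities."""
    t = np.asarray(t, dtype=float)
    return np.exp(betaln(alpha, beta + t) - betaln(alpha, beta))


def sbg_retention(t, alpha, beta):
    """r_t = S(t) / S(t - 1): share of customers active at t - 1 who renew at t."""
    t = np.asarray(t, dtype=float)
    return (beta + t - 1) / (alpha + beta + t - 1)


periods = np.arange(1, 6)
survival = sbg_survival(periods, alpha=1.0, beta=1.0)
retention = sbg_retention(periods, alpha=1.0, beta=1.0)
```

The retention rate rises with t even though every individual has a constant churn probability, which is the heterogeneity effect the last quoted passage refers to.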

Add ROAS plot

Add a method to compute the return of ad spend for certain channels and plot it. The plot could follow the style of figure 3 from Jin et al 2017, which I'll copy down here just as a reference.

image
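One simple way to compute the quantity to plot is to divide each posterior draw of a channel's total contribution by that channel's total spend, which yields a distribution of ROAS values per channel. The sketch below assumes the contributions are already available as an array of shape (draws, time); the function name and the toy numbers are hypothetical and this is not an existing method of the package.

```python
import numpy as np


def roas_samples(contribution_draws, spend):
    """ROAS per posterior draw: total channel contribution / total channel spend."""
    contribution_draws = np.asarray(contribution_draws, dtype=float)
    total_spend = float(np.sum(spend))
    return contribution_draws.sum(axis=-1) / total_spend


# Two made-up posterior draws over three periods, for illustration only.
draws = np.array([[10.0, 20.0, 30.0], [30.0, 30.0, 30.0]])
spend = np.array([10.0, 10.0, 10.0])
roas = roas_samples(draws, spend)
```

The resulting samples can be summarized with a mean and an interval, or passed to a histogram or density plot per channel.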

New default sampler for `ContNonContract` when there are no observations

Currently, the intention is to primarily use ContNonContract to perform inference on observational data. However, without the observed= keyword, our samplers will misbehave as value = [t_x, x] for t_x being the time of the xth observation with x being an integer.

A moment method would be beneficial, but careful thought must be put into the sampler.

Adstock transformation without `for` loop

We would like to write a more efficient implementation of the adstock transformations so that we do not use a for loop. An attempt with scan was (unsuccessfully) implemented in #15

Requirements:

  • We should be able to add the l_max parameter to truncate the size of the effect.
  • Be vectorised
  • Should be faster than the current implementation.
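To illustrate the requirements above, here is a NumPy sketch that compares a double-loop geometric adstock with a loop-free version built on a sliding window, both truncated at `l_max`. It shows the idea only: the package's implementation operates on symbolic tensors rather than NumPy arrays, and the function names are assumptions.

```python
import numpy as np


def adstock_loop(x, alpha, l_max):
    """Reference implementation with explicit loops."""
    out = np.zeros(len(x))
    for t in range(len(x)):
        for lag in range(min(l_max, t + 1)):
            out[t] += alpha**lag * x[t - lag]
    return out


def adstock_vectorized(x, alpha, l_max):
    """Same result without Python loops, via a sliding window over a padded series."""
    x = np.asarray(x, dtype=float)
    padded = np.concatenate([np.zeros(l_max - 1), x])
    # Row t holds x[t - l_max + 1], ..., x[t]; shape (len(x), l_max).
    windows = np.lib.stride_tricks.sliding_window_view(padded, l_max)
    # Oldest lag first, lag 0 last, matching the window layout.
    weights = alpha ** np.arange(l_max)[::-1]
    return windows @ weights


rng = np.random.default_rng(0)
x = rng.uniform(0, 100, size=52)
looped = adstock_loop(x, alpha=0.6, l_max=8)
vectorized = adstock_vectorized(x, alpha=0.6, l_max=8)
```

The window matrix costs `len(x) * l_max` memory as a view only, and the matrix-vector product replaces both loops.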

Take the CLV grid to the next level

The idea is to follow up to #25, and extend the basic models in interesting ways, e.g., by adding hierarchical effects, time-varying effects... and so on. This should give us a more refined idea not only of what building blocks we need, but how flexible they should be.

This will, potentially, also be the biggest selling point of the package, as we will be doing things that are not really done out there (or at least not published in neat papers / packages), in large part by taking advantage of working with a fully-fledged PPL (PyMC!)

image

Unlike #25, these squares are not yet fixed, and any cool idea you have can be used.

  • Continuous Non-contractual + Hierarchical structure (#39) (up for grabs)
  • Continuous Contractual + ??? (up for grabs)
  • Discrete Non-contractual + ??? (up for grabs)
  • Discrete Contractual + cohort / temporal effects (suggested to @drbenvincent), see #35

Possibly the grid won't be over the 4 types of models, but perhaps over extensions:

  • Cohort + Temporal effects on lifetime
  • Cohort + Temporal effects on value
  • Complex interactions between Lifetime and Value components
    • E.g., subscription fee affects churn-rate and value, @juanitorduz brought something that resonates with this
  • THE NEXT BANG IN MARKETING MODELS ???

Add notebooks to fill the basic CLV grid

The idea is to write a notebook with pure PyMC model(s) for each of these CLV scenarios. We can start with the Lifetime part (not-value yet), but ideally we will include value as well by the end. This might be a constant with time-decay penalty in the simplest cases.

Hopefully this will give us a good picture of the building blocks that are necessary for a minimum viable package, and can also serve as the base documentation. Over time we would replace the custom PyMC code with imports from the CLV sub-package.

image

  • Continuous Non-contractual #16
  • Continuous Contractual #36
  • Discrete Non-contractual (up for grabs)
  • Discrete Contractual #32
