- Spark 1.3.1
- `com.databricks:spark-csv_2.10:1.2.0` Spark package (included with Spark, no need to install, see `PYSPARK_SUBMIT_ARGS`)
- Python 2.7
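If you ever need to point a local PySpark session at the spark-csv package yourself, one common approach is to set `PYSPARK_SUBMIT_ARGS` before importing `pyspark`. This is only a minimal sketch, not the project's own setup; check how the project actually populates `PYSPARK_SUBMIT_ARGS`, and note that the exact value depends on your Spark distribution.

```python
import os

# Sketch: expose the spark-csv package to a local PySpark session via
# PYSPARK_SUBMIT_ARGS. This must be set before `pyspark` is imported.
# Note: some Spark versions expect the value to end with "pyspark-shell";
# adjust to match the Spark 1.3.1 distribution bundled with this project.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--packages com.databricks:spark-csv_2.10:1.2.0 pyspark-shell"
)
```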
- Git clone the project, then download Spark 1.3.1 and extract it into this project directory:

  ```
  git clone https://github.com/mozilla/fxa-retention-metrics.git
  cd fxa-retention-metrics
  make install
  ```
- Run `source ./local/bin/activate` in your terminal.
- Run the csv script to generate random data: `python tools/generate_mock_csv.py` (a rough sketch of this kind of generator appears after this list).
- Run one of the `metrics/` scripts to test the graphs on sample data: `python metrics/retention_events_signed.py` (a sketch of such a script also appears after this list).
- Work on new metrics scripts in `/metrics`; once a script is ready, create a new conversion script for it under `/books`.
- Run the conversion script, e.g. `python books/retention_events_signed.py` (a sketch of what a conversion script might emit follows this list).
- Upload your new `.ipynb` to a local Spark UI for testing, or to telemetry-dash.mozilla.org.
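The repository's `tools/generate_mock_csv.py` is the authoritative generator; the following is only a hypothetical sketch of producing random event rows as CSV. The column names (`uid`, `event`, `timestamp`), event types, and output file are invented for illustration and likely differ from the real script.

```python
import csv
import random
import time
import uuid

# Hypothetical sketch of a mock-data generator; column names, event types,
# and output path are invented and likely differ from the real script.
EVENTS = ["signed", "login", "account_created"]

with open("mock_events.csv", "wb") as fh:  # Python 2.7: open csv output in binary mode
    writer = csv.writer(fh)
    writer.writerow(["uid", "event", "timestamp"])
    for _ in range(1000):
        writer.writerow([
            uuid.uuid4().hex,                              # fake user id
            random.choice(EVENTS),                         # fake event name
            int(time.time()) - random.randint(0, 90 * 24 * 3600),  # within last 90 days
        ])
```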
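For orientation, here is a heavily simplified sketch of what a `metrics/` script might look like on Spark 1.3 with the spark-csv package. The real `retention_events_signed.py` certainly differs; the input path, column names, and the filter-and-count logic here are assumptions.

```python
from pyspark import SparkContext
from pyspark.sql import SQLContext

# Hypothetical sketch: load a CSV through the spark-csv data source
# (Spark 1.3-era SQLContext.load API) and count "signed" events.
sc = SparkContext(appName="retention_events_signed_sketch")
sqlContext = SQLContext(sc)

df = sqlContext.load(
    source="com.databricks.spark.csv",  # spark-csv data source
    header="true",                      # first row holds column names
    path="mock_events.csv",             # assumed path to the mock data
)

signed = df.filter(df.event == "signed")
print signed.count()                    # Python 2.7 print statement

sc.stop()
```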
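The `books/` conversion scripts presumably build on the templates in `/ipynb_generators`, so the following is only a minimal sketch of the general idea: an `.ipynb` file is just JSON in the nbformat-4 schema, so a conversion script can wrap a metrics script's source in a notebook code cell. Every path and detail here is an assumption.

```python
import json

# Minimal sketch: write an nbformat-4 notebook containing one code cell
# whose source is a metrics script. The real conversion scripts likely
# use the templates in /ipynb_generators instead of this bare structure.
source = open("metrics/retention_events_signed.py").read()

notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,   # serialized as null, as nbformat expects
            "metadata": {},
            "outputs": [],
            "source": source,
        }
    ],
}

with open("ipynb/dev/retention_events_signed.ipynb", "w") as fh:  # assumed output path
    json.dump(notebook, fh, indent=2)
```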
The project also provides commands (see the Makefile) to:

- Rebuild the books after changes are made to files in `/ipynb_generators`.
- Test the scripts.
- Install Spark and do the other project setup (this is what `make install` does).
- Run Spark locally to manually test whether the books in `/ipynb/dev/*` work. After Spark loads you will be able to reach the Spark Web UI; navigate to `/ipynb/dev` in your browser, open the notebook, and run (>|) through all the cells to get a graph.
Gist Source: https://gist.github.com/vladikoff/9d2df4558299cf9c1795