Hi there 👋, my name is Anirban Sen
Data/Applied Scientist with 6+ years of experience working with Fortune 500 clients across the e-commerce, retail and technology industries, applying Machine Learning and Deep Learning (Natural Language Processing and Computer Vision) to solve and automate complex business problems and support data-driven decision making.
Projects:
- Visual Search for Zappos | Tools used: CNN, PyTorch, AWS, Gradio (Ongoing)
- Built image search for the website using a fine-tuned CLIP model to classify the uploaded image and extract features from product images (Recall@10 of ~70%), with Elasticsearch indexes returning similar images in real time (<2 seconds) via nearest-neighbour search
- Outfit Builder/Complementary Product Recommendation Model | Tools used: CNN, Keras, AWS, Gradio
- Improved the CTR of the existing ‘Complete The Look’ recommendation widget by 25% in an A/B test by developing a custom deep-learning outfit prediction model with ResNetv2 and SBERT as backbone models, trained on a 45k+ outfit dataset curated from open-source data and deployed using DynamoDB
- Customer Age Group Prediction | Tools used: Python, CatBoost, Keras, Attention Models, AWS
- Increased CTR on landing page banners by 30% by serving different banner versions per age group. Created an end-to-end age group prediction model with training labels extracted from a 3rd-party tool. Features included static purchase features from Zappos and Amazon as well as purchase sequences; the model was an ensemble of CatBoost and an attention-based model for the purchase sequences
- Order Item Level Returns Predictions for Zappos and other channels | Tools used: Python, CatBoost, SHAP, AWS
- Improved prediction F1-score by ~30% over a rule-based approach, to correctly modify the pricing engine. Built an ML model using historic customer-level, item-level, cart-level and overall-returns features for 3 websites. Used CatBoost and SHAP to surface the most important factor behind each item's predicted return
- Aspect-based Sentiment Analysis on Customer feedback data | Tools used: Python, BERT, SageMaker
- Reduced vendor costs by $60k p.a. by building an in-house ML model that identifies topics such as Search, Check-out and Price in a given review, as well as the customer sentiment (positive, negative, neutral) for each topic present, using separate BERT models with ~90% average accuracy. Also built a sub-topic classification model using Naïve Bayes
- Predictive process monitoring (Partial) | Tools used: Keras, LSTM
- Building a generative LSTM model to predict the most likely continuation of an ongoing case and its remaining time, using a training set of ~1M case sequences of servicing orders
- Multi-Phase Time to Complete Classification model | Tools used: Python, NLTK, TF-IDF, CatBoost, SHAP
- Reduced the project cycle time for the deployment of IT infrastructure by ~15% by building an end-to-end pipeline that predicts cases likely to take longer than expected, sends email alerts to PMs, and suggests the next best actions based on SHAP values. Streamlined processes for monitoring, retraining, logging and testing.
- Customer Segmentation | Tools used: Python, KMeans Clustering, Silhouette Scoring, SQL
- Increased spend of targeted customers by ~7% by helping the business design personalized promotions, based on customer segmentation by RFM and product preferences, to replace existing nationwide promotions
- B2B Turnover Forecasting | Tools used: Python, Linear Regression, Random Forest, SQL
- Built an explainable forecasting model that quantifies the impact of controllable and uncontrollable features on each forecast, keeping CV error below ~5%, using regression models on 4 years of daily sales data.
- Lookalike Targeting | Tools used: Python, XGBoost, SHAP, SQL
- Increased the response rate for a nationwide campaign by 12% by building a purchase propensity (probability scoring) model to efficiently target only the most probable buyers among customers who hadn't yet bought
- Collaborative filtering based Recommendation engine | Tools used: Apache Spark MLlib, Java, MySQL
- Built a collaborative filtering based recommendation engine with an RMSE of ~1 on predicting user ratings in the range [1,5], using the ALS algorithm from Apache Spark MLlib on the MovieLens 1M movie rating dataset
Skills: Python, SQL, Machine Learning, NLP, Computer Vision, scikit-learn, Keras, PyTorch, Apache Spark MLlib, MLOps, Model Interpretation
- 🔭 I’m currently working on this page.