Data · ML · Case Study

Digital Marketing
Conversion Predictor

ML pipeline to predict which leads convert — deployed as an interactive Streamlit dashboard.

Python · Scikit-learn · Streamlit · Plotly
0.8767 F1-Score
0.8395 Recall (Converted)
0.7339 ROC-AUC
8,000 Leads Analysed
15 Features
01

Challenge

ConvertIQ, a digital marketing agency, needed a data-driven solution to identify which leads were most likely to convert. Their sales team was spending equal effort on all leads — wasting resources on low-probability prospects while missing high-intent customers.

The core business problem: predict conversion likelihood in real time so the sales team could prioritise outreach, optimise campaign spend, and increase ROI without increasing headcount.

02

Approach

We built an end-to-end ML pipeline across three business requirements: customer behaviour analysis, conversion prediction, and campaign ROI intelligence.

  • Exploratory analysis on 8,000 leads — 16 features after cleaning
  • Statistical hypothesis validation using Pearson, Spearman, and Chi-square tests
  • Feature engineering with OrdinalEncoder for categorical features, plus SMOTE to handle class imbalance
  • Random Forest Classifier tuned via RandomizedSearchCV across 7 hyperparameters
  • Pipeline serialised with joblib and served through a Streamlit app deployed on Heroku
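
The modelling steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column subset is illustrative, and the seven hyperparameter names and ranges are assumptions (the case study only states that seven were tuned). The SMOTE step is left out so that the sketch depends on scikit-learn alone; with imbalanced-learn installed it would typically be added as a step in `imblearn.pipeline.Pipeline`.

```python
import io

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Synthetic stand-in data: column names and values are illustrative only.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "CampaignChannel": rng.choice(["Email", "PPC", "Social"], size=n),
    "TimeOnSite": rng.uniform(0.5, 15.0, size=n),
    "PagesPerVisit": rng.uniform(1.0, 10.0, size=n),
})
# Conversion loosely depends on engagement so the model has some signal.
y = (X["TimeOnSite"] + rng.normal(0, 3, size=n) > 7).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Ordinal-encode the categorical column, pass numeric columns through.
preprocess = ColumnTransformer(
    [("cat",
      OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
      ["CampaignChannel"])],
    remainder="passthrough",
)
pipeline = Pipeline([
    ("preprocess", preprocess),
    # A SMOTE resampling step would sit here (imblearn.pipeline.Pipeline).
    ("model", RandomForestClassifier(random_state=0)),
])

# Seven hyperparameters; the specific names and ranges are assumptions.
param_distributions = {
    "model__n_estimators": [50, 100, 200],
    "model__max_depth": [None, 5, 10],
    "model__min_samples_split": [2, 5, 10],
    "model__min_samples_leaf": [1, 2, 4],
    "model__max_features": ["sqrt", "log2"],
    "model__bootstrap": [True, False],
    "model__criterion": ["gini", "entropy"],
}
search = RandomizedSearchCV(
    pipeline, param_distributions, n_iter=5, scoring="f1", cv=3, random_state=0
)
search.fit(X_train, y_train)

# Serialise the fitted pipeline with joblib and load it back.
buffer = io.BytesIO()
joblib.dump(search.best_estimator_, buffer)
buffer.seek(0)
restored = joblib.load(buffer)
test_predictions = restored.predict(X_test)
```

Serialising the whole pipeline, rather than the model alone, keeps the encoder and the classifier together, so the dashboard can pass raw lead features straight to `predict`.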

Key finding: engagement metrics (TimeOnSite, PagesPerVisit) are stronger predictors than campaign channel or ad spend. Converted leads spend 47% more time on site on average.
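
As a rough illustration of how such a finding might be checked, the sketch below runs the three named tests on synthetic data and computes the relative difference in mean time on site. The variable names, effect sizes, and the three-channel setup are assumptions made for the example; the numbers it produces are unrelated to the project's results.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 1000
converted = rng.integers(0, 2, size=n)
# Synthetic engagement: converted leads get a higher mean time on site.
time_on_site = rng.normal(5.0, 1.0, size=n) + 2.0 * converted
# Synthetic channel label, independent of conversion by construction.
channel = rng.integers(0, 3, size=n)

# Linear and rank correlation between engagement and conversion.
pearson_r, pearson_p = stats.pearsonr(time_on_site, converted)
spearman_rho, spearman_p = stats.spearmanr(time_on_site, converted)

# Chi-square test of independence on a channel x converted contingency table.
table = np.zeros((3, 2), dtype=int)
np.add.at(table, (channel, converted), 1)
chi2, chi2_p, dof, expected = stats.chi2_contingency(table)

# Relative uplift in mean time on site for converted vs non-converted leads.
mean_converted = time_on_site[converted == 1].mean()
mean_not_converted = time_on_site[converted == 0].mean()
uplift_pct = 100.0 * (mean_converted - mean_not_converted) / mean_not_converted
```

A figure such as "47% more time on site" corresponds to `uplift_pct` in this sketch: the difference in group means divided by the non-converted mean.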

03

Deliverables

  • Interactive Streamlit dashboard — 6 pages covering project summary, behaviour analysis, real-time predictor, model performance, ROI analysis, and hypothesis validation
  • Conversion Predictor — input 15 lead features, get instant binary prediction + probability score
  • ML Pipeline — Random Forest model with full preprocessing, serialised with joblib
  • Statistical Analysis — 3 validated hypotheses with visualisations and business recommendations
  • ROI Dashboard — campaign spend vs revenue, average ROI per category, monthly trends
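
The predictor's "binary prediction + probability score" output could look something like the sketch below. It is a simplified illustration: `score_lead` is a hypothetical helper, only two of the 15 features are used, the model is trained on synthetic data, and the 0.5 decision threshold is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["TimeOnSite", "PagesPerVisit"]  # illustrative subset of features


def score_lead(model, lead):
    """Return (binary prediction, conversion probability) for one lead dict."""
    row = pd.DataFrame([lead], columns=FEATURES)
    probability = float(model.predict_proba(row)[0, 1])
    return int(probability >= 0.5), probability


# Train a small stand-in model on synthetic data.
rng = np.random.default_rng(1)
train = pd.DataFrame({
    "TimeOnSite": rng.uniform(0.5, 15.0, size=300),
    "PagesPerVisit": rng.uniform(1.0, 10.0, size=300),
})
labels = (train["TimeOnSite"] > 7.0).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=1).fit(train, labels)

leads = [
    {"TimeOnSite": 12.0, "PagesPerVisit": 6.0},
    {"TimeOnSite": 2.0, "PagesPerVisit": 3.0},
]
scored = [score_lead(model, lead) for lead in leads]

# Rank leads by conversion probability, highest first, for outreach priority.
ranked = sorted(zip(leads, scored), key=lambda pair: pair[1][1], reverse=True)
```

In a Streamlit page, the lead dictionary would come from input widgets and the returned pair would be displayed to the user.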

04

Results

The model exceeded both defined success criteria:

F1-Score · 0.8767 · Exceeded target of ≥ 0.80
Recall (Converted) · 0.8395 · Exceeded minimum of ≥ 0.75
ROC-AUC · 0.7339 · Moderate class discrimination
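
For readers unfamiliar with these metrics, the sketch below shows how they are commonly computed with scikit-learn and compared against the two thresholds. The labels and scores are a toy example, unrelated to the project's test set, and the 0.5 cut-off is an assumption.

```python
import numpy as np
from sklearn.metrics import f1_score, recall_score, roc_auc_score

# Toy ground truth and predicted conversion probabilities (illustrative only).
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_score = np.array([0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1])
y_pred = (y_score >= 0.5).astype(int)  # assumed 0.5 decision threshold

metrics = {
    "f1": f1_score(y_true, y_pred),
    "recall_converted": recall_score(y_true, y_pred, pos_label=1),
    # ROC-AUC is computed from the probabilities, not the thresholded labels.
    "roc_auc": roc_auc_score(y_true, y_score),
}

# Success criteria as stated in the case study.
targets = {"f1": 0.80, "recall_converted": 0.75}
meets_target = {name: metrics[name] >= floor for name, floor in targets.items()}
```

F1 and recall depend on the chosen decision threshold, while ROC-AUC summarises ranking quality across all thresholds, which is why the three figures can tell different stories about the same model.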

The pipeline enables the sales team to rank leads by conversion probability, focus outreach on high-intent prospects, and reduce acquisition costs — transforming raw engagement data into actionable business intelligence.
