User Guide
This guide walks you through the complete GeoLift workflow for measuring marketing campaign effectiveness using causal inference.
Business Value & Positioning
What Problem We Solve
GeoLift answers the most critical marketing question: “How much incremental revenue did my campaign actually generate?” It separates true campaign impact from natural market fluctuations using rigorous causal inference.
Why This Matters for Your Business
Prove ROI: Get statistical confidence (not just correlation) in your marketing returns
Optimize Spend: Identify which regional campaigns deliver the best incremental results
Avoid False Positives: Stop attributing natural growth to your campaigns
Support Budget Decisions: Provide rigorous evidence to leadership and finance teams
Competitive Advantage: Make data-driven decisions while competitors rely on assumptions
Where GeoLift Fits in Your Measurement Stack
Primary Use Cases:
Regional advertising campaigns (TV, radio, outdoor, digital geo-targeting)
Store rollouts and expansion strategies
Local market tests and pilot programs
Geo-targeted promotional campaigns
Complements (doesn’t replace):
Media Mix Modeling (MMM): GeoLift provides ground truth for MMM calibration
Attribution: Adds causal rigor to last-touch and multi-touch attribution
A/B Testing: Use GeoLift when user-level randomized testing isn’t feasible
Brand Studies: Provides sales impact to complement brand awareness metrics
Integration Points:
Exports to your existing BI tools (Tableau, Power BI, etc.)
API integration with analytics platforms
Compatible with Google Analytics, Adobe Analytics data
Works with your existing data warehouse
Investment & ROI
Implementation:
Timeline: 2-4 weeks from contract to first analysis
Resources: 1 analyst + IT support for data integration
Training: 2-day workshop for your team
Expected Returns:
Immediate: Stop wasting budget on ineffective campaigns (typically 10-30% savings)
Short-term: Optimize active campaigns for better performance (15-25% lift improvement)
Long-term: Build institutional knowledge for better campaign planning (compound returns)
Typical ROI: 5-10x within the first year through improved optimization
Overview
GeoLift measures the true sales lift and ROI from your regional marketing campaigns using advanced causal inference methods. It follows a proven 3-step workflow to ensure reliable results.
The 3-Step Workflow
Step 1: Find a Fair Comparison (Donor Evaluation)
Before measuring impact, we need to identify control markets that behave similarly to your test markets.
Data Requirements
Your dataset should include:
Time Series Data: At least 12 weeks (ideally 24 or more) of pre-campaign data
Geographic Units: Markets, DMAs, states, or regions
Outcome Metrics: Sales, conversions, or other KPIs
Treatment Assignment: Which markets received the campaign
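A minimal sketch of what this long-format dataset can look like, using made-up numbers. The column names match the `required_columns` used later in the DataValidator example; everything else is illustrative.

```python
import pandas as pd

# One row per geographic unit per week. Column names mirror the
# DataValidator example elsewhere in this guide.
data = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-02", "2023-01-02",
                            "2023-01-09", "2023-01-09"]),
    "geo": [502, 601, 502, 601],
    "sales": [1200.0, 980.0, 1150.0, 1010.0],
    "treatment": [1, 0, 1, 0],  # 1 = market received the campaign
})

# Sanity check: a balanced panel (every geo observed in every period).
counts = data.groupby("geo")["date"].nunique()
assert counts.nunique() == 1, "unbalanced panel: geos have differing week counts"
print(data)
```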
Running Donor Evaluation
# Use the donor evaluator from recipes
import sys
import os
sys.path.append(os.path.join(os.path.dirname(__file__), 'recipes'))
from donor_evaluator import DonorEvaluator
evaluator = DonorEvaluator()
evaluator.load_data('campaign_data.csv')
# Find best control markets
donor_results = evaluator.evaluate_donors(
    treatment_markets=[502, 503, 504],
    pre_period_start='2023-01-01',
    pre_period_end='2023-05-31',
    outcome_column='sales'
)
# Review donor quality
print(donor_results.summary())
donor_results.plot_donor_map()
What to Look For
High Correlation: Control markets should track closely with treatment markets pre-campaign
Similar Trends: Parallel movement in the pre-period
Geographic Diversity: Avoid clustering all controls in one region
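One way to screen candidates on the correlation criterion above, shown with hypothetical weekly sales. The market names and the 0.9 cutoff are illustrative choices, not part of the GeoLift API.

```python
import pandas as pd

# Hypothetical weekly sales: one treatment market and two candidates.
weeks = pd.date_range("2023-01-02", periods=8, freq="W-MON")
panel = pd.DataFrame({
    "treat_502": [100, 104, 99, 107, 111, 108, 115, 117],
    "cand_601":  [98, 103, 97, 106, 110, 107, 113, 116],  # tracks closely
    "cand_702":  [50, 80, 45, 90, 40, 95, 42, 99],        # noisy, poor donor
}, index=weeks)

# Pearson correlation of each candidate with the treatment series.
corrs = panel.corr()["treat_502"].drop("treat_502")
good_donors = corrs[corrs > 0.9].index.tolist()
print(corrs)
print("donors passing the 0.9 screen:", good_donors)
```

High pre-period correlation is necessary but not sufficient; trends and geography should still be reviewed as listed above.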
Step 2: Check if the Test is Strong Enough (Power Analysis)
Power analysis determines if your experiment can detect meaningful effects.
from geolift.analyzer import PowerCalculator
power_calc = PowerCalculator()
# Calculate minimum detectable effect
power_results = power_calc.calculate_power(
    treatment_markets=[502, 503, 504],
    control_markets=donor_results.best_donors,
    baseline_data=your_data,
    campaign_duration_weeks=12,
    alpha=0.05,  # Significance level
    power=0.80   # Desired power
)
print(f"Minimum Detectable Effect: {power_results.mde:.1%}")
print(f"Recommended Campaign Duration: {power_results.min_duration} weeks")
Power Analysis Outputs
Minimum Detectable Effect (MDE): Smallest lift you can reliably detect
Recommended Duration: How long to run the campaign for reliable results
Sample Size Requirements: Number of markets needed
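The intuition behind these outputs can be sketched with a small simulation: inject a known lift into noisy weekly data and count how often a naive z-test detects it. This is an illustration only, not the method PowerCalculator uses.

```python
import numpy as np

rng = np.random.default_rng(0)

def detection_rate(lift, weeks=12, noise_sd=0.05, n_sims=2000, alpha_z=1.96):
    """Share of simulated campaigns where a simple two-sided z-test
    on the treatment-control mean difference flags the injected lift."""
    baseline = 1.0
    detected = 0
    for _ in range(n_sims):
        control = rng.normal(baseline, noise_sd, weeks)
        treated = rng.normal(baseline * (1 + lift), noise_sd, weeks)
        diff = treated.mean() - control.mean()
        se = np.sqrt(2 * noise_sd**2 / weeks)
        detected += abs(diff / se) > alpha_z
    return detected / n_sims

for lift in (0.01, 0.03, 0.05):
    print(f"lift {lift:.0%}: power ~ {detection_rate(lift):.2f}")
```

Larger lifts, longer campaigns, and lower noise all raise the detection rate, which is why the MDE and recommended duration move together.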
Step 3: Measure the Lift (GeoLift Analysis)
Run the main causal inference analysis to measure campaign impact.
from geolift.analyzer import GeoLiftAnalyzer
analyzer = GeoLiftAnalyzer()
analyzer.load_data('campaign_data.csv')
# Configure analysis
config = {
    'treatment_start_date': '2023-06-01',
    'treatment_end_date': '2023-08-31',
    'treatment_markets': [502, 503, 504],
    'control_markets': donor_results.best_donors,
    'outcome_column': 'sales',
    'inference_method': 'bootstrap',  # or 'placebo', 'jackknife'
    'confidence_level': 0.95
}
# Run analysis
results = analyzer.run_analysis(**config)
Understanding Your Results
Key Metrics Explained
Causal Impact
Absolute Lift: Raw units of incremental impact
Relative Lift: Percentage increase over baseline
Confidence Intervals: Range of plausible effect sizes
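The two lift metrics relate by simple arithmetic; the observed and counterfactual totals below are made up for illustration.

```python
# Observed sales in treated markets vs. the synthetic-control estimate
# of what would have happened without the campaign (illustrative numbers).
observed = 120_000
counterfactual = 110_000

absolute_lift = observed - counterfactual       # incremental units
relative_lift = absolute_lift / counterfactual  # lift over baseline
print(f"{absolute_lift:,} units, {relative_lift:.1%} lift")
```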
Statistical Significance
P-value: Probability of seeing a lift at least this large if the campaign had no effect
Confidence Level: Coverage of the reported interval (e.g., 95%)
Statistical Power: Probability of detecting a true effect of a given size
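As an illustration of how a bootstrap interval (the default inference method in the example config) can be formed, here is a sketch on hypothetical weekly lift estimates. The analyzer’s actual resampling scheme may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical weekly incremental-lift estimates over an 8-week campaign.
weekly_lift = np.array([800, 950, 700, 1100, 900, 850, 1000, 750])

# Resample weeks with replacement and take percentile bounds on the total.
boot = [rng.choice(weekly_lift, size=len(weekly_lift), replace=True).sum()
        for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"total lift: {weekly_lift.sum():,}  95% CI: [{lo:,.0f}, {hi:,.0f}]")
```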
Business Impact
Incremental ROI: Return on marketing investment
Cost Per Incremental Unit: Efficiency of campaign spend
Payback Period: Time to recover campaign investment
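The business metrics follow from simple arithmetic on the lift estimate; the cost, margin, and campaign duration below are made-up inputs.

```python
# Illustrative inputs (not real results).
campaign_cost = 50_000      # total media spend
incremental_units = 8_000   # absolute lift from the analysis
margin_per_unit = 12.50     # contribution margin per incremental unit
campaign_weeks = 12

incremental_revenue = incremental_units * margin_per_unit
roi = incremental_revenue / campaign_cost
cost_per_unit = campaign_cost / incremental_units
payback_weeks = campaign_cost / (incremental_revenue / campaign_weeks)

print(f"Incremental ROI: {roi:.1f}x")
print(f"Cost per incremental unit: ${cost_per_unit:.2f}")
print(f"Payback period: {payback_weeks:.1f} weeks")
```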
Interpreting Results
# Print comprehensive summary
print(results.summary())
# Key business metrics
print(f"Campaign generated {results.absolute_lift:,.0f} incremental units")
print(f"Relative lift of {results.relative_lift:.1%}")
print(f"ROI of {results.roi:.1f}x")
print(f"P-value: {results.p_value:.3f}")
Visual Diagnostics
Generate plots to validate your analysis:
# Time series plot showing treatment vs synthetic control
results.plot_time_series()
# Pre/post comparison
results.plot_lift_analysis()
# Diagnostic plots for model validation
results.plot_diagnostics()
# Geographic visualization
results.plot_geo_map()
Data Preparation Best Practices
Data Quality Requirements
Completeness: No missing values in key periods
Consistency: Same measurement methodology throughout
Granularity: Weekly or daily data preferred over monthly
Baseline Period: At least 12 weeks of pre-campaign data
Common Data Issues
Seasonality: Account for holidays and seasonal patterns
External Events: Note major market disruptions
Data Breaks: Ensure consistent measurement methodology
Outliers: Identify and handle extreme values appropriately
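Two of these issues, missing values and outliers, can be screened with a few lines of pandas. The 3.0 z-score cutoff matches the `outlier_threshold` in the example config later in this guide; the sales figures are invented.

```python
import numpy as np
import pandas as pd

# 21 weekly sales values with one spike and one missing observation.
sales = pd.Series(
    [100, 98, 104, 99, 97, 101, 111, 107, 108, 103, 102,
     96, 105, 99, 100, 104, 98, 106, 102, 300, np.nan]
)

# Missing-data share (compare against max_missing_data_pct).
missing_pct = sales.isna().mean()
print(f"missing: {missing_pct:.1%}")

# Flag values more than 3 standard deviations from the mean.
z = (sales - sales.mean()) / sales.std()
outlier_idx = sales[z.abs() > 3.0].index.tolist()
print("outlier positions:", outlier_idx)
```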
Data Validation
from geolift.data_handler import DataValidator
validator = DataValidator()
validation_report = validator.validate_dataset(
    data=your_data,
    required_columns=['date', 'geo', 'sales', 'treatment'],
    date_column='date',
    geo_column='geo'
)
print(validation_report.summary())
Configuration Options
Analysis Parameters
# config.yaml
analysis:
  treatment_start_date: "2023-06-01"
  treatment_end_date: "2023-08-31"
  pre_period_weeks: 24
  outcome_column: "sales"

inference:
  method: "bootstrap"  # bootstrap, placebo, jackknife
  n_bootstrap: 1000
  confidence_level: 0.95

validation:
  min_pre_period_weeks: 12
  max_missing_data_pct: 0.05
  outlier_threshold: 3.0
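A sketch of how these settings might be checked before running an analysis. The dict keys mirror the example YAML above; they are illustrative, not a fixed schema.

```python
# Config mirroring the example YAML (illustrative values).
config = {
    "analysis": {"pre_period_weeks": 24, "outcome_column": "sales"},
    "inference": {"method": "bootstrap", "n_bootstrap": 1000,
                  "confidence_level": 0.95},
    "validation": {"min_pre_period_weeks": 12,
                   "max_missing_data_pct": 0.05,
                   "outlier_threshold": 3.0},
}

def check_config(cfg):
    """Return a list of human-readable problems; empty means OK."""
    problems = []
    if cfg["analysis"]["pre_period_weeks"] < cfg["validation"]["min_pre_period_weeks"]:
        problems.append("pre-period shorter than the configured minimum")
    if cfg["inference"]["method"] not in {"bootstrap", "placebo", "jackknife"}:
        problems.append("unknown inference method")
    if not 0 < cfg["inference"]["confidence_level"] < 1:
        problems.append("confidence_level must be in (0, 1)")
    return problems

print(check_config(config))  # [] when the config is consistent
```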
Advanced Options
# Custom donor selection
analyzer.set_custom_donors(
    donor_markets=[501, 505, 506, 507],
    donor_weights=[0.4, 0.3, 0.2, 0.1]
)
# Multiple treatment cohorts
analyzer.analyze_multiple_cohorts(
    cohort_1={'markets': [502, 503], 'start_date': '2023-06-01'},
    cohort_2={'markets': [504, 505], 'start_date': '2023-07-01'}
)
Reporting and Export
Generate Business Reports
# HTML report for stakeholders
results.export_html_report(
    filename='campaign_results.html',
    include_technical_details=False
)
# Detailed CSV export for analysis
results.export_csv_report('detailed_results.csv')
# Executive summary
results.export_executive_summary('exec_summary.pdf')
Custom Reporting
# Create custom summary
summary_data = {
    'campaign_name': 'Q3 Brand Campaign',
    'lift_estimate': results.absolute_lift,
    'lift_ci_lower': results.confidence_interval[0],
    'lift_ci_upper': results.confidence_interval[1],
    'p_value': results.p_value,
    'roi': results.roi,
    'campaign_cost': 50000,
    'incremental_revenue': results.incremental_revenue
}
# Export to your preferred format
import pandas as pd
pd.DataFrame([summary_data]).to_csv('campaign_summary.csv')
Troubleshooting Common Issues
Poor Pre-Period Fit
Problem: Synthetic control doesn’t match treatment markets well before the campaign
Solutions:
Extend pre-period data
Remove outlier periods
Try different donor selection criteria
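A quick way to quantify pre-period fit before trusting results is a scale-free error between the treatment and synthetic series. The numbers and the 5-10% rule of thumb below are illustrative.

```python
import numpy as np

# Hypothetical pre-period weekly sales: treatment vs. synthetic control.
treatment = np.array([100, 104, 99, 107, 111, 108])
synthetic = np.array([101, 103, 100, 105, 113, 107])

# Mean absolute percentage error over the pre-period.
mape = np.mean(np.abs(treatment - synthetic) / treatment)
print(f"pre-period MAPE: {mape:.1%}")
# A MAPE above roughly 5-10% suggests revisiting donor selection.
```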
Low Statistical Power
Problem: Cannot detect meaningful effects
Solutions:
Extend campaign duration
Include more treatment markets
Use more sensitive outcome metrics
Implausible Results
Problem: Effect sizes seem too large or small
Solutions:
Check data quality and definitions
Validate treatment assignment
Review external factors during campaign period
Next Steps
Need technical details? → See API Reference
Want to understand the math? → Check Advanced Topics
Having specific issues? → Review FAQ
Ready for production? → See deployment guides in Advanced Topics