# User Guide

This guide walks you through the complete GeoLift workflow for measuring marketing campaign effectiveness using causal inference.

## Business Value & Positioning

### What Problem We Solve

GeoLift answers the most critical marketing question: **"How much incremental revenue did my campaign actually generate?"** It separates true campaign impact from natural market fluctuations using rigorous causal inference.

### Why This Matters for Your Business

- **Prove ROI**: Get statistical confidence (not just correlation) in your marketing returns
- **Optimize Spend**: Identify which regional campaigns deliver the best incremental results
- **Avoid False Positives**: Stop attributing natural growth to your campaigns
- **Support Budget Decisions**: Provide rigorous evidence to leadership and finance teams
- **Competitive Advantage**: Make data-driven decisions while competitors rely on assumptions

### Where GeoLift Fits in Your Measurement Stack

**Primary Use Cases:**

- Regional advertising campaigns (TV, radio, outdoor, digital geo-targeting)
- Store rollouts and expansion strategies
- Local market tests and pilot programs
- Geo-targeted promotional campaigns

**Complements (doesn't replace):**

- **Media Mix Modeling (MMM)**: GeoLift provides ground truth for MMM calibration
- **Attribution**: Adds causal rigor to last-touch and multi-touch attribution
- **A/B Testing**: Use when randomized testing isn't feasible
- **Brand Studies**: Provides sales impact to complement brand awareness metrics

**Integration Points:**

- Exports to your existing BI tools (Tableau, Power BI, etc.)
- API integration with analytics platforms
- Compatible with Google Analytics and Adobe Analytics data
- Works with your existing data warehouse

### Investment & ROI

**Implementation:**

- **Timeline**: 2-4 weeks from contract to first analysis
- **Resources**: 1 analyst + IT support for data integration
- **Training**: 2-day workshop for your team

**Expected Returns:**

- **Immediate**: Stop wasting budget on ineffective campaigns (typically 10-30% savings)
- **Short-term**: Optimize active campaigns for better performance (15-25% lift improvement)
- **Long-term**: Build institutional knowledge for better campaign planning (compound returns)
- **Typical ROI**: 5-10x within the first year through improved optimization

---

## Overview

GeoLift measures the true sales lift and ROI from your regional marketing campaigns using advanced causal inference methods. It follows a proven 3-step workflow to ensure reliable results.

## The 3-Step Workflow

### Step 1: Find a Fair Comparison (Donor Evaluation)

Before measuring impact, we need to identify control markets that behave similarly to your test markets.
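"Behaves similarly" can be made concrete before running the full donor evaluation: markets whose pre-period series track the treatment market closely are better control candidates. The sketch below is illustrative only — the market names and numbers are made up, and it uses plain pandas rather than any GeoLift API:

```python
import numpy as np
import pandas as pd

# Synthetic weekly pre-period sales for one treatment market and three
# hypothetical donor candidates (all names and values are invented).
rng = np.random.default_rng(42)
weeks = pd.date_range("2023-01-02", periods=20, freq="W")
trend = np.linspace(100, 120, len(weeks))  # shared upward market trend

data = pd.DataFrame({
    "treatment_502": trend + rng.normal(0, 2, len(weeks)),
    "donor_501": trend + rng.normal(0, 2, len(weeks)),        # tracks closely
    "donor_505": 0.5 * trend + rng.normal(0, 2, len(weeks)),  # smaller market, same trend
    "donor_601": rng.normal(110, 10, len(weeks)),             # unrelated noise
}, index=weeks)

# Rank candidates by pre-period correlation with the treatment market:
# high correlation suggests a fair comparison, low correlation a poor one.
correlations = (
    data.drop(columns="treatment_502")
        .corrwith(data["treatment_502"])
        .sort_values(ascending=False)
)
print(correlations)
```

A proper donor evaluation weighs more than raw correlation (levels, trends, geography), but a quick correlation screen like this is a useful sanity check on your own data.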
#### Data Requirements

Your dataset should include:

- **Time Series Data**: At least 12-24 weeks of pre-campaign data
- **Geographic Units**: Markets, DMAs, states, or regions
- **Outcome Metrics**: Sales, conversions, or other KPIs
- **Treatment Assignment**: Which markets received the campaign

#### Running Donor Evaluation

```python
# Use the donor evaluator from recipes
import sys
import os
sys.path.append(os.path.join(os.path.dirname(__file__), 'recipes'))

from donor_evaluator import DonorEvaluator

evaluator = DonorEvaluator()
evaluator.load_data('campaign_data.csv')

# Find best control markets
donor_results = evaluator.evaluate_donors(
    treatment_markets=[502, 503, 504],
    pre_period_start='2023-01-01',
    pre_period_end='2023-05-31',
    outcome_column='sales'
)

# Review donor quality
print(donor_results.summary())
donor_results.plot_donor_map()
```

#### What to Look For

- **High Correlation**: Control markets should track closely with treatment markets pre-campaign
- **Similar Trends**: Parallel movement in the pre-period
- **Geographic Diversity**: Avoid clustering all controls in one region

### Step 2: Check if the Test is Strong Enough (Power Analysis)

Power analysis determines whether your experiment can detect meaningful effects.
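The intuition can be sketched with the standard two-sided z-test approximation: the minimum detectable effect scales with the noise in your outcome and shrinks with the square root of the campaign length. This is a textbook back-of-envelope formula, not GeoLift's internal power computation, and `minimum_detectable_effect` is an illustrative helper, not part of the package:

```python
from statistics import NormalDist

def minimum_detectable_effect(noise_sd, n_periods, alpha=0.05, power=0.80):
    """Smallest true per-period lift detectable at the given significance
    level and power, assuming the lift estimate's standard error shrinks
    with the square root of the number of campaign periods (a simplification)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    standard_error = noise_sd / n_periods ** 0.5
    return (z_alpha + z_power) * standard_error

# Example: weekly residual noise of 500 units, 12-week campaign.
mde = minimum_detectable_effect(noise_sd=500, n_periods=12)
print(f"MDE ≈ {mde:,.0f} units per week")  # roughly 400 units/week
```

Note that doubling the campaign length only cuts the MDE by about 30% (a factor of 1/√2), which is why the power calculator reports a recommended minimum duration rather than assuming longer is always practical.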
```python
from geolift.analyzer import PowerCalculator

power_calc = PowerCalculator()

# Calculate minimum detectable effect
power_results = power_calc.calculate_power(
    treatment_markets=[502, 503, 504],
    control_markets=donor_results.best_donors,
    baseline_data=your_data,
    campaign_duration_weeks=12,
    alpha=0.05,  # Significance level
    power=0.80   # Desired power
)

print(f"Minimum Detectable Effect: {power_results.mde:.1%}")
print(f"Recommended Campaign Duration: {power_results.min_duration} weeks")
```

#### Power Analysis Outputs

- **Minimum Detectable Effect (MDE)**: Smallest lift you can reliably detect
- **Recommended Duration**: How long to run the campaign for reliable results
- **Sample Size Requirements**: Number of markets needed

### Step 3: Measure the Lift (GeoLift Analysis)

Run the main causal inference analysis to measure campaign impact.

```python
from geolift.analyzer import GeoLiftAnalyzer

analyzer = GeoLiftAnalyzer()
analyzer.load_data('campaign_data.csv')

# Configure analysis
config = {
    'treatment_start_date': '2023-06-01',
    'treatment_end_date': '2023-08-31',
    'treatment_markets': [502, 503, 504],
    'control_markets': donor_results.best_donors,
    'outcome_column': 'sales',
    'inference_method': 'bootstrap',  # or 'placebo', 'jackknife'
    'confidence_level': 0.95
}

# Run analysis
results = analyzer.run_analysis(**config)
```

## Understanding Your Results

### Key Metrics Explained

#### Causal Impact

- **Absolute Lift**: Raw units of incremental impact
- **Relative Lift**: Percentage increase over baseline
- **Confidence Intervals**: Range of plausible effect sizes

#### Statistical Significance

- **P-value**: Probability of seeing an effect at least this large if the campaign had no true impact
- **Confidence Level**: How certain we are about the effect estimate
- **Statistical Power**: Ability to detect true effects

#### Business Impact

- **Incremental ROI**: Return on marketing investment
- **Cost Per Incremental Unit**: Efficiency of campaign spend
- **Payback Period**: Time to recover campaign investment

### Interpreting Results

```python
# Print comprehensive summary
print(results.summary())

# Key business metrics
print(f"Campaign generated {results.absolute_lift:,.0f} incremental units")
print(f"Relative lift of {results.relative_lift:.1%}")
print(f"ROI of {results.roi:.1f}x")
print(f"P-value: {results.p_value:.3f}")
```

### Visual Diagnostics

Generate plots to validate your analysis:

```python
# Time series plot showing treatment vs synthetic control
results.plot_time_series()

# Pre/post comparison
results.plot_lift_analysis()

# Diagnostic plots for model validation
results.plot_diagnostics()

# Geographic visualization
results.plot_geo_map()
```

## Data Preparation Best Practices

### Data Quality Requirements

- **Completeness**: No missing values in key periods
- **Consistency**: Same measurement methodology throughout
- **Granularity**: Weekly or daily data preferred over monthly
- **Baseline Period**: At least 12 weeks of pre-campaign data

### Common Data Issues

- **Seasonality**: Account for holidays and seasonal patterns
- **External Events**: Note major market disruptions
- **Data Breaks**: Ensure consistent measurement methodology
- **Outliers**: Identify and handle extreme values appropriately

### Data Validation

```python
from geolift.data_handler import DataValidator

validator = DataValidator()
validation_report = validator.validate_dataset(
    data=your_data,
    required_columns=['date', 'geo', 'sales', 'treatment'],
    date_column='date',
    geo_column='geo'
)

print(validation_report.summary())
```

## Configuration Options

### Analysis Parameters

```yaml
# config.yaml
analysis:
  treatment_start_date: "2023-06-01"
  treatment_end_date: "2023-08-31"
  pre_period_weeks: 24
  outcome_column: "sales"

inference:
  method: "bootstrap"  # bootstrap, placebo, jackknife
  n_bootstrap: 1000
  confidence_level: 0.95

validation:
  min_pre_period_weeks: 12
  max_missing_data_pct: 0.05
  outlier_threshold: 3.0
```

### Advanced Options

```python
# Custom donor selection
analyzer.set_custom_donors(
    donor_markets=[501, 505, 506, 507],
    donor_weights=[0.4, 0.3, 0.2, 0.1]
)

# Multiple treatment cohorts
analyzer.analyze_multiple_cohorts(
    cohort_1={'markets': [502, 503], 'start_date': '2023-06-01'},
    cohort_2={'markets': [504, 505], 'start_date': '2023-07-01'}
)
```

## Reporting and Export

### Generate Business Reports

```python
# HTML report for stakeholders
results.export_html_report(
    filename='campaign_results.html',
    include_technical_details=False
)

# Detailed CSV export for analysis
results.export_csv_report('detailed_results.csv')

# Executive summary
results.export_executive_summary('exec_summary.pdf')
```

### Custom Reporting

```python
# Create custom summary
summary_data = {
    'campaign_name': 'Q3 Brand Campaign',
    'lift_estimate': results.absolute_lift,
    'lift_ci_lower': results.confidence_interval[0],
    'lift_ci_upper': results.confidence_interval[1],
    'p_value': results.p_value,
    'roi': results.roi,
    'campaign_cost': 50000,
    'incremental_revenue': results.incremental_revenue
}

# Export to your preferred format
import pandas as pd
pd.DataFrame([summary_data]).to_csv('campaign_summary.csv')
```

## Troubleshooting Common Issues

### Poor Pre-Period Fit

**Problem**: The synthetic control doesn't match treatment markets well before the campaign.

**Solutions**:

- Extend pre-period data
- Remove outlier periods
- Try different donor selection criteria

### Low Statistical Power

**Problem**: The test cannot detect meaningful effects.

**Solutions**:

- Extend campaign duration
- Include more treatment markets
- Use more sensitive outcome metrics

### Implausible Results

**Problem**: Effect sizes seem too large or too small.

**Solutions**:

- Check data quality and definitions
- Validate treatment assignment
- Review external factors during the campaign period

## Next Steps

- **Need technical details?** → See [API Reference](API_REFERENCE.md)
- **Want to understand the math?** → Check [Advanced Topics](ADVANCED_TOPICS.md)
- **Having specific issues?** → Review [FAQ](FAQ.md)
- **Ready for production?** → See deployment guides in Advanced Topics