API Reference

Technical reference for GeoLift classes and functions.

Core Classes

GeoLiftAnalyzer

Main class for running causal inference analyses.

from geolift.analyzer import GeoLiftAnalyzer

analyzer = GeoLiftAnalyzer()

Methods

load_data(filepath, **kwargs)

Load dataset for analysis.

Parameters:

  • filepath (str): Path to CSV file

  • date_column (str, optional): Name of date column. Default: 'date'

  • geo_column (str, optional): Name of geographic unit column. Default: 'geo'

  • outcome_column (str, optional): Name of outcome variable. Default: 'sales'

Returns: None
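
The defaults above suggest a long-format CSV with one row per date and geographic unit. The sketch below builds such a file in memory under that assumption; the market IDs and sales figures are made up for illustration, and the exact layout GeoLift expects may differ.

```python
import csv
import io
from datetime import date, timedelta

# Hypothetical long-format data: one row per (date, geo) pair.
# Column names follow the load_data defaults: 'date', 'geo', 'sales'.
start = date(2023, 1, 2)
rows = [
    {"date": (start + timedelta(weeks=i)).isoformat(), "geo": geo, "sales": 1000 + 10 * i + geo}
    for i in range(4)
    for geo in (501, 502, 503)
]

# Serialize to CSV text; write csv_text to a file to pass its path to load_data.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["date", "geo", "sales"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text.splitlines()[0])
```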

Example:

analyzer.load_data('data.csv', date_column='week', outcome_column='revenue')

run_analysis(**config)

Execute the complete GeoLift analysis.

Parameters:

  • treatment_start_date (str): Start date of treatment (YYYY-MM-DD)

  • treatment_end_date (str): End date of treatment (YYYY-MM-DD)

  • treatment_markets (list): List of treated market IDs

  • control_markets (list, optional): List of control market IDs

  • outcome_column (str): Name of outcome variable

  • inference_method (str): 'bootstrap', 'placebo', or 'jackknife'

  • confidence_level (float): Confidence level (0.0-1.0). Default: 0.95

Returns: GeoLiftResults object

Example:

results = analyzer.run_analysis(
    treatment_start_date='2023-06-01',
    treatment_end_date='2023-08-31',
    treatment_markets=[502, 503],
    outcome_column='sales',
    inference_method='bootstrap'
)
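
To give a feel for what inference_method='bootstrap' and confidence_level mean together, here is a minimal percentile-bootstrap sketch. It is not GeoLift's implementation: the bootstrap_ci helper and the weekly lift values are hypothetical, and the library may resample differently (for example over markets rather than weeks).

```python
import random
import statistics

def bootstrap_ci(values, confidence_level=0.95, n_bootstrap=1000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_bootstrap)
    )
    alpha = 1.0 - confidence_level
    lower_idx = int((alpha / 2) * (n_bootstrap - 1))
    upper_idx = int((1 - alpha / 2) * (n_bootstrap - 1))
    return means[lower_idx], means[upper_idx]

# Hypothetical per-week lift estimates (treated minus synthetic control).
weekly_lift = [120.0, 95.0, 140.0, 110.0, 130.0, 105.0, 125.0, 115.0]
lower, upper = bootstrap_ci(weekly_lift)
print(f"95% CI for mean weekly lift: ({lower:.1f}, {upper:.1f})")
```

A higher confidence_level widens the interval, because the cut-offs move further into the tails of the resampled means.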

DonorEvaluator

Class for identifying optimal control markets.

# The donor evaluator is available as a standalone script in recipes/
import sys
import os
sys.path.append(os.path.join(os.path.dirname(__file__), 'recipes'))
from donor_evaluator import DonorEvaluator

evaluator = DonorEvaluator()

Methods

evaluate_donors(treatment_markets, pre_period_start, pre_period_end, **kwargs)

Find best control markets for treatment units.

Parameters:

  • treatment_markets (list): List of treatment market IDs

  • pre_period_start (str): Start of pre-treatment period (YYYY-MM-DD)

  • pre_period_end (str): End of pre-treatment period (YYYY-MM-DD)

  • outcome_column (str): Name of outcome variable

  • min_correlation (float, optional): Minimum correlation threshold. Default: 0.7

  • max_donors (int, optional): Maximum number of donors. Default: 10

Returns: DonorResults object
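
The min_correlation and max_donors parameters describe a screen-then-rank step. The sketch below illustrates that idea only, assuming a plain Pearson correlation over the pre-treatment period; screen_donors, pearson, and the sample series are hypothetical, and DonorEvaluator may use other criteria as well.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def screen_donors(treated_series, candidates, min_correlation=0.7, max_donors=10):
    """Keep candidates above the threshold, best first, capped at max_donors."""
    scored = {geo: pearson(treated_series, series) for geo, series in candidates.items()}
    eligible = [(geo, r) for geo, r in scored.items() if r >= min_correlation]
    eligible.sort(key=lambda item: item[1], reverse=True)
    return eligible[:max_donors]

# Hypothetical pre-period weekly sales.
treated = [100, 110, 105, 120, 130, 125]
candidates = {
    601: [50, 56, 52, 61, 66, 63],        # tracks the treated market closely
    602: [80, 78, 83, 79, 81, 80],        # roughly flat, weak relationship
    603: [200, 190, 195, 180, 170, 176],  # moves in the opposite direction
}
donors = screen_donors(treated, candidates)
print([geo for geo, _ in donors])
```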

PowerCalculator

Class for power analysis and experimental design.

from geolift.analyzer import PowerCalculator

power_calc = PowerCalculator()

Methods

calculate_power(treatment_markets, control_markets, baseline_data, **kwargs)

Calculate statistical power and minimum detectable effect.

Parameters:

  • treatment_markets (list): List of treatment market IDs

  • control_markets (list): List of control market IDs

  • baseline_data (DataFrame): Historical data for power calculation

  • campaign_duration_weeks (int): Planned campaign duration

  • alpha (float, optional): Significance level. Default: 0.05

  • power (float, optional): Desired statistical power. Default: 0.80

Returns: PowerResults object
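
For intuition on how alpha and power relate to the minimum detectable effect, the standard normal-approximation formula is MDE = (z_{1-alpha/2} + z_{power}) × SE. The sketch below applies it with a hypothetical standard error; PowerCalculator may well use a simulation-based approach instead, so treat this as background rather than the library's method.

```python
from statistics import NormalDist

def minimum_detectable_effect(standard_error, alpha=0.05, power=0.80):
    """Two-sided MDE under a normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return (z_alpha + z_power) * standard_error

# Hypothetical standard error of the lift estimate, as a fraction of baseline sales.
mde = minimum_detectable_effect(standard_error=0.02)
print(f"MDE at alpha=0.05, power=0.80: {mde:.3f}")
```

With the defaults, the multiplier is roughly 2.8, so the smallest reliably detectable lift is about 2.8 standard errors.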

Result Classes

GeoLiftResults

Contains analysis results and methods for interpretation.

Attributes

  • absolute_lift (float): Absolute treatment effect

  • relative_lift (float): Relative treatment effect (percentage)

  • confidence_interval (tuple): Lower and upper confidence bounds

  • p_value (float): Statistical significance p-value

  • roi (float): Return on investment

  • incremental_revenue (float): Total incremental revenue
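
The arithmetic below shows one common way these quantities relate to each other. The definitions are assumptions (in particular, ROI is taken as net return over spend, and the outcome is assumed to be revenue); the figures are hypothetical and GeoLift's exact formulas are not specified here.

```python
# Hypothetical post-period weekly totals for the treated markets.
observed = [1200.0, 1250.0, 1300.0, 1280.0]
counterfactual = [1100.0, 1150.0, 1180.0, 1170.0]  # synthetic control prediction
campaign_spend = 200.0                             # hypothetical media cost

# Absolute lift: total gap between observed and counterfactual outcomes.
absolute_lift = sum(o - c for o, c in zip(observed, counterfactual))

# Relative lift: absolute lift as a percentage of the counterfactual total.
relative_lift = 100.0 * absolute_lift / sum(counterfactual)

# When the outcome is revenue, absolute lift doubles as incremental revenue.
incremental_revenue = absolute_lift

# Assumed ROI definition: net incremental revenue per unit of spend.
roi = (incremental_revenue - campaign_spend) / campaign_spend

print(f"absolute={absolute_lift:.0f}, relative={relative_lift:.2f}%, roi={roi:.2f}")
```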

Methods

summary()

Print comprehensive results summary.

plot_time_series()

Generate time series plot showing treatment vs synthetic control.

plot_lift_analysis()

Create pre/post treatment comparison plots.

export_html_report(filename, **kwargs)

Export results to HTML report.

Parameters:

  • filename (str): Output filename

  • include_technical_details (bool, optional): Include statistical details. Default: True

DonorResults

Contains donor evaluation results.

Attributes

  • best_donors (list): List of optimal control market IDs

  • donor_weights (dict): Weights for each donor market

  • correlation_matrix (DataFrame): Correlation between treatment and potential donors
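
In synthetic control methods, donor weights are typically combined into a weighted sum of the donor series to form the counterfactual. The sketch below assumes donor_weights maps market ID to weight in that sense; the weights and sales values are hypothetical.

```python
# Hypothetical donor weights (summing to 1) and weekly sales per donor market.
donor_weights = {601: 0.6, 602: 0.3, 603: 0.1}
donor_sales = {
    601: [100.0, 110.0, 120.0],
    602: [200.0, 190.0, 210.0],
    603: [50.0, 60.0, 55.0],
}

n_weeks = len(next(iter(donor_sales.values())))

# Synthetic control for each week: weighted sum across donor markets.
synthetic_control = [
    sum(weight * donor_sales[geo][week] for geo, weight in donor_weights.items())
    for week in range(n_weeks)
]
print([round(value, 1) for value in synthetic_control])
```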

Methods

plot_donor_map()

Visualize donor markets on geographic map.

summary()

Print donor evaluation summary.

Utility Functions

Data Validation

from geolift.data_handler import DataValidator

validator = DataValidator()
report = validator.validate_dataset(data, required_columns, date_column, geo_column)
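
The contents of the validation report are not documented here. As a rough idea of the kind of checks a validator for geo panel data usually performs, the sketch below tests for missing columns, empty values, and duplicate (geo, date) rows. The basic_checks helper is hypothetical and works on plain dictionaries rather than whatever data type DataValidator accepts.

```python
from collections import Counter

def basic_checks(rows, required_columns, date_column, geo_column):
    """Return a list of human-readable problems found in `rows` (a list of dicts)."""
    problems = []
    # The date and geo columns are always needed, even if not listed as required.
    needed = list(dict.fromkeys([*required_columns, date_column, geo_column]))
    present = set(rows[0]) if rows else set()
    missing = [col for col in needed if col not in present]
    if missing:
        problems.append(f"missing columns: {missing}")
        return problems  # later checks need these columns
    for col in needed:
        empty = sum(1 for row in rows if row[col] in (None, ""))
        if empty:
            problems.append(f"{empty} empty value(s) in '{col}'")
    # Each market should appear at most once per date.
    pair_counts = Counter((row[geo_column], row[date_column]) for row in rows)
    duplicates = [pair for pair, count in pair_counts.items() if count > 1]
    if duplicates:
        problems.append(f"duplicate (geo, date) rows: {duplicates}")
    return problems

# Hypothetical rows with one empty outcome and one duplicated market-week.
rows = [
    {"date": "2023-01-02", "geo": 501, "sales": 100.0},
    {"date": "2023-01-02", "geo": 502, "sales": None},
    {"date": "2023-01-02", "geo": 501, "sales": 105.0},
]
problems = basic_checks(rows, ["sales"], date_column="date", geo_column="geo")
for problem in problems:
    print(problem)
```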

Configuration Management

from geolift.config_manager import ConfigManager

config = ConfigManager()
config.load_from_yaml('config.yaml')
settings = config.get_analysis_config()

CLI Commands

Basic Analysis

# Run complete analysis
python -m geolift analyze \
    --data data.csv \
    --treatment-start 2023-06-01 \
    --treatment-markets 502,503 \
    --output results/

Power Analysis

# Calculate power
python -m geolift power \
    --data data.csv \
    --treatment-markets 502,503 \
    --duration 12 \
    --mde 0.05

Donor Evaluation

# Find optimal donors
python -m geolift donors \
    --data data.csv \
    --treatment-markets 502,503 \
    --pre-start 2023-01-01 \
    --pre-end 2023-05-31

Configuration Schema

YAML Configuration

# Complete configuration example
data:
  filepath: "data/campaign_data.csv"
  date_column: "date"
  geo_column: "geo_id"
  outcome_column: "sales"

treatment:
  start_date: "2023-06-01"
  end_date: "2023-08-31"
  markets: [502, 503, 504]

analysis:
  pre_period_weeks: 24
  inference_method: "bootstrap"
  confidence_level: 0.95
  n_bootstrap: 1000

donors:
  auto_select: true
  min_correlation: 0.7
  max_donors: 10
  exclude_markets: [501, 599]

output:
  directory: "results/"
  formats: ["html", "csv"]
  include_plots: true
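
The YAML keys do not match the run_analysis parameter names one-to-one (for example, treatment.start_date versus treatment_start_date). The sketch below shows one plausible mapping, using a dictionary that mirrors the parsed YAML. The to_run_analysis_kwargs helper is hypothetical; ConfigManager.get_analysis_config() may already return settings in a different shape.

```python
def to_run_analysis_kwargs(cfg):
    """Map a parsed configuration dict onto run_analysis keyword arguments."""
    treatment = cfg["treatment"]
    analysis = cfg["analysis"]
    return {
        "treatment_start_date": treatment["start_date"],
        "treatment_end_date": treatment["end_date"],
        "treatment_markets": treatment["markets"],
        "outcome_column": cfg["data"]["outcome_column"],
        "inference_method": analysis["inference_method"],
        # 0.95 is the documented default for confidence_level.
        "confidence_level": analysis.get("confidence_level", 0.95),
    }

# Dictionary mirroring the relevant parts of the YAML example above.
cfg = {
    "data": {"filepath": "data/campaign_data.csv", "outcome_column": "sales"},
    "treatment": {"start_date": "2023-06-01", "end_date": "2023-08-31",
                  "markets": [502, 503, 504]},
    "analysis": {"inference_method": "bootstrap", "confidence_level": 0.95},
}
kwargs = to_run_analysis_kwargs(cfg)
print(kwargs)
```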

Error Handling

Common Exceptions

from geolift.exceptions import (
    InsufficientDataError,
    InvalidConfigurationError,
    AnalysisError
)

try:
    results = analyzer.run_analysis(**config)
except InsufficientDataError as e:
    print(f"Not enough data: {e}")
except InvalidConfigurationError as e:
    print(f"Configuration error: {e}")
except AnalysisError as e:
    print(f"Analysis failed: {e}")

Advanced Usage

Custom Synthetic Control

from sparsesc import fit, estimate_effects

# Direct SparseSC usage for advanced users
sc_results = fit(
    features=X,
    targets=Y,
    treated_units=treated_units,
    control_units=control_units
)

effects = estimate_effects(
    sc_results,
    post_treatment_data=Y_post
)

Batch Processing

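# Process multiple campaigns.
# config1 and config2 are dicts of run_analysis() arguments plus a 'data_path' key.
campaigns = [
    {'name': 'Q1_Campaign', 'config': config1},
    {'name': 'Q2_Campaign', 'config': config2}
]

results = {}
for campaign in campaigns:
    # Split off 'data_path' so only analysis arguments reach run_analysis()
    run_config = dict(campaign['config'])
    data_path = run_config.pop('data_path')

    # Use a fresh analyzer per campaign so no state carries over
    analyzer = GeoLiftAnalyzer()
    analyzer.load_data(data_path)
    results[campaign['name']] = analyzer.run_analysis(**run_config)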

Custom Inference Methods

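import numpy as np

# Implement custom inference
class CustomInference:
    def __init__(self, method='custom'):
        self.method = method

    def calculate_pvalues(self, effects, null_distribution):
        # Example rule: two-sided empirical p-value, i.e. the share of null
        # draws at least as extreme as each observed effect.
        # Assumes 1-D inputs; the shapes GeoLift passes are not documented here.
        effects = np.atleast_1d(effects)
        null = np.abs(np.asarray(null_distribution))
        p_values = np.array([(null >= abs(e)).mean() for e in effects])
        return p_values

# Use with analyzer
analyzer.set_inference_method(CustomInference())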

For more examples and advanced usage patterns, see Advanced Topics.