API Reference
Technical reference for GeoLift classes and functions.
Core Classes
GeoLiftAnalyzer
Main class for running causal inference analyses.
from geolift.analyzer import GeoLiftAnalyzer
analyzer = GeoLiftAnalyzer()
Methods
load_data(filepath, **kwargs)
Load dataset for analysis.
Parameters:
- filepath (str): Path to CSV file
- date_column (str, optional): Name of date column. Default: 'date'
- geo_column (str, optional): Name of geographic unit column. Default: 'geo'
- outcome_column (str, optional): Name of outcome variable. Default: 'sales'
Returns: None
Example:
analyzer.load_data('data.csv', date_column='week', outcome_column='revenue')
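The default column names suggest a long-format file with one row per date and geographic unit. The sketch below builds such a file with the standard library only. The long-format layout, the market IDs, and the outcome values are assumptions made for illustration, and build_panel_rows is a hypothetical helper, not part of GeoLift.

```python
import csv
import io
from datetime import date, timedelta


def build_panel_rows(geos, start, n_weeks):
    """Return one row per (week, geo) pair using the default column names."""
    rows = []
    for week in range(n_weeks):
        day = start + timedelta(weeks=week)
        for geo in geos:
            # Placeholder outcome values; real data would come from your own source.
            rows.append({"date": day.isoformat(), "geo": geo, "sales": 100 + week})
    return rows


rows = build_panel_rows([501, 502, 503], date(2023, 1, 2), n_weeks=4)

# Serialize to CSV text; write this text to a file to get a path for load_data.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["date", "geo", "sales"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

If your file uses other column names, pass them through date_column, geo_column, and outcome_column as in the example above.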
run_analysis(**config)
Execute the complete GeoLift analysis.
Parameters:
- treatment_start_date (str): Start date of treatment (YYYY-MM-DD)
- treatment_end_date (str): End date of treatment (YYYY-MM-DD)
- treatment_markets (list): List of treated market IDs
- control_markets (list, optional): List of control market IDs
- outcome_column (str): Name of outcome variable
- inference_method (str): 'bootstrap', 'placebo', or 'jackknife'
- confidence_level (float): Confidence level (0.0-1.0). Default: 0.95
Returns: GeoLiftResults object
Example:
results = analyzer.run_analysis(
    treatment_start_date='2023-06-01',
    treatment_end_date='2023-08-31',
    treatment_markets=[502, 503],
    outcome_column='sales',
    inference_method='bootstrap'
)
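Because run_analysis takes its settings as keyword arguments, it can help to check a configuration dict before passing it in. The sketch below is a hypothetical pre-flight check based only on the parameter descriptions above. check_run_config is not part of GeoLift, and treating confidence_level as strictly between 0 and 1 is an assumption.

```python
from datetime import datetime

VALID_INFERENCE_METHODS = {"bootstrap", "placebo", "jackknife"}


def check_run_config(config):
    """Return a list of problems found in a run_analysis-style config dict."""
    problems = []
    dates = {}
    for key in ("treatment_start_date", "treatment_end_date"):
        try:
            dates[key] = datetime.strptime(config[key], "%Y-%m-%d")
        except KeyError:
            problems.append(f"{key} is missing")
        except (TypeError, ValueError):
            problems.append(f"{key} is not in YYYY-MM-DD format")
    if len(dates) == 2 and dates["treatment_end_date"] < dates["treatment_start_date"]:
        problems.append("treatment_end_date is before treatment_start_date")
    if not config.get("treatment_markets"):
        problems.append("treatment_markets is empty or missing")
    method = config.get("inference_method")
    if method not in VALID_INFERENCE_METHODS:
        problems.append(f"unknown inference_method: {method!r}")
    # Assumption: the bounds 0.0 and 1.0 themselves are not meaningful levels.
    level = config.get("confidence_level", 0.95)
    if not 0.0 < level < 1.0:
        problems.append("confidence_level must be strictly between 0 and 1")
    return problems


good_config = {
    "treatment_start_date": "2023-06-01",
    "treatment_end_date": "2023-08-31",
    "treatment_markets": [502, 503],
    "outcome_column": "sales",
    "inference_method": "bootstrap",
}
problems = check_run_config(good_config)  # empty list for this config
```

An empty list means the dict is at least shaped correctly; the library may apply further checks of its own.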
DonorEvaluator
Class for identifying optimal control markets.
# The donor evaluator is available as a standalone script in recipes/
import sys
import os
sys.path.append(os.path.join(os.path.dirname(__file__), 'recipes'))
from donor_evaluator import DonorEvaluator
evaluator = DonorEvaluator()
Methods
evaluate_donors(treatment_markets, pre_period_start, pre_period_end, **kwargs)
Find best control markets for treatment units.
Parameters:
- treatment_markets (list): List of treatment market IDs
- pre_period_start (str): Start of pre-treatment period (YYYY-MM-DD)
- pre_period_end (str): End of pre-treatment period (YYYY-MM-DD)
- outcome_column (str): Name of outcome variable
- min_correlation (float, optional): Minimum correlation threshold. Default: 0.7
- max_donors (int, optional): Maximum number of donors. Default: 10
Returns: DonorResults object
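To make min_correlation and max_donors concrete, the sketch below ranks candidate markets by their pre-period correlation with the treated total. It illustrates the general idea with pandas and is not the DonorEvaluator implementation. rank_donors, the wide table layout (one row per date, one column per market, already restricted to the pre-treatment period), and the sample values are all assumptions.

```python
import pandas as pd


def rank_donors(wide, treatment_markets, min_correlation=0.7, max_donors=10):
    """Rank candidate markets by correlation with the summed treated series."""
    treated_total = wide[treatment_markets].sum(axis=1)
    candidates = wide.drop(columns=treatment_markets)
    # Pearson correlation of each candidate column with the treated total.
    correlations = candidates.corrwith(treated_total)
    eligible = correlations[correlations >= min_correlation]
    return eligible.sort_values(ascending=False).head(max_donors)


wide = pd.DataFrame({
    502: [10, 12, 14, 16, 18],   # treated
    503: [20, 22, 24, 26, 28],   # treated
    601: [5, 6, 7, 8, 9],        # moves with the treated markets
    602: [9, 8, 7, 6, 5],        # moves against them
    603: [3, 9, 4, 8, 5],        # only weakly related
})
donors = rank_donors(wide, [502, 503])
```

With this toy data only market 601 clears the 0.7 threshold; 602 is negatively correlated and 603 is too weakly correlated.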
PowerCalculator
Class for power analysis and experimental design.
from geolift.analyzer import PowerCalculator
power_calc = PowerCalculator()
Methods
calculate_power(treatment_markets, control_markets, baseline_data, **kwargs)
Calculate statistical power and minimum detectable effect.
Parameters:
- treatment_markets (list): List of treatment market IDs
- control_markets (list): List of control market IDs
- baseline_data (DataFrame): Historical data for power calculation
- campaign_duration_weeks (int): Planned campaign duration
- alpha (float, optional): Significance level. Default: 0.05
- power (float, optional): Desired statistical power. Default: 0.80
Returns: PowerResults object
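The relationship between alpha, power, and the minimum detectable effect can be illustrated with the textbook normal approximation, MDE = (z_{1-alpha/2} + z_{power}) x SE. The sketch below uses that approximation with the standard library only. It is not necessarily how PowerCalculator computes its results, and both function names are hypothetical.

```python
from statistics import NormalDist


def minimum_detectable_effect(standard_error, alpha=0.05, power=0.80):
    """Two-sided MDE under a normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * standard_error


def approximate_power(effect, standard_error, alpha=0.05):
    """Approximate two-sided power, ignoring the far-tail rejection region."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(effect) / standard_error - z_alpha)


mde = minimum_detectable_effect(standard_error=1.0)  # about 2.80 standard errors
achieved = approximate_power(mde, standard_error=1.0)  # recovers the 0.80 target
```

Under this approximation, a lower alpha or a higher power target raises the effect size a design can reliably detect, while a smaller standard error (for example, from a longer campaign or more markets) lowers it.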
Result Classes
GeoLiftResults
Contains analysis results and methods for interpretation.
Attributes
- absolute_lift (float): Absolute treatment effect
- relative_lift (float): Relative treatment effect (percentage)
- confidence_interval (tuple): Lower and upper confidence bounds
- p_value (float): Statistical significance p-value
- roi (float): Return on investment
- incremental_revenue (float): Total incremental revenue
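As a rough guide to how the two lift attributes relate, the sketch below uses a common convention: absolute lift is the observed outcome minus the synthetic-control counterfactual, and relative lift is that difference as a percentage of the counterfactual. Whether GeoLiftResults uses exactly these definitions is an assumption, and lift_from_totals is a hypothetical helper.

```python
def lift_from_totals(observed_total, counterfactual_total):
    """Compute absolute and relative lift from post-period totals."""
    absolute = observed_total - counterfactual_total
    # Relative lift is expressed as a percentage of the counterfactual.
    relative_pct = 100.0 * absolute / counterfactual_total
    return {"absolute_lift": absolute, "relative_lift": relative_pct}


lift = lift_from_totals(observed_total=1100.0, counterfactual_total=1000.0)
```

Here the treated markets sold 100 units more than the counterfactual predicts, which is a 10 percent relative lift.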
Methods
summary()
Print comprehensive results summary.
plot_time_series()
Generate time series plot showing treatment vs synthetic control.
plot_lift_analysis()
Create pre/post treatment comparison plots.
export_html_report(filename, **kwargs)
Export results to HTML report.
Parameters:
- filename (str): Output filename
- include_technical_details (bool, optional): Include statistical details. Default: True
DonorResults
Contains donor evaluation results.
Attributes
- best_donors (list): List of optimal control market IDs
- donor_weights (dict): Weights for each donor market
- correlation_matrix (DataFrame): Correlation between treatment and potential donors
Methods
plot_donor_map()
Visualize donor markets on geographic map.
summary()
Print donor evaluation summary.
Utility Functions
Data Validation
from geolift.data_handler import DataValidator
validator = DataValidator()
report = validator.validate_dataset(data, required_columns, date_column, geo_column)
Configuration Management
from geolift.config_manager import ConfigManager
config = ConfigManager()
config.load_from_yaml('config.yaml')
settings = config.get_analysis_config()
CLI Commands
Basic Analysis
# Run complete analysis
python -m geolift analyze \
    --data data.csv \
    --treatment-start 2023-06-01 \
    --treatment-markets 502,503 \
    --output results/
Power Analysis
# Calculate power
python -m geolift power \
    --data data.csv \
    --treatment-markets 502,503 \
    --duration 12 \
    --mde 0.05
Donor Evaluation
# Find optimal donors
python -m geolift donors \
    --data data.csv \
    --treatment-markets 502,503 \
    --pre-start 2023-01-01 \
    --pre-end 2023-05-31
Configuration Schema
YAML Configuration
# Complete configuration example
data:
  filepath: "data/campaign_data.csv"
  date_column: "date"
  geo_column: "geo_id"
  outcome_column: "sales"

treatment:
  start_date: "2023-06-01"
  end_date: "2023-08-31"
  markets: [502, 503, 504]

analysis:
  pre_period_weeks: 24
  inference_method: "bootstrap"
  confidence_level: 0.95
  n_bootstrap: 1000

donors:
  auto_select: true
  min_correlation: 0.7
  max_donors: 10
  exclude_markets: [501, 599]

output:
  directory: "results/"
  formats: ["html", "csv"]
  include_plots: true
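How the sections of this file map onto run_analysis parameters is not spelled out here. The sketch below shows one plausible mapping, assuming the YAML has already been parsed into a dict (for example with PyYAML's yaml.safe_load). to_run_analysis_kwargs is a hypothetical helper and not part of GeoLift; ConfigManager may do this differently.

```python
def to_run_analysis_kwargs(cfg):
    """Flatten a parsed config dict into run_analysis-style keyword arguments."""
    treatment = cfg["treatment"]
    analysis = cfg["analysis"]
    return {
        "treatment_start_date": treatment["start_date"],
        "treatment_end_date": treatment["end_date"],
        "treatment_markets": list(treatment["markets"]),
        "outcome_column": cfg["data"]["outcome_column"],
        "inference_method": analysis["inference_method"],
        # 0.95 is the documented default for confidence_level.
        "confidence_level": analysis.get("confidence_level", 0.95),
    }


# A dict shaped like the parsed YAML above, trimmed to the keys used here.
cfg = {
    "data": {"outcome_column": "sales"},
    "treatment": {
        "start_date": "2023-06-01",
        "end_date": "2023-08-31",
        "markets": [502, 503, 504],
    },
    "analysis": {"inference_method": "bootstrap"},
}
kwargs = to_run_analysis_kwargs(cfg)
```

The resulting dict can then be unpacked into the call, as in analyzer.run_analysis(**kwargs).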
Error Handling
Common Exceptions
from geolift.exceptions import (
    InsufficientDataError,
    InvalidConfigurationError,
    AnalysisError
)

try:
    results = analyzer.run_analysis(**config)
except InsufficientDataError as e:
    print(f"Not enough data: {e}")
except InvalidConfigurationError as e:
    print(f"Configuration error: {e}")
except AnalysisError as e:
    print(f"Analysis failed: {e}")
Advanced Usage
Custom Synthetic Control
from sparsesc import fit, estimate_effects
# Direct SparseSC usage for advanced users
sc_results = fit(
    features=X,
    targets=Y,
    treated_units=treated_units,
    control_units=control_units
)
effects = estimate_effects(
    sc_results,
    post_treatment_data=Y_post
)
Batch Processing
# Process multiple campaigns
campaigns = [
    {'name': 'Q1_Campaign', 'config': config1},
    {'name': 'Q2_Campaign', 'config': config2}
]

results = {}
for campaign in campaigns:
    analyzer = GeoLiftAnalyzer()
    analyzer.load_data(campaign['config']['data_path'])
    results[campaign['name']] = analyzer.run_analysis(**campaign['config'])
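In a long batch, one failing campaign would otherwise stop the whole loop. The sketch below shows one way to isolate failures, assuming each campaign is a dict with name and config keys as above. run_batch and the run_one callable are hypothetical; in practice run_one would wrap the load_data and run_analysis calls, and the except clause could be narrowed to the GeoLift exception classes.

```python
def run_batch(campaigns, run_one):
    """Run run_one(config) per campaign, collecting results and errors separately."""
    results, errors = {}, {}
    for campaign in campaigns:
        try:
            results[campaign["name"]] = run_one(campaign["config"])
        except Exception as exc:
            # Record the failure and continue with the remaining campaigns.
            errors[campaign["name"]] = exc
    return results, errors


def fake_run(config):
    """Stand-in for a real analysis call, used only to demonstrate the flow."""
    if config.get("fail"):
        raise ValueError("simulated failure")
    return "ok"


batch = [
    {"name": "Q1_Campaign", "config": {}},
    {"name": "Q2_Campaign", "config": {"fail": True}},
]
batch_results, batch_errors = run_batch(batch, fake_run)
```

After the run, batch_results holds the campaigns that completed and batch_errors holds the exception raised for each one that did not.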
Custom Inference Methods
# Implement custom inference
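import numpy as np

class CustomInference:
    def __init__(self, method='custom'):
        self.method = method

    def calculate_pvalues(self, effects, null_distribution):
        # Example calculation (an assumption; replace with your own logic):
        # two-sided empirical p-value, the share of null draws at least as
        # extreme as each observed effect. Assumes 1-D array-likes.
        effects = np.atleast_1d(np.asarray(effects, dtype=float))
        null = np.abs(np.asarray(null_distribution, dtype=float))
        p_values = np.array([(null >= abs(e)).mean() for e in effects])
        return p_values

# Use with analyzer
analyzer.set_inference_method(CustomInference())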
For more examples and advanced usage patterns, see Advanced Topics.