pymc-labs · cetagostini · Dec 15, 2025 · Dec 11, 2025 · Dec 11, 2025
diff --git a/.claude/skills/designing-experiments/SKILL.md b/.claude/skills/designing-experiments/SKILL.md
@@ -0,0 +1,31 @@
+---
+name: designing-experiments
+description: Selects the appropriate quasi-experimental method (DiD, ITS, SC) based on data structure and research questions. Use when the user is unsure which method to apply.
+---
+
+# Designing Experiments
+
+Helps select the appropriate causal inference method.
+
+## Decision Framework
+
+1.  **Control Group?**
+    *   **Yes**: Go to Step 2.
+    *   **No**: Consider **Interrupted Time Series (ITS)**.
+
+2.  **Unit Structure?**
+    *   **Single Treated Unit**:
+        *   With multiple controls: **Synthetic Control (SC)**.
+        *   No controls: **ITS**.
+    *   **Multiple Treated Units**:
+        *   With control group: **Difference-in-Differences (DiD)**.
+
+3.  **Time Structure?**
+    *   **Panel Data** (Multiple units over time): Required for DiD and SC.
+    *   **Time Series** (Single unit over time): Required for ITS.
+
+## Method Quick Reference
+
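+*   **Difference-in-Differences (DiD)**: Compares the change in outcomes over time between treated and control groups. Assumes **Parallel Trends** (without treatment, both groups would have followed the same trend).
+*   **Interrupted Time Series (ITS)**: Analyzes the change in trend or level of a single unit after an intervention. Assumes **Trend Continuity** (without the intervention, the pre-intervention trend would have continued).
+*   **Synthetic Control (SC)**: Constructs a synthetic counterfactual as a weighted combination of control units. Assumes **Convex Hull** (the treated unit's pre-intervention outcomes lie within the range spanned by the controls).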
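
Note on the Decision Framework in this file: the branching can be read as a small function. The sketch below is only an illustration of that reading, not part of the skill or of CausalPy; the name `choose_method` and its two parameters are hypothetical, and it collapses "multiple controls" into a single boolean.

```python
def choose_method(has_control_group: bool, n_treated_units: int) -> str:
    """Map the framework's answers to a method name (illustrative only)."""
    if n_treated_units < 1:
        raise ValueError("need at least one treated unit")
    if not has_control_group:
        # Step 1: no control group, so only the treated series is available.
        return "ITS"
    if n_treated_units == 1:
        # Step 2: a single treated unit with a pool of control units.
        return "SC"
    # Step 2: several treated units plus a control group.
    return "DiD"


for answers in [(False, 1), (True, 1), (True, 5)]:
    print(answers, "->", choose_method(*answers))
```

Step 3 (panel data versus a single time series) is a data-shape check on top of this and is left out of the sketch.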
diff --git a/.claude/skills/loading-datasets/SKILL.md b/.claude/skills/loading-datasets/SKILL.md
@@ -0,0 +1,30 @@
+---
+name: loading-datasets
+description: Loads internal CausalPy example datasets. Use when the user needs example data or asks about available demos.
+---
+
+# Loading Datasets
+
+Loads example datasets provided with CausalPy.
+
+## Usage
+
+```python
+import causalpy as cp
+df = cp.load_data("dataset_name")
+```
+
+## Available Datasets
+
+| Key | Description |
+| :--- | :--- |
+| `did` | Generic Difference-in-Differences |
+| `its` | Generic Interrupted Time Series |
+| `sc` | Generic Synthetic Control |
+| `banks` | DiD (Banks) |
+| `brexit` | Synthetic Control (Brexit) |
+| `covid` | ITS (Covid) |
+| `drinking` | Regression Discontinuity (Drinking Age) |
+| `rd` | Generic Regression Discontinuity |
+| `geolift1` | GeoLift (Single cell) |
+| `geolift_multi_cell` | GeoLift (Multi cell) |
diff --git a/.claude/skills/performing-causal-analysis/SKILL.md b/.claude/skills/performing-causal-analysis/SKILL.md
@@ -0,0 +1,28 @@
+---
+name: performing-causal-analysis
+description: Fits causal models, estimates impacts, and plots results using CausalPy. Use when performing analysis with DiD, ITS, SC, or RD.
+---
+
+# Performing Causal Analysis
+
+Executes causal analysis using CausalPy experiment classes.
+
+## Workflow
+
+1.  **Load Data**: Ensure data is in a Pandas DataFrame.
+2.  **Initialize Experiment**: Use the appropriate class (see References).
+3.  **Fit Model**: The model is fitted automatically during initialization, provided the required arguments (including `model`) are supplied.
+4.  **Analyze Results**: Use `summary()`, `print_coefficients()`, and `plot()`.
+
+## Core Methods
+
+*   `experiment.summary()`: Prints model summary and main results.
+*   `experiment.plot()`: Visualizes observed vs. counterfactual.
+*   `experiment.print_coefficients()`: Shows model coefficients.
+
+## References
+
+Detailed usage for specific methods:
+*   [Difference-in-Differences](reference/diff_in_diff.md)
+*   [Interrupted Time Series](reference/interrupted_time_series.md)
+*   [Synthetic Control](reference/synthetic_control.md)
diff --git a/.claude/skills/performing-causal-analysis/reference/diff_in_diff.md b/.claude/skills/performing-causal-analysis/reference/diff_in_diff.md
@@ -0,0 +1,50 @@
+# Causal Difference-in-Differences (DiD)
+
+Difference-in-Differences (DiD) estimates the causal effect of a treatment by comparing the changes in outcomes over time between a treatment group and a control group.
+
+## Class: `DifferenceInDifferences`
+
+```python
+causalpy.experiments.DifferenceInDifferences(
+    data,
+    formula,
+    time_variable_name,
+    group_variable_name,
+    post_treatment_variable_name="post_treatment",
+    model=None,
+    **kwargs
+)
+```
+
+### Parameters
+*   **`data`** (`pd.DataFrame`): Input dataframe containing panel data.
+*   **`formula`** (`str`): Statistical formula (e.g., `"y ~ 1 + group * post_treatment"`).
+*   **`time_variable_name`** (`str`): Column name for the time variable.
+*   **`group_variable_name`** (`str`): Column name for the group indicator (0=Control, 1=Treated). **Must be dummy coded**.
+*   **`post_treatment_variable_name`** (`str`): Column name indicating the post-treatment period (0=Pre, 1=Post). Default is `"post_treatment"`.
+*   **`model`**: A PyMC model (e.g., `cp.pymc_models.LinearRegression`) or a Scikit-Learn Regressor.
+
+### How it Works
+1.  **Fit**: The model is fitted to all available data (pre/post, treatment/control).
+2.  **Counterfactual**: Predicted by setting the interaction term between `group` and `post_treatment` to 0.
+3.  **Impact**: The causal impact is the difference between observed and counterfactual.
+
+### Example
+
+```python
+import causalpy as cp
+import causalpy.pymc_models as cp_pymc
+
+df = cp.load_data("did")
+
+result = cp.DifferenceInDifferences(
+    df,
+    formula="y ~ 1 + group*post_treatment",
+    time_variable_name="t",
+    group_variable_name="group",
+    model=cp_pymc.LinearRegression(sample_kwargs={"target_accept": 0.9})
+)
+
+result.summary()
+result.plot()
+```
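
Note on "How it Works" in this file: for the simplest two-group, two-period case, the counterfactual and impact steps reduce to arithmetic on four cell means. The sketch below uses invented toy numbers and plain Python rather than CausalPy; `rows`, `cell_mean`, and `did_estimate` are hypothetical names. In a saturated 2x2 regression the `group:post_treatment` interaction coefficient equals this difference of differences.

```python
from statistics import mean

# Hypothetical toy panel: (group, post_treatment, y). Values are invented.
rows = [
    (0, 0, 1.0), (0, 0, 1.2),  # control, pre
    (0, 1, 2.0), (0, 1, 2.2),  # control, post
    (1, 0, 2.0), (1, 0, 2.2),  # treated, pre
    (1, 1, 3.5), (1, 1, 3.7),  # treated, post
]


def cell_mean(group: int, post: int) -> float:
    """Average outcome for one group in one period."""
    return mean(y for g, p, y in rows if g == group and p == post)


def did_estimate() -> float:
    """(Treated post - treated pre) minus (control post - control pre)."""
    control_change = cell_mean(0, 1) - cell_mean(0, 0)
    treated_change = cell_mean(1, 1) - cell_mean(1, 0)
    return treated_change - control_change


# Counterfactual for the treated group: its pre-period mean plus the control change.
counterfactual = cell_mean(1, 0) + (cell_mean(0, 1) - cell_mean(0, 0))
impact = cell_mean(1, 1) - counterfactual
print(round(did_estimate(), 6), round(impact, 6))
```

Both printed numbers agree, which is the sense in which the impact is "observed minus counterfactual" under parallel trends.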
diff --git a/.claude/skills/performing-causal-analysis/reference/interrupted_time_series.md b/.claude/skills/performing-causal-analysis/reference/interrupted_time_series.md
@@ -0,0 +1,51 @@
+# Causal Interrupted Time Series (ITS)
+
+Interrupted Time Series (ITS) analyzes the effect of an intervention on a single time series by comparing the trend before and after the intervention.
+
+## Class: `InterruptedTimeSeries`
+
+```python
+causalpy.experiments.InterruptedTimeSeries(
+    data,
+    treatment_time,
+    formula,
+    model=None,
+    **kwargs
+)
+```
+
+### Parameters
+*   **`data`** (`pd.DataFrame`): Input dataframe. Index should ideally be a `pd.DatetimeIndex`.
+*   **`treatment_time`** (`Union[int, float, pd.Timestamp]`): The point in time when the intervention occurred.
+*   **`formula`** (`str`): Statistical formula (e.g., `"y ~ 1 + t + C(month)"`).
+*   **`model`**: A PyMC model (e.g., `cp.pymc_models.LinearRegression`) or a Scikit-Learn Regressor.
+
+### How it Works
+1.  **Split**: Data is split into pre- and post-intervention.
+2.  **Fit**: Model is trained **only on pre-intervention data**.
+3.  **Predict**: Fitted model predicts the outcome for the post-intervention period.
+4.  **Impact**: Difference between observed post-intervention data and counterfactual predictions.
+
+### Example
+
+```python
+import causalpy as cp
+import causalpy.pymc_models as cp_pymc
+import pandas as pd
+
+df = cp.load_data("its")
+df["date"] = pd.to_datetime(df["date"])
+df.set_index("date", inplace=True)
+
+treatment_time = pd.to_datetime("2017-01-01")
+
+result = cp.InterruptedTimeSeries(
+    df,
+    treatment_time,
+    formula="y ~ 1 + t + C(month)",
+    model=cp_pymc.LinearRegression()
+)
+
+result.summary()
+result.plot()
+```
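
Note on "How it Works" in this file: the split, fit, predict, and impact steps can be sketched without CausalPy using a straight-line fit on the pre-intervention points. This is a minimal illustration with invented data, assuming observations at or after `treatment_time` count as post-intervention; `fit_line` and `its_impact` are hypothetical helpers, and a plain least-squares line stands in for the PyMC model.

```python
def fit_line(ts, ys):
    """Ordinary least squares for y = intercept + slope * t."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    sxx = sum((t - t_bar) ** 2 for t in ts)
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def its_impact(ts, ys, treatment_time):
    """Fit on pre-intervention points only, then compare post points to the forecast."""
    pre = [(t, y) for t, y in zip(ts, ys) if t < treatment_time]
    post = [(t, y) for t, y in zip(ts, ys) if t >= treatment_time]
    intercept, slope = fit_line([t for t, _ in pre], [y for _, y in pre])
    return [y - (intercept + slope * t) for t, y in post]


# Invented series: a linear trend with a level shift of +3 from t = 7 onwards.
ts = list(range(10))
ys = [2.0 * t + 1.0 + (3.0 if t >= 7 else 0.0) for t in ts]
impact = its_impact(ts, ys, treatment_time=7)
print([round(v, 6) for v in impact])
```

Because the model never sees post-intervention data, the recovered impact depends entirely on the pre-intervention trend extrapolating correctly.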
diff --git a/.claude/skills/performing-causal-analysis/reference/synthetic_control.md b/.claude/skills/performing-causal-analysis/reference/synthetic_control.md
@@ -0,0 +1,49 @@
+# Causal Synthetic Control (SC)
+
+Synthetic Control constructs a "synthetic" counterfactual unit using a weighted combination of untreated control units.
+
+## Class: `SyntheticControl`
+
+```python
+causalpy.experiments.SyntheticControl(
+    data,
+    treatment_time,
+    control_units,
+    treated_units,
+    model=None,
+    **kwargs
+)
+```
+
+### Parameters
+*   **`data`** (`pd.DataFrame`): Input dataframe containing panel data.
+*   **`treatment_time`** (`Union[int, float, pd.Timestamp]`): The time of intervention.
+*   **`control_units`** (`List[str]`): List of column names representing the control units.
+*   **`treated_units`** (`List[str]`): List of column names representing the treated unit(s).
+*   **`model`**: A PyMC model (typically `cp.pymc_models.WeightedSumFitter`) or a Scikit-Learn Regressor.
+
+### How it Works
+1.  **Fit**: Model learns weights for `control_units` to approximate `treated_units` using **only pre-intervention data**.
+2.  **Predict**: Weights are applied to `control_units` in the post-intervention period.
+3.  **Impact**: Difference between the observed treated unit and the synthetic counterfactual.
+
+### Example
+
+```python
+import causalpy as cp
+import causalpy.pymc_models as cp_pymc
+
+df = cp.load_data("sc")
+treatment_time = 70  # intervention time on the dataset's integer time index
+
+result = cp.SyntheticControl(
+    df,
+    treatment_time,
+    control_units=["a", "b", "c", "d", "e"],
+    treated_units=["actual"],
+    model=cp_pymc.WeightedSumFitter()
+)
+
+result.summary()
+result.plot()
+```
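
Note on "How it Works" in this file: the fit, predict, and impact steps can be illustrated with two control units and a single weight. The sketch below is not how `WeightedSumFitter` estimates weights; it swaps in a plain grid search over a convex combination (weights non-negative and summing to one) on invented data, and `fit_convex_weight` is a hypothetical helper.

```python
def fit_convex_weight(treated_pre, control_a_pre, control_b_pre, steps=1000):
    """Grid-search w in [0, 1] minimising squared error of w*a + (1 - w)*b."""
    best_w, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        err = sum(
            (y - (w * a + (1 - w) * b)) ** 2
            for y, a, b in zip(treated_pre, control_a_pre, control_b_pre)
        )
        if err < best_err:
            best_w, best_err = w, err
    return best_w


# Invented data: six time points, intervention at index 4.
treatment_time = 4
control_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
control_b = [3.0, 3.0, 4.0, 6.0, 7.0, 9.0]
# The treated unit tracks 0.25*a + 0.75*b, plus a shift of 2 after the intervention.
treated = [
    0.25 * a + 0.75 * b + (2.0 if t >= treatment_time else 0.0)
    for t, (a, b) in enumerate(zip(control_a, control_b))
]

# Fit: pre-intervention data only.
w = fit_convex_weight(
    treated[:treatment_time], control_a[:treatment_time], control_b[:treatment_time]
)
# Predict: apply the learned weight to the controls in the post period.
synthetic_post = [
    w * a + (1 - w) * b
    for a, b in zip(control_a[treatment_time:], control_b[treatment_time:])
]
# Impact: observed treated unit minus synthetic counterfactual.
impact = [obs - syn for obs, syn in zip(treated[treatment_time:], synthetic_post)]
print(w, [round(v, 6) for v in impact])
```

The convex constraint is what ties this to the Convex Hull assumption: a treated unit outside the range of its controls cannot be matched by any such weighting.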
diff --git a/.claude/skills/running-placebo-analysis/SKILL.md b/.claude/skills/running-placebo-analysis/SKILL.md
@@ -0,0 +1,25 @@
+---
+name: running-placebo-analysis
+description: Performs placebo-in-time sensitivity analysis to validate causal claims. Use when checking model robustness, verifying lack of pre-intervention effects, or ensuring observed effects are not spurious.
+---
+
+# Running Placebo Analysis
+
+Executes placebo-in-time sensitivity analysis to validate causal experiments.
+
+## Workflow
+
+1.  **Define Experiment Factory**: Create a function that returns a fitted CausalPy experiment (e.g., ITS, DiD, SC) given a dataset and time boundaries.
+2.  **Configure Analysis**: Initialize `PlaceboAnalysis` with the factory, dataset, intervention dates, and number of folds (cuts).
+3.  **Run Analysis**: Execute `.run()` to fit models on pre-intervention data folds.
+4.  **Evaluate Results**: Compare placebo effects (which should be null) to the actual intervention effect. Use histograms and hierarchical models to quantify the "status quo" distribution.
+
+## Key Concepts
+
+*   **Placebo-in-time**: Simulating an intervention at a time when none occurred to check if the model falsely detects an effect.
+*   **Fold**: A slice of pre-intervention data used to test a placebo period.
+*   **Factory Pattern**: Decouples the placebo logic from the specific CausalPy experiment type.
+
+## References
+
+*   [Placebo-in-time Implementation](reference/placebo_in_time.md): Full code for the `PlaceboAnalysis` class, usage examples, and hierarchical status-quo modeling.
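
Note on the Workflow and Key Concepts in this file: the `PlaceboAnalysis` class itself lives in the referenced file and is not shown in this diff, so the sketch below does not reproduce its API. It only illustrates, under my own assumptions, how pre-intervention data might be cut into folds and passed to a factory-style callable; `placebo_folds` and `mean_shift` are hypothetical names, and the flat series is invented.

```python
def placebo_folds(pre_times, n_folds, placebo_length):
    """Return (train_times, placebo_times) pairs, walking backwards from the real intervention."""
    folds = []
    end = len(pre_times)
    for _ in range(n_folds):
        start = end - placebo_length
        if start <= 0:
            raise ValueError("not enough pre-intervention data for this many folds")
        # Everything before the fake intervention trains the model;
        # the window after it plays the role of the post period.
        folds.append((pre_times[:start], pre_times[start:end]))
        end = start
    return folds


# Invented flat series: no placebo window should show an effect.
series = {t: 5.0 for t in range(12)}


def mean_shift(train_times, placebo_times):
    """Stand-in for a fitted experiment: placebo-window mean minus training mean."""
    train = [series[t] for t in train_times]
    placebo = [series[t] for t in placebo_times]
    return sum(placebo) / len(placebo) - sum(train) / len(train)


folds = placebo_folds(list(range(12)), n_folds=3, placebo_length=2)
effects = [mean_shift(train, placebo) for train, placebo in folds]
print(effects)
```

Passing the estimator in as a callable is the point of the factory pattern described above: the fold logic stays the same whether the callable wraps ITS, DiD, or SC.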