colehanan1 commented Jan 8, 2026

This pull request introduces support for ΔPER (opto-control subtraction) analysis, improves CLI flexibility, and adds robust handling for missing controls in the scripts/lasso_with_ablations.py pipeline. It also updates documentation to clarify ΔPER logic, introduces new diagnostics and planning documents, and enhances reproducibility and interpretability for LASSO behavioral prediction analyses.

Key changes include:

ΔPER (opto-control subtraction) support and CLI improvements

  • Added --subtract_control, --control_condition, and --missing_control_policy flags to the CLI, enabling ΔPER analysis and flexible control dataset handling. The CLI now supports multiple conditions via repeated or comma-separated --condition flags, and logs debug stats with --debug_stats. [1] [2] [3]
  • Implemented _build_valid_odorants to align and subtract opto/control responses according to the chosen missing control policy (skip, zero, or error), with robust error handling and logging.
  • Refactored run_baseline and run_ablation to support ΔPER mode and propagate debug logging and control policy options. [1] [2] [3] [4]
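
To illustrate how repeated and comma-separated `--condition` values could be combined with the new ΔPER flags, here is a minimal `argparse` sketch. The flag names come from the description above; the helper names `parse_conditions` and `build_parser`, the de-duplication step, the `"skip"` default, and the example condition names are assumptions for illustration, not the code in `scripts/lasso_with_ablations.py`.

```python
import argparse


def parse_conditions(values):
    """Flatten repeated and comma-separated --condition values, keeping order."""
    seen, conditions = set(), []
    for value in values or []:
        for name in value.split(","):
            name = name.strip()
            # De-duplication is an assumption of this sketch.
            if name and name not in seen:
                seen.add(name)
                conditions.append(name)
    return conditions


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical ΔPER CLI sketch")
    # action="append" collects every occurrence of the flag into a list.
    parser.add_argument("--condition", action="append", default=None,
                        help="Repeat the flag or pass a comma-separated list")
    parser.add_argument("--subtract_control", action="store_true")
    parser.add_argument("--control_condition", default=None)
    # The default shown here is assumed, not taken from the real script.
    parser.add_argument("--missing_control_policy",
                        choices=["skip", "zero", "error"], default="skip")
    parser.add_argument("--debug_stats", action="store_true")
    return parser


# Example invocation with made-up condition names.
args = build_parser().parse_args([
    "--condition", "opto_A,opto_B",
    "--condition", "opto_C",
    "--subtract_control",
    "--control_condition", "ctrl_A",
    "--missing_control_policy", "error",
])
conditions = parse_conditions(args.condition)
print(conditions)
```

Restricting `--missing_control_policy` with `choices` lets `argparse` reject an unknown policy before any data is loaded.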

Documentation and diagnostics enhancements

  • Updated docs/BEHAVIORAL_PREDICTION_ANALYSIS.md to document ΔPER logic, missing control handling, and new stability layer usage. [1] [2]
  • Added new diagnostics and planning markdowns: explanations of LASSO collapse in ΔPER, baseline drift hypotheses, repo state snapshot, and a stability/metrics implementation plan. [1] [2] [3] [4]

Usability and reproducibility

  • Improved error messages and logging for missing or ambiguous conditions and controls, ensuring users are informed about skipped or failed ΔPER runs.
  • Provided reproducible commands and verification steps in documentation and diagnostics for transparency and validation of new features. [1] [2]
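
The three missing-control policies (skip, zero, or error) can be sketched as follows. This is a simplified illustration, not the actual `_build_valid_odorants` implementation: it assumes responses are plain dicts keyed by odorant name, and the function name `subtract_control`, the logger name, and the example values are hypothetical.

```python
import logging

logger = logging.getLogger("delta_per_sketch")  # hypothetical logger name


def subtract_control(opto, control, policy="error"):
    """Return {odorant: opto - control} under a missing-control policy.

    opto, control: dicts mapping odorant name -> response (assumed layout).
    policy: "skip" drops odorants that have no control response,
            "zero" treats the missing control as 0.0,
            "error" raises before computing anything.
    """
    if policy not in {"skip", "zero", "error"}:
        raise ValueError(f"unknown missing-control policy: {policy!r}")
    missing = [name for name in opto if name not in control]
    if missing and policy == "error":
        raise KeyError(f"no control response for: {', '.join(missing)}")
    delta = {}
    for name, value in opto.items():
        if name in control:
            delta[name] = value - control[name]
        elif policy == "zero":
            logger.warning("no control for %s; subtracting 0.0", name)
            delta[name] = value
        else:  # policy == "skip"
            logger.warning("no control for %s; skipping odorant", name)
    return delta


# Made-up responses; the third odorant has no control measurement.
opto = {"hexanol": 0.75, "benzaldehyde": 0.5, "ethyl_butyrate": 0.25}
control = {"hexanol": 0.25, "benzaldehyde": 0.5}

print(subtract_control(opto, control, policy="skip"))
print(subtract_control(opto, control, policy="zero"))
```

With `policy="error"` the same inputs raise a `KeyError` naming `ethyl_butyrate`, so a ΔPER run cannot silently proceed on incomplete controls.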

These changes collectively enable robust, transparent, and reproducible ΔPER analyses and lay the groundwork for further stability and metrics reporting in behavioral prediction workflows.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added diagnostic stability analysis tools for evaluating LASSO model behavior across conditions.
    • Introduced enhanced control subtraction and ΔPER (delta) support for behavioral prediction workflows.
    • Added comprehensive ablation and focus-mode analysis capabilities with full sweep functionality.
  • Documentation

    • Added diagnostic guidance on baseline drift hypotheses and model collapse behavior.
    • Updated behavioral prediction analysis documentation with stability layer guidance.
  • Tests

    • Added stability metrics validation and reproducibility testing.
  • Chores

    • Updated repository configuration for diagnostic file tracking.


…ript

- Updated `run_lasso_behavioral_prediction.py` to change the default missing-control policy to "error" and added a new argument for the lambda range delta.
- Implemented `run_lasso_full_sweep.py` to run comprehensive LASSO sweeps, including baseline, ablations, and focus mode per condition.
- Expanded tests in `test_lasso_behavioral_prediction.py` to cover the new control-subtraction policies and added fixtures for extended testing scenarios.
- Added tests in `test_lasso_focus_mode.py` to ensure reproducibility and integrity of focus mode runs.
- Implemented `run_stability_and_metrics.py` to compute stability scores and standardized metrics.
- Updated documentation to include usage instructions for the new stability layer.
- Created unit tests for stability metrics, ensuring determinism and schema validation.
colehanan1 merged commit 4be90f7 into main Jan 8, 2026
0 of 15 checks passed
colehanan1 deleted the feature/lasso-partb-compat branch January 8, 2026 23:31
coderabbitai bot commented Jan 8, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

This PR adds diagnostics and analysis infrastructure for LASSO-based behavioral prediction. It introduces new diagnostic scripts for auditing model pipelines and computing stability metrics, and updates existing analysis scripts to support control-subtraction (ΔPER) runs with configurable missing-control policies. It extends the core behavioral prediction module with helper utilities for ablation and feature restriction, and adds documentation explaining LASSO collapse and drift hypotheses. It also introduces tests validating determinism and schema correctness, and updates .gitignore so that Python and Markdown files under diagnostics/ are tracked.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Core Module Extensions | `src/door_toolkit/pathways/behavioral_prediction.py` | Added `fit_behavior()` method with control-subtraction support; new utilities: `apply_receptor_ablation()`, `fit_lasso_with_fixed_scaler()`, `restrict_to_receptors()`; scaling now uses `StandardScaler.fit_transform()` with new arrays; lambda selection via `LassoCV` with fixed `random_state` |
| New Diagnostic Scripts | `diagnostics/run_postA_postB_audit.py`, `diagnostics/run_stability_and_metrics.py` | `run_postA_postB_audit.py` (1034 lines): Orchestrates end-to-end audit with `RunResult` dataclass; performs LOOCV, grid searches (Ridge, ElasticNet, LASSO), reproducibility checks, mutation detection; generates `audit_metrics.csv` and `AUDIT_SUMMARY.md`. `run_stability_and_metrics.py` (780 lines): Computes stability metrics per receptor (ORN) across conditions; `ModelFit` class; outputs `stability_per_condition.csv` and `model_metrics.csv` |
| Analysis Workflow Scripts | `scripts/lasso_with_ablations.py` | Extended to support multi-condition runs; added control-based ΔPER with `_parse_conditions()`, `_build_valid_odorants()`, `_log_debug_stats()`; `run_baseline()` and `run_ablation()` now accept `subtract_control`, `control_condition`, `missing_control_policy` parameters; per-condition output directories |
| Analysis Workflow Scripts | `scripts/lasso_with_focus_mode.py` | Multi-condition support via `_parse_conditions()`; added control/ΔPER baselining with `_build_valid_odorants()`; `run_baseline()` and `run_focus()` now handle control-aligned odorants; per-condition output generation and debug stats logging |
| Analysis Workflow Scripts | `scripts/run_lasso_behavioral_prediction.py` | Changed default `missing_control_policy` from "skip" to "error"; added `--lambda_range_delta` CLI option for control-based runs; improved error handling to skip rather than fall back to raw mode on missing control |
| New Sweep Script | `scripts/run_lasso_full_sweep.py` | Comprehensive 860-line script implementing baseline, ablation (single/grouped/top-N variants), and focus modes across conditions; `RunMetrics` dataclass; includes `_build_valid_odorants()`, `_extract_features()`, weight/metrics serialization; generates per-condition and aggregated summaries (CSV/Markdown) |
| Documentation | `.gitignore` | Changed from ignoring `diagnostics/` to ignoring `diagnostics/*` while re-including the directory itself (`!diagnostics/`) and whitelisting Python/Markdown files (`!diagnostics/*.py`, `!diagnostics/*.md`) |
| Documentation | `diagnostics/DELTA_PER_COLLAPSE_EXPLANATION.md`, `diagnostics/PLAN_STABILITY.md`, `diagnostics/baseline_drift_hypotheses.md`, `diagnostics/repo_state.md` | Four documentation files: collapse evidence and alternatives; stability/metrics planning with expected outputs; five baseline drift hypotheses; repo state changes including `fit_behavior` control-subtraction and new ablation/focus utilities |
| Documentation | `docs/BEHAVIORAL_PREDICTION_ANALYSIS.md` | Clarified missing-control handling (skip vs. error policy); added Stability Layer (LOOO) section with run command, expected outputs, and metrics guidance |
| Extended Tests | `tests/test_lasso_behavioral_prediction.py` | Added extended control test fixtures; new tests for control subtraction (raw alignment, policy validation); ablation/restriction shape integrity tests; imports `restrict_to_receptors` |
| New Tests | `tests/test_lasso_focus_mode.py` | New `test_run_focus_reproducible_and_no_mutation()` validating reproducibility across runs and input data integrity |
| New Tests | `tests/test_stability_metrics.py` | New 119-line module with `test_stability_determinism_and_schema()` (end-to-end determinism and output validation) and `test_intercept_only_flag_logic()` (internal flag verification); includes module loading and behavior CSV helpers |
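
The ablation and restriction helpers listed above are described only by name, so the following is a rough sketch of what such operations typically look like, not the toolkit's code. It assumes ablation means zeroing a receptor's feature column and restriction means keeping only the named columns; the function names `ablate_receptors` and `restrict_receptors` are hypothetical, and the receptor names and values are made up. Both functions return new lists, in line with the shape-integrity and no-mutation checks mentioned in the tests.

```python
def ablate_receptors(X, receptor_names, ablated):
    """Return a copy of X with the columns of `ablated` receptors set to 0.0."""
    unknown = set(ablated) - set(receptor_names)
    if unknown:
        raise ValueError(f"unknown receptors: {sorted(unknown)}")
    drop = {i for i, name in enumerate(receptor_names) if name in ablated}
    # Building new rows leaves the caller's matrix untouched.
    return [[0.0 if j in drop else v for j, v in enumerate(row)] for row in X]


def restrict_receptors(X, receptor_names, kept):
    """Return (X_sub, names_sub) with only the `kept` columns, in original order."""
    unknown = set(kept) - set(receptor_names)
    if unknown:
        raise ValueError(f"unknown receptors: {sorted(unknown)}")
    idx = [i for i, name in enumerate(receptor_names) if name in kept]
    return [[row[i] for i in idx] for row in X], [receptor_names[i] for i in idx]


# Made-up feature matrix: 2 odorants x 3 receptors.
names = ["Or42b", "Or47b", "Or7a"]
X = [[0.5, 0.25, 1.0],
     [0.0, 0.75, 0.5]]

X_ablated = ablate_receptors(X, names, {"Or47b"})
X_focus, names_focus = restrict_receptors(X, names, {"Or42b", "Or7a"})
print(X_ablated)
print(X_focus, names_focus)
```

Ablation preserves the matrix shape while restriction reduces the column count, which is the distinction the shape integrity tests would need to check.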

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant AuditScript as run_postA_postB_audit.py
    participant Predictor as LassoBehavioralPredictor
    participant Models as LASSO/Ridge/ElasticNet
    participant Metrics as Metrics Engine
    participant Output as CSV/JSON/Markdown

    User->>AuditScript: CLI args (conditions, lambdas, modes)
    AuditScript->>Predictor: Initialize with DoOR cache & behavior data
    
    loop For each condition & mode
        AuditScript->>Predictor: Build valid odorants (with control-subtraction)
        Predictor-->>AuditScript: Features (X), targets (y), receptor names
        
        loop For each model class (LASSO/Ridge/ElasticNet)
            AuditScript->>Models: Fit with lambda grid & cross-validation
            Models-->>AuditScript: CV MSE, coefficients, predictions
            AuditScript->>Metrics: Compute NMSE, MAE, intercept-only baseline
            Metrics-->>AuditScript: RunResult (metrics & diagnostics)
        end
    end
    
    AuditScript->>Output: Write audit_metrics.csv
    AuditScript->>Output: Write audit_artifacts.json
    AuditScript->>Output: Generate AUDIT_SUMMARY.md (reproducibility checks)
    AuditScript->>User: Report (reproducibility, mutations, collapses)
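
The metrics step in the diagram above (NMSE, MAE, intercept-only baseline) can be sketched in plain Python. The definitions here are assumptions, since the audit script's formulas are not shown: NMSE is taken as the model's MSE divided by the MSE of a predictor that always returns the mean of the targets, and a fit is flagged as collapsed when it does no better than that baseline. All names and values are illustrative.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def mae(y_true, y_pred):
    """Mean absolute error."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def intercept_only_predictions(y_true):
    """Baseline that predicts the mean target for every sample."""
    mean = fmean(y_true)
    return [mean] * len(y_true)


def nmse(y_true, y_pred):
    """MSE normalized by the intercept-only MSE (assumed definition)."""
    baseline = mse(y_true, intercept_only_predictions(y_true))
    if baseline == 0.0:
        raise ValueError("targets are constant; NMSE is undefined")
    return mse(y_true, y_pred) / baseline


# Made-up targets and model predictions.
y = [0.0, 0.5, 1.0, 0.5]
pred = [0.25, 0.5, 0.75, 0.5]

score = nmse(y, pred)
collapsed = score >= 1.0  # assumed flag: no better than intercept-only
print(score, mae(y, pred), collapsed)
```

Under this definition the intercept-only baseline scores exactly 1.0, so values below 1.0 indicate that the model explains some variance beyond the mean.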
sequenceDiagram
    participant User
    participant StabilityScript as run_stability_and_metrics.py
    participant Predictor as LassoBehavioralPredictor
    participant Models as LASSO/Ridge/ElasticNet
    participant Stability as Stability Analyzer
    participant Output as CSV/Markdown

    User->>StabilityScript: CLI args (conditions, modes, CV settings)
    StabilityScript->>Predictor: Initialize from cache & behavior data
    
    loop For each condition & prediction mode
        StabilityScript->>Predictor: Extract features (X, receptor names)
        Predictor-->>StabilityScript: Feature matrix & target vector
        
        alt Raw mode
            StabilityScript->>Models: Fit LASSO/Ridge/ElasticNet on lambda grid
        else Delta (ΔPER) mode
            StabilityScript->>Predictor: Subtract control from features
            StabilityScript->>Models: Fit models on control-adjusted data
        end
        
        Models-->>StabilityScript: ModelFit (coefficients, cv_mse, n_selected)
        StabilityScript->>Stability: Compute per-receptor selection frequency & sign consistency
        Stability-->>StabilityScript: Stability metrics row
    end
    
    StabilityScript->>Output: Write model_metrics.csv
    StabilityScript->>Output: Write stability_per_condition.csv
    StabilityScript->>Output: Generate SUMMARY.md (confidence flags & recommendations)
    StabilityScript->>User: Stability analysis complete
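
The per-receptor stability step in the diagram above (selection frequency and sign consistency) could look like the sketch below. It assumes the inputs are coefficient vectors from repeated refits, such as one per held-out odorant, and it assumes these definitions: selection frequency is the fraction of fits with a nonzero coefficient, and sign consistency is the share of those nonzero coefficients that agree with the majority sign. The function name, tolerance, and data are hypothetical and may differ from `run_stability_and_metrics.py`.

```python
def stability_metrics(coef_by_fit, receptor_names, tol=1e-8):
    """Summarize how consistently each receptor is selected across refits."""
    n_fits = len(coef_by_fit)
    if n_fits == 0:
        raise ValueError("need at least one fit")
    rows = []
    for j, name in enumerate(receptor_names):
        # Coefficients within tol of zero count as "not selected".
        selected = [fit[j] for fit in coef_by_fit if abs(fit[j]) > tol]
        n_pos = sum(1 for c in selected if c > 0)
        n_neg = len(selected) - n_pos
        rows.append({
            "receptor": name,
            "selection_frequency": len(selected) / n_fits,
            # 0.0 when the receptor is never selected (assumed convention).
            "sign_consistency": max(n_pos, n_neg) / len(selected) if selected else 0.0,
        })
    return rows


# Made-up coefficients: 4 refits x 3 receptors.
names = ["Or42b", "Or47b", "Or7a"]
fits = [
    [0.5, 0.0, -0.25],
    [0.25, 0.0, 0.5],
    [0.75, 0.0, -0.5],
    [0.5, 0.25, -0.125],
]

for row in stability_metrics(fits, names):
    print(row)
```

A receptor that is selected in every fit but flips sign, like the third column here, gets a high selection frequency and a lower sign consistency, which is the kind of case a confidence flag would need to catch.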

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

Audits and metrics, a hare's grand delight,
Sweep through the LASSO from morning to night, 🐰
Control-subtract wisely, ablate with care,
Stability checked—no drift hiding there,
Reproducible runs: deterministic and fair!

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d6d316 and f21301c.

📒 Files selected for processing (15)
  • .gitignore
  • diagnostics/DELTA_PER_COLLAPSE_EXPLANATION.md
  • diagnostics/PLAN_STABILITY.md
  • diagnostics/baseline_drift_hypotheses.md
  • diagnostics/repo_state.md
  • diagnostics/run_postA_postB_audit.py
  • diagnostics/run_stability_and_metrics.py
  • docs/BEHAVIORAL_PREDICTION_ANALYSIS.md
  • scripts/lasso_with_ablations.py
  • scripts/lasso_with_focus_mode.py
  • scripts/run_lasso_behavioral_prediction.py
  • scripts/run_lasso_full_sweep.py
  • tests/test_lasso_behavioral_prediction.py
  • tests/test_lasso_focus_mode.py
  • tests/test_stability_metrics.py
