Resolve review comments: Fix unit tests and cleanup artifacts (new) #33
base: ibm_results
Conversation
Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.
This new commit for .gitignore addresses your comment from the previous PR on whether it ignores the data for other examples as well.
Let's either rerun this from scratch or not touch it at all.
I think the data for this is not available within stochastic_benchmark itself. The notebook says to go to __results in the QED-C repository, but I cannot find it there.
- Expand IBM_QAOA patterns to all examples subdirectories
- Add general artifact patterns (plots, checkpoints, progress)
- Add data file patterns (pkl, npz, npy)
- Ignore accidentally created repo root directory
Add comprehensive test fixtures for IBM QAOA data processing:
- 4 real experimental JSON files with varied instance/depth configurations
- 4 synthetic edge-case fixtures (multi-trial, missing fields, empty)
- README documenting schema, boundaries, and usage patterns

Fixtures support testing of:
- Single- vs. multi-trial bootstrap behavior
- IBM-specific filename parsing
- Missing-field handling
- Empty/malformed data edge cases
Add 32 tests organized in 6 test classes covering:

Unit Tests:
- QAOAResult dataclass creation
- parse_qaoa_trial with various data formats
- load_qaoa_results with missing/malformed data
- convert_to_dataframe transformations
- group_name_fcn filename parsing
- prepare_stochastic_benchmark_data pickle I/O

Integration Tests:
- process_qaoa_data end-to-end pipeline
- GTMinEnergy injection for missing ground truth
- Single-trial bootstrap fabrication
- Interpolation fallback behavior
- Train/test split generation

Edge Cases:
- Missing trainer information
- Missing optimal parameters
- Empty trials list
- Multi-trial synthetic data

All tests use fixtures and proper mocking for multiprocessing. Test coverage validates IBM-specific logic boundaries.
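As a rough illustration of the filename-parsing tests listed above, here is a self-contained pytest sketch. The parser is a stand-in written for this example (it is not the repository's group_name_fcn), and the sample filenames are assumptions based on the `###` / `N##R3R` / `_#.json` pattern quoted later in this thread.

```python
# Illustrative sketch only: a stand-in parser plus parametrized tests.
# Filenames and the returned tuple shape are assumptions, not repo fixtures.
import re
import pytest


def parse_result_filename(name: str):
    """Split an IBM-style result filename into (instance_id, size/graph token, index)."""
    m = re.match(r"(\d{3})_([A-Za-z0-9]+)_(\d+)\.json$", name)
    if m is None:
        raise ValueError(f"Unrecognized result filename: {name}")
    instance, token, idx = m.groups()
    return instance, token, int(idx)


@pytest.mark.parametrize(
    "filename, expected",
    [
        ("000_N10R3R_1.json", ("000", "N10R3R", 1)),
        ("002_N20R3R_3.json", ("002", "N20R3R", 3)),
    ],
)
def test_parse_result_filename(filename, expected):
    assert parse_result_filename(filename) == expected


def test_parse_result_filename_rejects_malformed():
    with pytest.raises(ValueError):
        parse_result_filename("not_a_result.txt")
```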
Implement 4 high-impact optimizations for ~1K file scale:

1. ProcessingConfig dataclass for centralized configuration:
   - persist_raw: gate pickle writes during ingestion
   - interpolate_diversity_threshold: diversity-based interpolation
   - fabricate_single_trial: control single-trial bootstrap
   - seed: reproducible train/test splits
   - log_progress_interval: configurable progress logging

2. Structured logging infrastructure:
   - Replace print statements with the logging module
   - Add progress logging every N files
   - Proper INFO/WARNING levels for errors
   - Timestamps and levels for production observability

3. In-memory aggregation with conditional pickle persistence:
   - persist_raw=True: write pickles to exp_raw/ subdirectory
   - persist_raw=False: aggregate in memory, skip ingestion pickles
   - Generate temporary pickles only when needed for bootstrap
   - Expected 1-2 s savings for 1K files when disabled

4. Diversity-based interpolation heuristic:
   - Replace row count (n_rows <= 5) with a diversity metric
   - diversity = unique_instances × unique_depths
   - Skip interpolation when diversity < threshold
   - Prevents spurious skips on sparse but valid grids

Additional improvements:
- Add try/except for malformed JSON files with warnings
- Use config.seed for reproducible train/test splits
- Fix pickle paths to use the exp_raw subdirectory convention
- Add enumeration to the ingestion loop for progress tracking

All changes maintain backward compatibility with the default config. Expected performance: ~15 s for 1K files (from ~20-25 s).
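A minimal sketch of what the ProcessingConfig and the diversity heuristic described above could look like. The field names come from the commit message; the defaults, types, and helper function are assumptions for illustration only.

```python
# Sketch only: field names from the commit message, defaults assumed.
import logging
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessingConfig:
    persist_raw: bool = True                  # gate pickle writes during ingestion
    interpolate_diversity_threshold: int = 4  # skip interpolation below this diversity
    fabricate_single_trial: bool = True       # control single-trial bootstrap
    seed: Optional[int] = None                # reproducible train/test splits
    log_progress_interval: int = 100          # log progress every N files


def should_interpolate(unique_instances: int, unique_depths: int,
                       config: ProcessingConfig) -> bool:
    """Diversity-based heuristic: diversity = unique_instances * unique_depths."""
    diversity = unique_instances * unique_depths
    if diversity < config.interpolate_diversity_threshold:
        logging.info("Skipping interpolation: diversity=%d below threshold=%d",
                     diversity, config.interpolate_diversity_threshold)
        return False
    return True
```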
Add comprehensive performance optimization guide:

Phase 1 - Implemented (4 changes):
- ProcessingConfig dataclass for centralized configuration
- Structured logging infrastructure
- In-memory aggregation with persist_raw flag
- Diversity-based interpolation heuristic

Phase 2 - Deferred Enhancements (6 optimizations):
1. Parallel I/O with ThreadPoolExecutor (3-5x potential speedup)
2. Parquet output format (faster writes, smaller files)
3. orjson for JSON parsing (~2x speedup)
4. Lazy bootstrap fabrication (skip unnecessary computation)
5. Categorical dtypes for memory efficiency
6. Rich diversity metrics (entropy-based quality assessment)

Each enhancement documented with:
- Problem/solution description
- Expected impact and thresholds
- Implementation complexity
- Testing requirements

Target metrics:
- Phase 1: <15 s for 1K files (from ~20-25 s baseline)
- Phase 2: <10 s with parallelization
- Scale guidance: when to apply each optimization
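To make the deferred "parallel I/O" item concrete, here is a hedged sketch of loading result JSON files with a thread pool instead of a serial loop. Function and path names are placeholders, and the error handling mirrors the try/except-with-warning approach described in the previous commit.

```python
# Sketch of parallel JSON ingestion with ThreadPoolExecutor (Phase 2 idea).
# Names and directory layout are assumptions for illustration.
import json
import logging
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_one(path: Path):
    """Load one result file, returning (name, data) or (name, None) on failure."""
    try:
        with path.open() as f:
            return path.name, json.load(f)
    except (json.JSONDecodeError, OSError) as exc:
        logging.warning("Skipping malformed file %s: %s", path, exc)
        return path.name, None


def load_all(result_dir: Path, max_workers: int = 8):
    """Load every *.json in result_dir concurrently, dropping unreadable files."""
    paths = sorted(result_dir.glob("*.json"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = dict(pool.map(load_one, paths))
    return {name: data for name, data in results.items() if data is not None}
```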
Clear execution outputs and intermediate results to reduce repo size. Notebook structure and analysis code preserved.
Add ibm_qaoa_analysis_hardware.ipynb for analyzing real quantum hardware results from IBM systems. Complements simulation analysis with hardware-specific metrics and comparisons.
- `###` - Instance ID (e.g., 000, 001, 002)
- `N##R3R` - Problem size indicator
- `_#.json` - Depth parameter (p)
Can you verify if this is true, @anurag-r20?
Except N is the problem size and R3R is the type of graph. p is not the depth
In that case, let's modify this file
Let's split the N## line from the R#R line and explain all the alternatives (heavy-hex, Erdős-Rényi, ...).
Is this ready for review? There is a lot of commented-out and unavailable code. Some of it seems copied from the ibm_qaoa_analysis notebook.
No, that is work in progress.
In that case, let's remove it from this PR to have this merged
Move all pandas-specific test files from tests/Pandas_Group_Tests/ to tests/:
- test_interpolate_pandas.py
- test_stats_pandas.py
- test_stochastic_benchmark_pandas.py
- test_training_pandas.py

Remove the empty Pandas_Group_Tests subdirectory for better test organization. All 13 tests still passing after the move.
- Update processing script to detect three optimization states: 'opt', 'noOpt', and None
- Add marker differentiation in plotting: circles for opt, x for noOpt, squares for no flag
- Use depth-specific colors for all marker types with appropriate legend labels
- Extract optimization flag from filename patterns (_opt_, _noOpt_, or neither)
- Fall back to the Energy metric when Approximation Ratio is not available in the JSON
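A small sketch of the three-state flag detection and the marker mapping described in this commit. The `_opt_` / `_noOpt_` substrings come from the commit message; everything else is an assumption for illustration.

```python
# Sketch only: three-state optimization-flag detection from filenames,
# plus the plotting marker map (circle / x / square) described above.
from typing import Optional


def extract_opt_flag(filename: str) -> Optional[str]:
    """Return 'opt', 'noOpt', or None based on the filename pattern."""
    # Check the more specific pattern first so '_noOpt_' is not matched as '_opt_'.
    if "_noOpt_" in filename:
        return "noOpt"
    if "_opt_" in filename:
        return "opt"
    return None


# Marker differentiation: circles for opt, x for noOpt, squares for no flag.
MARKERS = {"opt": "o", "noOpt": "x", None: "s"}
```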
This needs to be rerun without the Opt vs. noOpt distinction, as they are two separate solvers. In fact, have the solver_name merge them.
This needs to be run from scratch
- Load minmax cuts from JSON files in the R3R/minmax_cuts directory
- Add maxcut_approximation_ratio() function using the formula:
    cut_val = energy + 0.5 * sum_weights
    approx_ratio = (cut_val - min_cut) / (max_cut - min_cut)
- Update convert_to_dataframe() to use calculated approximation ratios
- Update process_qaoa_data() to load and pass minmax data
- Add proper error handling and validation for edge cases
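For reference, a direct transcription of the formula above into a small helper. The argument names are assumptions; the arithmetic follows the commit message.

```python
# Sketch of the approximation-ratio formula from the commit message.
def maxcut_approximation_ratio(energy: float, sum_weights: float,
                               min_cut: float, max_cut: float) -> float:
    cut_val = energy + 0.5 * sum_weights
    if max_cut == min_cut:
        raise ValueError("max_cut and min_cut must differ")
    return (cut_val - min_cut) / (max_cut - min_cut)
```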
- Test minmax cuts loading from directory
- Test approximation ratio calculation
- Test end-to-end processing with minmax integration
- Verify non-NaN approximation ratios in output
…d comparison

- Update data loading to use the 'optimized' column from process_qaoa_data
- Create separate method names (FA_opt, FA_noOpt, TQA_opt, TQA_noOpt)
- Update the methods_to_compare list to include all 7 method variants
- Change legends to a single-column layout for better readability
- Invalidate cache to force reprocessing with new method names
- All variants treated independently in statistical analysis and rankings
This needs to be run from scratch
I have pushed updates to address all the review comments. I am creating a new PR because the previous PR was merged without the updates.
Unit Tests: I reverted test_stochastic_benchmark.py, test_stats_pandas.py, and test_training_pandas.py to their original logic (restoring the names library and mock setups) to ensure the tests run correctly against the codebase.
Interpolate Test: I reverted the fixture to use the original manual parameters and added a reset_index fix to handle the MultiIndex output correctly while keeping the strict assertions (see the sketch after this list).
Formatting: Applied the requested whitespace/docstring formatting.
Artifacts: I updated .gitignore to handle the specific example folders and removed the accidental JSON file.
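A minimal illustration of the reset_index fix mentioned in the Interpolate Test item: a grouped result carrying a MultiIndex is flattened back to ordinary columns before the strict assertions compare it against the expected frame. The data and column names below are made up for the example.

```python
# Sketch only: flattening a MultiIndex produced by groupby with reset_index.
import pandas as pd

df = pd.DataFrame({
    "instance": ["000", "000", "001"],
    "depth": [1, 2, 1],
    "energy": [-3.0, -4.5, -2.0],
})

grouped = df.groupby(["instance", "depth"]).mean()  # MultiIndex on (instance, depth)
flat = grouped.reset_index()                        # back to a plain integer index
assert list(flat.columns) == ["instance", "depth", "energy"]
```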