Skip to content

Conversation

@lmeyerov
Copy link
Contributor

@lmeyerov lmeyerov commented Nov 26, 2025

Stacked on feat/issue-837-cudf-hop-executor. Adds Alloy model coverage for GFQL F/B/F with WHERE lowerings (scenarios + optional multi-chain full-scope) plus ghcr caching for the Alloy image.

Status: Experimental

This Alloy model is experimental infrastructure for formal verification of GFQL internals. It currently verifies WHERE clause lowering to per-alias value summaries (Eq/Neq/Lt/Gt via min/max).

What it verifies now

  • WHERE clause lowering logic (equality via bitsets, inequality via min/max)
  • Path semantics ↔ Set semantics equivalence for fixed-length chains
  • Various graph topologies: fan-out, fan-in, cycles, parallel edges, disconnected

What it does NOT verify (and where bugs actually occurred)

The 5 bugs found during executor development (PR #846) were all in traversal mechanics, not WHERE filtering:

  1. Backward traversal join direction
  2. Empty set short-circuit
  3. Node source for non-adjacent aliases
  4. Multi-hop path tracing
  5. Edge direction handling

These are caught by Python parity tests (reference enumerator vs native executor), not the Alloy model.

Next steps for bang-for-buck

Rather than extending Alloy to model DataFrame join mechanics, higher value options:

  • Cypher TCK conformance tests - test user-facing behavior with openCypher existing test suite
  • Property-based testing - Hypothesis-style fuzzing of graph patterns
  • Keep Alloy as design documentation, not primary verification

See issue #871 for the full roadmap discussion.

@lmeyerov lmeyerov force-pushed the feat/issue-837-cudf-hop-executor branch from 4cf85dd to 4174258 Compare December 24, 2025 09:52
@lmeyerov lmeyerov force-pushed the feat/issue-838-alloy-fbf-where branch from 8cc2f4c to 4548ef4 Compare December 24, 2025 09:54
@lmeyerov lmeyerov force-pushed the feat/issue-838-alloy-fbf-where branch from 4548ef4 to 3b9746c Compare December 24, 2025 09:57
@lmeyerov lmeyerov force-pushed the feat/issue-838-alloy-fbf-where branch from 3b9746c to 5e11f5b Compare December 24, 2025 16:25
@lmeyerov lmeyerov force-pushed the feat/issue-838-alloy-fbf-where branch from 5e11f5b to 6de43c3 Compare December 24, 2025 16:29
lmeyerov and others added 23 commits December 26, 2025 06:42
P0/P1 tests for cuDF same-path executor with hop range features.

Tests added:
- WHERE respected after min_hops backtracking (xfail #872)
- Reverse direction + hop range + WHERE (xfail #872)
- Non-adjacent alias WHERE (xfail #872)
- Oracle vs cuDF parity comprehensive (xfail #872)
- Multi-hop edge WHERE filtering (xfail #872)
- Output slicing + WHERE (PASS)
- label_seeds + output_min_hops (PASS)
- Multiple WHERE + mixed hop ranges (xfail #872)

6 tests marked xfail documenting multi-hop backward prune bugs.
2 tests pass verifying output slicing and label_seeds work correctly.

See issue #872 for bug details.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…cutor

The same-path executor (used by both pandas and cuDF backends) had a
correctness bug where WHERE clauses were silently skipped for multi-hop
edges (min_hops/max_hops > 1). This could return incorrect query results
regardless of whether using pandas or cuDF.

Changes:
- Add `_filter_multihop_by_where()` to handle WHERE for multi-hop edges
- Identify first/last hop edges using hop labels
- Cross-join start/end pairs and apply WHERE to filter valid paths
- Include intermediate nodes in `_materialize_filtered()` for multi-hop

Tests updated:
- Remove xfail from 3 tests that now pass:
  - test_reverse_direction_where_semantics
  - test_oracle_cudf_parity_comprehensive
  - test_multi_hop_edge_where_filtering
- 3 tests remain xfail for known oracle parity bugs (see #872)

Fixes part of #872.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The executor works with both pandas and cuDF DataFrames, so the name
was misleading. Renamed for clarity:

- cudf_executor.py → df_executor.py
- CuDFSamePathExecutor → DFSamePathExecutor
- test_cudf_executor_inputs.py → test_df_executor_inputs.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…cation

Session 3-5 bug fixes:
- Fix multi-hop path tracing in _apply_non_adjacent_where_post_prune
- Fix _filter_multihop_by_where hop column handling
- Fix reverse edge handling in _filter_edges_by_clauses
- Fix single-hop edge persistence after WHERE filtering
- Fix equality filtering when left_col == right_col (merge suffix)
- Fix edge filtering in _re_propagate_backward for multi-hop edges
- Add _filter_multihop_edges_by_endpoints helper for proper path tracing
- Add _find_multihop_start_nodes helper for backward propagation
- Add comprehensive undirected edge support throughout executor

Test amplification (37 new tests):
- 8 single-hop topology + cycle tests
- 3 unfiltered start tests (converted from xfail)
- 4 P0 reverse + multi-hop tests
- 3 P0 multiple starts tests
- 6 P1 operators × single-hop tests
- 6 P1 operators × multi-hop tests
- 2 P1 undirected + multi-hop tests
- 3 P1 mixed direction chain tests
- 4 P2 longer path tests
- 6 P2 edge case tests

All 78 tests pass, 2 skipped, 1 xfail (oracle limitation).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace Python set/dict intermediates with DataFrame operations throughout
the same-path executor to enable GPU (cuDF) compatibility:

Round 1 (prior commit):
- Remove BFS/DFS while loops
- Replace for...zip() iterations with merge operations
- Remove adjacency dict lookups

Round 2 (this commit):
- Replace dict(zip()) with DataFrame slice+rename
- Replace set(tolist()) tracking with DataFrame anti-joins
- Use pd.concat() + drop_duplicates() instead of set unions
- Use merge(..., indicator=True) for "not seen" logic

Key changes:
- _apply_non_adjacent_where_post_prune: vectorized value lookups
- _filter_multihop_edges_by_endpoints: DataFrame-based reachability
- _find_multihop_start_nodes: DataFrame anti-join for visited tracking
- _filter_multihop_by_where: DataFrame extraction for start/end nodes
- _materialize_filtered: DataFrame concat for allowed node collection

Remaining boundary issues documented in plan.md for future Round 3
(PathState refactor) and Round 4 (pay-as-you-go complexity).

All 91 tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add profiling infrastructure to measure executor performance across
different scenarios:
- Varying graph sizes (100 to 10K nodes)
- Simple vs multi-hop chains
- With and without WHERE clauses
- Sparse vs dense graphs

Initial findings:
- Multi-hop ~2x slower than simple queries (95-110ms vs 40-50ms)
- Graph size 100→10K only adds ~10ms (fixed costs dominate)
- WHERE clauses add minimal overhead
- Bottleneck is likely executor setup, not data processing

This helps inform optimization priorities for Round 3 (_PathState
refactor) and Round 4 (pay-as-you-go complexity).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add cProfile-based analysis to identify actual hotspots in query execution.

Key findings:
- Round 3 (_PathState refactor) is LOW PRIORITY - set operations
  are not the bottleneck, df_executor functions don't appear in top hotspots
- Oracle enumeration takes 38% of same-path executor time
- Legacy hop.py takes 75% of time in simple queries
- Multihop is FASTER than simple for large graphs (less data returned)
- Materialization (large result sets) dominates, not filtering

Updated profiling scenarios to include 100K+ nodes.
Extended plan.md with detailed cProfile results and insights.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add early return for empty allowed_nodes sets in _materialize_filtered
- Add early return for empty edges in _filter_edges_by_clauses
- Use original graph nodes in _apply_non_adjacent_where_post_prune
- Fix _find_multihop_start_nodes backward traversal join logic
- Add _run_native method as alias for _run_gpu
- Add 10 new tests for impossible/contradictory constraints

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove unused variable assignments (relevant_node_indices, edge_id_col, max_hop)
- Move binary operators to start of line (W504 compliance)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add type annotations for stack variable in enumerator.py
- Add type: ignore comments for iterrows() which returns ambiguous types
- Add isinstance(edge_op, ASTEdge) checks to narrow types before accessing ASTEdge attributes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Clarifies what IS and is NOT formally verified in Alloy model
- Hop ranges approximated by unrolling (not fully verified)
- Output slicing treated as post-filter
- References Python parity tests for unverified features
- Adds PLAN-846-852-feature-composition.md tracking document

See issues #871 (roadmap) and #872 (multi-hop bugs)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@lmeyerov lmeyerov force-pushed the feat/issue-838-alloy-fbf-where branch from 5429779 to e1a7cec Compare December 29, 2025 01:20
lmeyerov and others added 2 commits December 28, 2025 17:22
- Add ContradictoryWhere predicate to detect impossible constraint pairs
  (e.g., a.v < c.v AND a.v > c.v)
- Add ContradictoryWhereEmpty assertion verifying empty output for contradictions
- Update README with bugs found in PR #846 that inform future verification
- Reference issue #871 for P1/P2 verification roadmap items

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Attempted to add ContradictoryWhere assertion but found model's value
semantics don't cleanly support it (Eq checks intersection while Neq
checks empty intersection - these don't produce expected contradictions).

- Replace failed assertion with explanatory comment
- Update README to note limitation with pointer to Python test coverage
- TestImpossibleConstraints covers 10 contradictory constraint scenarios

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@lmeyerov lmeyerov force-pushed the feat/issue-837-cudf-hop-executor branch 2 times, most recently from b797e4e to 5f0f370 Compare January 2, 2026 08:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants