@ajpotts ajpotts commented Jan 9, 2026

SegArray NumPy Alignment Test Suite

Summary

This PR adds and refines a NumPy alignment test suite for SegArray, exercising behavior against
well-defined NumPy-style reference semantics. The goal is to clearly distinguish:

  • Correct behavior
  • Known divergences / bugs (explicitly marked with xfail)
  • Future work needed for full alignment

This improves test coverage while keeping known, documented divergences from showing up as unexpected test failures.


What’s Included

1. SegArray alignment tests

New/expanded tests in:

tests/numpy/alignment_verification/segarray_alignment.py

cover:

  • Concatenation (concat)
  • Uniqueness (unique)
  • Set operations (intersect, union, setdiff, setxor)
  • Aggregations (sum, etc.)

Each test compares Arkouda results against a NumPy-style reference implementation.
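
A NumPy-style reference of this kind can be sketched by modeling a segmented array as a plain list of 1-D NumPy arrays. The helper names below (ref_concat, assert_segments_equal) are hypothetical and are not taken from the test file. The axis semantics are also an assumption: axis=0 appends segments, axis=1 joins corresponding segments.

```python
import numpy as np


def ref_concat(seg_lists, axis=0):
    """Reference concat over lists of 1-D arrays (hypothetical helper)."""
    if axis == 0:
        # Append the segments of each input one after another.
        return [seg for segs in seg_lists for seg in segs]
    if axis == 1:
        # Join the i-th segment of every input into one longer segment.
        n = len(seg_lists[0])
        if any(len(segs) != n for segs in seg_lists):
            raise ValueError("axis=1 requires the same number of segments")
        return [np.concatenate(parts) for parts in zip(*seg_lists)]
    raise ValueError("axis must be 0 or 1")


def assert_segments_equal(actual, expected):
    """Compare two segmented results segment by segment."""
    assert len(actual) == len(expected)
    for a, e in zip(actual, expected):
        np.testing.assert_array_equal(np.asarray(a), np.asarray(e))


a = [np.array([1, 2]), np.array([3])]
b = [np.array([4]), np.array([5, 6])]
rows = ref_concat([a, b], axis=0)  # four segments
cols = ref_concat([a, b], axis=1)  # two segments: [1, 2, 4] and [3, 5, 6]
```

In a real test the "actual" side would be the Arkouda result converted to host-side lists before the comparison.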


Known Issues Marked as xfail

The following behaviors are currently known issues and are explicitly marked as expected failures:

A. String concatenation

  • SegArray.concat(axis=1) fails for string dtype
  • Root cause: lack of multidimensional Strings support

B. unique() with empty segments

  • SegArray.unique() fails or errors when empty segments are present
  • Caused by invalid segment broadcasting
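
The reference expectation for this case can be sketched as follows: uniqueness is applied per segment, and an empty segment stays empty, so the number of segments is preserved. The helper ref_unique is hypothetical, and sorting within each segment is a property of np.unique, not something this PR states about Arkouda.

```python
import numpy as np


def ref_unique(segments):
    """Per-segment unique values; empty segments remain empty (hypothetical helper)."""
    return [np.unique(seg) for seg in segments]


segs = [
    np.array([3, 1, 3], dtype=np.int64),
    np.array([], dtype=np.int64),  # empty segment
    np.array([2, 2], dtype=np.int64),
]
result = ref_unique(segs)  # three segments: [1, 3], [], [2]
```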

C. Set-ops constructing invalid segments

  • Some set-ops construct segment labels instead of offsets
  • Leads to ValueError: Segments must be unique and in sorted order
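
To illustrate the difference between the two representations, here is a minimal NumPy sketch. Offsets hold one start index per segment, while labels hold one segment index per value and therefore repeat. The strictly_increasing check is only a stand-in for the "unique and in sorted order" requirement quoted in the error message; it is not Arkouda's actual validation.

```python
import numpy as np

lengths = np.array([2, 1, 3])

# Offsets: start index of each segment in the flat values array.
offsets = np.concatenate(([0], np.cumsum(lengths)[:-1]))  # [0 2 3]

# Labels: for each value, the index of the segment it belongs to.
labels = np.repeat(np.arange(len(lengths)), lengths)  # [0 0 1 2 2 2]


def strictly_increasing(x):
    """Stand-in check for 'unique and sorted' (illustrative only)."""
    return bool(np.all(np.diff(x) > 0))


offsets_ok = strictly_increasing(offsets)  # True
labels_ok = strictly_increasing(labels)    # False: labels repeat
```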

D. Set-ops ordering mismatch vs NumPy

  • Arkouda preserves stable/input order
  • NumPy returns sorted unique values
  • Membership matches; ordering differs
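
The mismatch can be reproduced with plain NumPy. In this sketch, np.union1d gives the sorted NumPy result, and stable_union is a hypothetical stand-in for an order-preserving union (it is not Arkouda's implementation). Comparing as sets shows that membership agrees even though the order does not.

```python
import numpy as np


def stable_union(a, b):
    """Order-preserving union: first occurrence wins (illustrative only)."""
    seen = dict.fromkeys(np.concatenate((a, b)).tolist())
    return np.array(list(seen))


a = np.array([3, 1, 2])
b = np.array([2, 5, 1])

sorted_result = np.union1d(a, b)    # [1 2 3 5]
stable_result = stable_union(a, b)  # [3 1 2 5]

same_members = set(sorted_result.tolist()) == set(stable_result.tolist())
same_order = sorted_result.tolist() == stable_result.tolist()
```

A test that only cares about membership could sort both sides before comparing; a test that asserts exact order would fail on this input.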

E. Aggregations drop empty segments

  • SegArray.sum() (and similar) drop empty segments
  • Expected: one result per segment (e.g. sum([]) == 0)
  • Actual: empty segments omitted
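
A small NumPy sketch of the expected contract: summing per segment yields one value per segment, and an empty segment contributes 0. The list-of-arrays model and the variable names are illustrative assumptions, not code from the test suite.

```python
import numpy as np

segs = [
    np.array([1, 2], dtype=np.int64),
    np.array([], dtype=np.int64),  # empty segment
    np.array([3, 4], dtype=np.int64),
]

# Expected: one result per segment, with sum([]) == 0.
expected = np.array([seg.sum() for seg in segs])  # [3 0 7]

# The divergence described above corresponds to skipping empty segments.
dropped = np.array([seg.sum() for seg in segs if seg.size > 0])  # [3 7]
```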

All of the above are documented inline with pytest.xfail(...) and explanatory messages.


Why This PR Is Useful

  • Makes NumPy alignment expectations explicit
  • Prevents regressions by locking in current behavior
  • Separates correctness bugs from API-contract decisions
  • Enables incremental fixes without rewriting tests

Follow-up Work (Out of Scope)

  • Decide and document ordering semantics for set-ops
  • Fix aggregation behavior for empty segments
  • Implement multidimensional Strings
  • Normalize segment construction in set-ops

Notes for Reviewers

  • All xfails are narrow and intentional
  • Removing an xfail should be paired with a corresponding fix
  • No production behavior is changed in this PR

Closes #5278: alignment tests for arkouda.numpy.segarray

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@168089e).

Additional details and impacted files
@@           Coverage Diff            @@
##             main     #5284   +/-   ##
========================================
  Coverage        ?   100.00%           
========================================
  Files           ?         4           
  Lines           ?        63           
  Branches        ?         0           
========================================
  Hits            ?        63           
  Misses          ?         0           
  Partials        ?         0           
