Conversation

@ajpotts
Contributor

@ajpotts ajpotts commented Dec 3, 2025

PR: Unified Indexing & Assignment for Arkouda Extension Arrays

This PR introduces a complete overhaul of __getitem__ and __setitem__ across all Arkouda-backed pandas ExtensionArray types:

  • ArkoudaArray
  • ArkoudaStringArray
  • ArkoudaCategoricalArray

The goal is to bring these classes into full alignment with pandas/NumPy indexing rules, while ensuring all operations are executed entirely on the Arkouda server (no unintended materialization, no to_numpy() or to_ndarray() inside assignment paths).


Summary of Changes

⭐ 1. Unified Indexer Normalization

All extension arrays now normalize Python / NumPy indexers the same way:

  • int, slice: forwarded directly to the underlying Arkouda object
  • NumPy integer arrays → ak.array(..., dtype=int64/uint64)
  • NumPy boolean masks → ak.array(..., dtype=bool)
  • Python lists of ints/bools → ak.array(...)
  • Arkouda pdarray indexers pass through unchanged
  • Empty indexers ([], np.array([], ...)) return an empty array of the same type as the array being indexed

This ensures consistency across array types and avoids breaking pandas’ internal indexing machinery.
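
As a rough illustration of the normalization rules listed above, here is a minimal sketch that uses NumPy arrays as a stand-in for the `ak.array(...)` results, so it runs without an Arkouda server. The helper name `normalize_indexer` is hypothetical and is not taken from this PR; the pass-through of Arkouda pdarray indexers is not modelled, and returning an empty int64 indexer for empty input is an assumption of the sketch.

```python
import numpy as np


def normalize_indexer(key):
    """Hypothetical helper: map a Python/NumPy indexer to a canonical form.

    NumPy arrays stand in for the ak.array(...) results described above.
    """
    # Scalars and slices are forwarded unchanged.
    if isinstance(key, (int, np.integer, slice)):
        return key
    # Python lists are converted to arrays before dtype inspection.
    if isinstance(key, list):
        key = np.asarray(key)
    if not isinstance(key, np.ndarray):
        raise TypeError(f"unsupported indexer type: {type(key).__name__}")
    # Empty indexers select nothing; use a well-typed empty integer indexer
    # (assumption of this sketch) so downstream indexing yields an empty result.
    if key.size == 0:
        return np.empty(0, dtype=np.int64)
    # Boolean masks are kept as boolean arrays.
    if key.dtype.kind == "b":
        return key
    # Unsigned and signed integer arrays are widened to 64 bits.
    if key.dtype.kind == "u":
        return key.astype(np.uint64)
    if key.dtype.kind == "i":
        return key.astype(np.int64)
    raise TypeError(f"unsupported indexer dtype: {key.dtype}")
```

With this sketch, `normalize_indexer([0, 2])` yields an int64 array, `normalize_indexer([True, False])` yields a boolean mask, and `normalize_indexer([])` yields an empty integer indexer that selects nothing.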


⭐ 2. New __getitem__ Implementations

ArkoudaArray

  • Mirrors pandas EA semantics exactly.
  • Returns a Python scalar for integer indexing and a new ArkoudaArray for everything else.
  • Docstring fully rewritten, with examples and a proper Raises section.
  • Now fully matches ExtensionArray.__getitem__ type signature.
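
The scalar-versus-array return rule described above can be sketched with a toy wrapper. This is only an illustration under stated assumptions: `ToyArray` is a hypothetical class backed by NumPy, not the ArkoudaArray implementation from this PR, and it ignores the server-side execution that the real class relies on.

```python
import numpy as np


class ToyArray:
    """Hypothetical NumPy-backed stand-in for an extension array."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        # Integer indexing returns a plain Python scalar.
        if isinstance(key, (int, np.integer)):
            return self._data[key].item()
        # Lists are converted so that masks and positions behave like arrays.
        if isinstance(key, list):
            key = np.asarray(key)
            # An empty list selects nothing: return an empty array of the same type.
            if key.size == 0:
                return ToyArray(self._data[:0])
        # Slices, integer arrays, and boolean masks return a new wrapped array.
        return ToyArray(self._data[key])


arr = ToyArray([10, 20, 30, 40])
scalar = arr[1]                                # Python int
window = arr[1:3]                              # ToyArray of length 2
masked = arr[[True, False, True, False]]       # ToyArray of length 2
empty = arr[[]]                                # ToyArray of length 0
```

The design point is that only integer keys unwrap to a scalar; every other indexer form keeps the result inside the wrapper type.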

ArkoudaStringArray

  • Correctly distinguishes scalar vs non-scalar returns.
  • Ensures doctest examples return object dtype arrays when round-tripped through NumPy.
  • Handles all indexer forms without falling through to unsupported Arkouda paths.

ArkoudaCategoricalArray

  • Returns a scalar Python string for scalar access and a wrapped categorical EA otherwise.
  • Empty indexers produce an empty categorical of type str_.

Closes #5105: ArkoudaExtensionArray.__getitem__

@ajpotts ajpotts force-pushed the 5105_ArkoudaExtensionArray_getitem branch from 4ed7c8b to 27a4bef Compare December 3, 2025 21:22
@ajpotts ajpotts marked this pull request as ready for review December 5, 2025 11:45
Collaborator

@1RyanK 1RyanK left a comment

Looks good!

@ajpotts ajpotts force-pushed the 5105_ArkoudaExtensionArray_getitem branch from 27a4bef to 78de756 Compare December 15, 2025 21:57
@ajpotts ajpotts force-pushed the 5105_ArkoudaExtensionArray_getitem branch from 7e7c2f3 to 8fabbbb Compare December 29, 2025 18:38
@ajpotts ajpotts force-pushed the 5105_ArkoudaExtensionArray_getitem branch from 8fabbbb to b50789d Compare January 13, 2026 19:43
@codecov
codecov bot commented Jan 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@be6b999).

Additional details and impacted files
@@           Coverage Diff            @@
##             main     #5106   +/-   ##
========================================
  Coverage        ?   100.00%           
========================================
  Files           ?         4           
  Lines           ?        63           
  Branches        ?         0           
========================================
  Hits            ?        63           
  Misses          ?         0           
  Partials        ?         0           

@ajpotts ajpotts force-pushed the 5105_ArkoudaExtensionArray_getitem branch from b50789d to a20ae1f Compare January 16, 2026 17:28
@ajpotts ajpotts added the blocking This is blocking a developer from completing a task they are actively working. label Jan 16, 2026
@ajpotts ajpotts force-pushed the 5105_ArkoudaExtensionArray_getitem branch from 75aade4 to 7d55305 Compare January 16, 2026 21:24

Labels

blocking This is blocking a developer from completing a task they are actively working.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ArkoudaExtensionArray.__getitem__

2 participants