@ericpan64 ericpan64 commented Nov 26, 2025

Refactors chidian to be more focused on what it was created to do: dict-like transformations that make code more readable and usable.

Incorporates some old patterns and learnings from pydian. Removes the tabular abstractions and adds back a validation structure.


Summary by cubic

Reset chidian to focus on simple dict-to-dict transformations. Adds a lightweight API (grab, mapper, DROP/KEEP, validation) and removes table/DSL/legacy mapper to ship 0.2.0.

  • New Features

    • grab(path) for nested access with apply functions; strict mode via mapping_context (raises KeyError on missing keys).
    • mapper decorator that auto-processes DROP, KEEP, and removes empty values by default.
    • DROP sentinel (with upward propagation) and KEEP wrapper to preserve empty values; process_output utility.
    • Validation module: V, DictV, ListV, Required/Optional, comparison/string validators, validate(), to_pydantic(); Ok/Err result types.
    • Cleaner README and updated test suite; version bumped to 0.2.0.
  • Migration

    • get() renamed to grab(); put() removed.
    • Replace the old Mapper class and ValidationMode with @mapper and the new validation module.
    • Remove Table, Lexicon, DSL parsers, partials, and related helpers.
    • Update imports: from chidian import grab, mapper, mapping_context, DROP, KEEP, process_output.
    • Dependencies trimmed; pandas/polars and pyarrow support removed.

Written for commit 694acea. Summary will update automatically on new commits.

ericpan64 and others added 21 commits November 25, 2025 13:48
- Remove table module and all functionality
- Remove lexicon module and DSL parsers
  - Deletes filter and select PEG grammars
  - Removes parser implementations
- Update project configuration
  - Modifies pyproject.toml dependencies
  - Updates uv.lock with fewer dependencies
- Strip README documentation
- Clean up package exports in __init__.py
- Add KEEP wrapper to preserve empty values
  - Prevents removal of {}, [], '', None
  - Implements __init__, __repr__, __eq__
- Export KEEP from main package
- Document KEEP usage in README
- Add process_output function
  - Combines DROP, KEEP, and empty removal
  - Accepts remove_empty parameter (default True)
- Update README examples to use process_output
  - Replace process_drops with process_output
  - Add remove_empty option documentation
- Export process_output in __init__.py
- Add @mapper decorator for data transformations
  - Auto-processes DROP, KEEP, and empty values
  - Supports remove_empty parameter
- Update README with decorator-based examples
  - Replace direct process_output calls
  - Show composable mapping functions
- Export mapper from chidian package
- Add context.py with mapping_context context manager using contextvars
- Update grab() to check strict mode from context
- In strict mode, missing keys raise KeyError instead of returning None
- Distinguishes between 'key not found' and 'key exists with None value'
- Update README with strict mode documentation
- Remove mapper.py (old Mapper class, replaced by @mapper decorator)
- Remove partials.py (old FunctionChain/ChainableFunction utilities)
- Remove lib/data_mapping_helpers.py (old helper functions)
- Add test_grab.py for grab() function (renamed from get)
- Add test_mapper_new.py for @mapper, DROP, KEEP, mapping_context
- Update test_lib.py to use grab
- Remove old test files: test_get.py, test_put.py, test_mapper.py,
  test_partials.py, test_data_mapping.py, test_property_based.py,
  test_types.py, structstest.py

All 46 tests passing
- Fix tests path link
- Document actual error type (KeyError) in strict mode
- Add Ok/Err result types as frozen dataclasses
- Add CheckFn, Path, ValidationError type aliases
- Add V dataclass with __and__/__or__ composition operators
- Add DictV for nested dict validation with field validators
- Add ListV for list validation with item validators
- Add to_validator() dispatch function for type coercion
- Required, Optional for presence validation
- IsType for type checking
- InRange, MinLength, MaxLength for length constraints
- InSet for enum-like validation
- Matches for regex patterns
- Predicate for custom functions
- Eq, Gt, Gte, Lt, Lte, Between for comparisons
- V: base validator with check function, composition via & and |
- DictV: nested dict validation with field validators
- ListV: list validation with item validator and length constraints
- to_validator(): type coercion dispatch function
- Required/Optional: presence modifiers
- IsType: type instance checking
- InRange/MinLength/MaxLength: length constraints
- InSet: enum-like validation
- Matches: regex pattern validation
- Predicate: custom predicate functions
- Eq/Gt/Gte/Lt/Lte/Between: comparison validators
- validate(): validate dict data against schema, returns Ok/Err
- to_pydantic(): compile schema to Pydantic BaseModel subclass
Exports: Ok, Err, V, DictV, ListV, to_validator, Required, Optional,
IsType, InRange, MinLength, MaxLength, InSet, Matches, Predicate,
Eq, Gt, Gte, Lt, Lte, Between, validate, to_pydantic
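
The commit messages above mention validators that compose with `&` and `|` and frozen `Ok`/`Err` result types. As a rough illustration of that pattern only, the sketch below defines a hypothetical `Check` class; its fields, the `run` method, and the shape of `Ok`/`Err` are assumptions and are not taken from the chidian source.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Ok:
    value: Any


@dataclass(frozen=True)
class Err:
    message: str


@dataclass(frozen=True)
class Check:
    """Hypothetical stand-in for a composable validator."""

    fn: Callable[[Any], bool]
    label: str

    def __and__(self, other: "Check") -> "Check":
        # Both checks must pass; `and` short-circuits, so `other` is
        # only evaluated when `self` has already succeeded.
        return Check(lambda v: self.fn(v) and other.fn(v),
                     f"({self.label} & {other.label})")

    def __or__(self, other: "Check") -> "Check":
        # Either check may pass.
        return Check(lambda v: self.fn(v) or other.fn(v),
                     f"({self.label} | {other.label})")

    def run(self, value: Any):
        return Ok(value) if self.fn(value) else Err(f"failed: {self.label}")


is_str = Check(lambda v: isinstance(v, str), "is_str")
non_empty = Check(lambda v: len(v) > 0, "non_empty")
is_none = Check(lambda v: v is None, "is_none")

# A non-empty string, or an explicit None.
name_check = (is_str & non_empty) | is_none
```

Because `&` short-circuits, `non_empty` never calls `len` on a non-string, so `name_check.run(5)` returns an `Err` rather than raising.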

@cubic-dev-ai cubic-dev-ai bot left a comment

13 issues found across 42 files

Prompt for AI agents (all 13 issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="chidian/decorator.py">

<violation number="1" location="chidian/decorator.py:11">
The mapper signature uses the PEP 604 `Callable | None` union without a `from __future__ import annotations`. On the project’s supported Python 3.8/3.9 versions the annotation is evaluated when the function is defined and raises a TypeError, so the module cannot be imported. Use `Optional[Callable]` (and import Optional) or add the future import so the file can be imported on supported interpreters.</violation>
</file>

<file name="chidian/process.py">

<violation number="1" location="chidian/process.py:61">
KEEP currently bypasses DROP processing entirely, so DROP sentinels inside KEEP-wrapped containers leak through instead of being applied.</violation>
</file>

<file name="chidian/validation/schema.py">

<violation number="1" location="chidian/validation/schema.py:81">
`match` statements require Python 3.10+, but this module is supposed to run on Python 3.8+, so it will not even import on the supported runtimes.</violation>

<violation number="2" location="chidian/validation/schema.py:88">
Returning `dict[str, Any]` / `dict[str, Any] | None` at runtime requires Python 3.9/3.10 features, so `_extract_pydantic_field` will crash on Python 3.8 even though that version is supported.</violation>

<violation number="3" location="chidian/validation/schema.py:93">
Using `list[item_type]` / `list[item_type] | None` at runtime requires Python 3.9/3.10 features, so optional/required list fields cannot be generated under the supported Python 3.8 runtime.</violation>
</file>

<file name="chidian/validation/types.py">

<violation number="1" location="chidian/validation/types.py:16">
`dataclass(slots=True)` requires Python 3.10+, but the project promises Python ≥3.8, so this module will crash during import on 3.8/3.9.</violation>

<violation number="2" location="chidian/validation/types.py:44">
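`tuple[str | int, ...]` relies on runtime features added in Python 3.9 (subscripting `tuple`, PEP 585) and Python 3.10 (`str | int`, PEP 604), so this module-level assignment raises a TypeError at import on Python 3.8 and 3.9 despite the declared support range.</violation>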

<violation number="3" location="chidian/validation/types.py:45">
`tuple[Path, str]` depends on PEP 585 built-in generics, which do not exist on Python 3.8, so this assignment crashes at import on the supported interpreter versions.</violation>

<violation number="4" location="chidian/validation/types.py:46">
`list[ValidationError]` relies on Python 3.9’s PEP 585 generics, so the module cannot be imported on Python 3.8, contradicting the stated support range.</violation>
</file>

<file name="README.md">

<violation number="1" location="README.md:70">
`patient_summary` cannot be chained with `normalize_user` as written because the latter does not output the `data.*` structure `patient_summary` expects; the example will raise a KeyError.</violation>
</file>

<file name="chidian/core.py">

<violation number="1" location="chidian/core.py:31">
Docstring incorrectly states that strict mode raises ValueError for missing paths even though traverse_path actually propagates KeyError/IndexError/TypeError, which misleads callers about which exceptions to handle.</violation>
</file>

<file name="chidian/drop.py">

<violation number="1" location="chidian/drop.py:98">
DROP.PARENT used directly in a list removes only the list instead of its parent container, contradicting the documented semantics.</violation>

<violation number="2" location="chidian/drop.py:121">
DROP.GRANDPARENT and higher applied directly inside a list only remove ancestors one level too shallow because the propagation subtracts two levels.</violation>
</file>


from .process import process_output


def mapper(_func: Callable | None = None, *, remove_empty: bool = True) -> Callable:

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

The mapper signature uses the PEP 604 Callable | None union without a from __future__ import annotations. On the project’s supported Python 3.8/3.9 versions the annotation is evaluated when the function is defined and raises a TypeError, so the module cannot be imported. Use Optional[Callable] (and import Optional) or add the future import so the file can be imported on supported interpreters.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/decorator.py, line 11:

<comment>The mapper signature uses the PEP 604 `Callable | None` union without a `from __future__ import annotations`. On the project’s supported Python 3.8/3.9 versions the annotation is evaluated when the function is defined and raises a TypeError, so the module cannot be imported. Use `Optional[Callable]` (and import Optional) or add the future import so the file can be imported on supported interpreters.</comment>

<file context>
@@ -0,0 +1,52 @@
+from .process import process_output
+
+
+def mapper(_func: Callable | None = None, *, remove_empty: bool = True) -> Callable:
+    """
+    Decorator that transforms a mapping function into a callable mapper.
</file context>
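
As a sketch of the fix suggested in this comment, the block below writes a decorator with the same keyword-only `remove_empty` option using `typing.Optional`, which evaluates without error on Python 3.8 and 3.9. `mapper_sketch` and `strip_empty` are hypothetical names, and the empty-value rule is a simplification rather than chidian's actual `process_output` logic.

```python
import functools
from typing import Any, Callable, Optional


def strip_empty(d: dict) -> dict:
    """Hypothetical post-processing step: drop top-level empty values."""
    return {k: v for k, v in d.items() if v not in (None, "", [], {})}


def mapper_sketch(_func: Optional[Callable] = None, *, remove_empty: bool = True) -> Callable:
    def decorate(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            result = func(*args, **kwargs)
            return strip_empty(result) if remove_empty else result
        return wrapper

    # Bare use (@mapper_sketch) passes the function directly;
    # parameterized use (@mapper_sketch(remove_empty=False)) passes None.
    return decorate if _func is None else decorate(_func)


@mapper_sketch
def to_contact(src: dict) -> dict:
    return {"name": src.get("name"), "phone": src.get("phone")}


@mapper_sketch(remove_empty=False)
def to_contact_raw(src: dict) -> dict:
    return {"name": src.get("name"), "phone": src.get("phone")}
```

Adding `from __future__ import annotations` at the top of the module would be the other option named in the comment; it defers annotation evaluation so the `|` form is never executed at definition time.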

    raise _DropSignal(data.value)

# Handle KEEP wrapper - unwrap and mark as preserved
if isinstance(data, KEEP):

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

KEEP currently bypasses DROP processing entirely, so DROP sentinels inside KEEP-wrapped containers leak through instead of being applied.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/process.py, line 61:

<comment>KEEP currently bypasses DROP processing entirely, so DROP sentinels inside KEEP-wrapped containers leak through instead of being applied.</comment>

<file context>
@@ -0,0 +1,153 @@
+        raise _DropSignal(data.value)
+
+    # Handle KEEP wrapper - unwrap and mark as preserved
+    if isinstance(data, KEEP):
+        return data.value  # Return the wrapped value as-is, skip empty check
+
</file context>

✅ Addressed in 694acea
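
One way to picture the behavior this comment asks for is a cleaner that unwraps the preserving wrapper and still recurses into its contents, so sentinels inside are applied while empty values are kept. The sketch below is an assumption-laden simplification: `Keep`, `DROP_ENTRY`, and `clean` are hypothetical names, the sentinel removes only its own entry (no upward propagation), and preservation is assumed to extend to everything inside the wrapper.

```python
class Keep:
    """Hypothetical wrapper: the wrapped value survives empty-value removal."""

    def __init__(self, value):
        self.value = value


DROP_ENTRY = object()  # simplified sentinel: removes only the entry holding it


def _is_empty(x):
    return x is None or (isinstance(x, (str, list, dict)) and len(x) == 0)


def _kept(pairs, preserve):
    for key, value in pairs:
        if value is DROP_ENTRY:
            continue  # sentinels are honored even inside a Keep
        cleaned = clean(value, preserve)
        if preserve or isinstance(value, Keep) or not _is_empty(cleaned):
            yield key, cleaned


def clean(node, preserve=False):
    if isinstance(node, Keep):
        # Unwrap, then keep recursing instead of returning the value as-is.
        return clean(node.value, preserve=True)
    if isinstance(node, dict):
        return dict(_kept(node.items(), preserve))
    if isinstance(node, list):
        return [value for _, value in _kept(enumerate(node), preserve)]
    return node


data = {
    "a": "",
    "b": Keep({"x": DROP_ENTRY, "y": None}),
    "c": {"d": []},
    "e": Keep([]),
}
result = clean(data)
```

Here `"x"` is dropped even though it sits inside a `Keep`, while the empty `"y"` and `"e"` values survive.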

case ListV(required=req, items=items):
    item_type, _ = _extract_pydantic_field(items)
    if req:
        return (list[item_type], ...)  # type: ignore[valid-type]

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

Using list[item_type] / list[item_type] | None at runtime requires Python 3.9/3.10 features, so optional/required list fields cannot be generated under the supported Python 3.8 runtime.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/validation/schema.py, line 93:

<comment>Using `list[item_type]` / `list[item_type] | None` at runtime requires Python 3.9/3.10 features, so optional/required list fields cannot be generated under the supported Python 3.8 runtime.</comment>

<file context>
@@ -0,0 +1,96 @@
+        case ListV(required=req, items=items):
+            item_type, _ = _extract_pydantic_field(items)
+            if req:
+                return (list[item_type], ...)  # type: ignore[valid-type]
+            return (list[item_type] | None, None)  # type: ignore[valid-type]
+
</file context>
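
A possible 3.8-compatible form of the reviewed return values uses `typing.List` and `typing.Optional` in place of `list[...]` and `| None`. The helper name `list_field` is hypothetical; only the `(type, default)` tuple shape is taken from the snippet above.

```python
from typing import Any, List, Optional, Tuple


def list_field(item_type: Any, required: bool) -> Tuple[Any, Any]:
    """Return a (type, default) pair in the shape used by the reviewed code."""
    if required:
        # Ellipsis marks "no default" in the reviewed snippet.
        return (List[item_type], ...)
    return (Optional[List[item_type]], None)


required_field = list_field(int, required=True)
optional_field = list_field(str, required=False)
```

Both `typing` forms are subscriptable at runtime on Python 3.8, which is the property the built-in `list[...]` lacks there.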

    return (TypingOptional[t or Any], None)
case DictV(required=req):
    if req:
        return (dict[str, Any], ...)

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

Returning dict[str, Any] / dict[str, Any] | None at runtime requires Python 3.9/3.10 features, so _extract_pydantic_field will crash on Python 3.8 even though that version is supported.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/validation/schema.py, line 88:

<comment>Returning `dict[str, Any]` / `dict[str, Any] | None` at runtime requires Python 3.9/3.10 features, so `_extract_pydantic_field` will crash on Python 3.8 even though that version is supported.</comment>

<file context>
@@ -0,0 +1,96 @@
+            return (TypingOptional[t or Any], None)
+        case DictV(required=req):
+            if req:
+                return (dict[str, Any], ...)
+            return (dict[str, Any] | None, None)
+        case ListV(required=req, items=items):
</file context>


def _extract_pydantic_field(v: V | DictV | ListV) -> tuple[Any, Any]:
    """Extract Pydantic field type and default from validator."""
    match v:

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

match statements require Python 3.10+, but this module is supposed to run on Python 3.8+, so it will not even import on the supported runtimes.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/validation/schema.py, line 81:

<comment>`match` statements require Python 3.10+, but this module is supposed to run on Python 3.8+, so it will not even import on the supported runtimes.</comment>

<file context>
@@ -0,0 +1,96 @@
+
+def _extract_pydantic_field(v: V | DictV | ListV) -> tuple[Any, Any]:
+    """Extract Pydantic field type and default from validator."""
+    match v:
+        case V(required=True, type_hint=t):
+            return (t or Any, ...)
</file context>
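
If Python 3.8/3.9 support is kept, the `match` dispatch could be rewritten as an `isinstance` chain. The sketch below assumes simplified stand-ins (`FieldV`, `DictFieldV`) for the real validator classes, whose actual fields may differ, and uses `typing` generics throughout.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple, Union


@dataclass
class FieldV:  # hypothetical stand-in for V
    required: bool = False
    type_hint: Optional[type] = None


@dataclass
class DictFieldV:  # hypothetical stand-in for DictV
    required: bool = False


def extract_field(v: Union[FieldV, DictFieldV]) -> Tuple[Any, Any]:
    """Return (type, default) using isinstance checks instead of match."""
    if isinstance(v, FieldV):
        base = v.type_hint or Any
        return (base, ...) if v.required else (Optional[base], None)
    if isinstance(v, DictFieldV):
        base = Dict[str, Any]
        return (base, ...) if v.required else (Optional[base], None)
    raise TypeError(f"unsupported validator: {type(v).__name__}")
```

The alternative is to raise the declared minimum to Python 3.10, in which case the original `match` statement and built-in generics are fine as written.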


# Type aliases
CheckFn = Callable[[Any], bool]
Path = tuple[str | int, ...]

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

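tuple[str | int, ...] relies on runtime features added in Python 3.9 (subscripting tuple, PEP 585) and Python 3.10 (str | int, PEP 604), so this module-level assignment raises a TypeError at import on Python 3.8 and 3.9 despite the declared support range.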

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/validation/types.py, line 44:

<comment>`tuple[str | int, ...]` relies on runtime features added in Python 3.9 (subscripting `tuple`, PEP 585) and Python 3.10 (`str | int`, PEP 604), so this module-level assignment raises a TypeError at import on Python 3.8 and 3.9 despite the declared support range.</comment>

<file context>
@@ -0,0 +1,46 @@
+
+# Type aliases
+CheckFn = Callable[[Any], bool]
+Path = tuple[str | int, ...]
+ValidationError = tuple[Path, str]
+ValidationErrors = list[ValidationError]
</file context>
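
For reference, the same four aliases can be spelled with `typing` generics, which evaluate at import time on Python 3.8. This is a sketch of one possible fix, not the change that was merged; the sample `errors` value is invented for illustration.

```python
from typing import Any, Callable, List, Tuple, Union

# Type aliases, written so that module-level evaluation works on 3.8+
CheckFn = Callable[[Any], bool]
Path = Tuple[Union[str, int], ...]
ValidationError = Tuple[Path, str]
ValidationErrors = List[ValidationError]

# Illustrative value: one error at users[0].email
errors: ValidationErrors = [(("users", 0, "email"), "missing")]
```

Note that `from __future__ import annotations` would not help here, because alias assignments are ordinary expressions rather than annotations.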

README.md Outdated
from myproject.mappings import normalize_user, patient_summary

# Chain mappings
result = patient_summary(normalize_user(raw_data))

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

patient_summary cannot be chained with normalize_user as written because the latter does not output the data.* structure patient_summary expects; the example will raise a KeyError.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At README.md, line 70:

<comment>`patient_summary` cannot be chained with `normalize_user` as written because the latter does not output the `data.*` structure `patient_summary` expects; the example will raise a KeyError.</comment>

<file context>
@@ -1,129 +1,289 @@
+from myproject.mappings import normalize_user, patient_summary
+
+# Chain mappings
+result = patient_summary(normalize_user(raw_data))
+```
+
</file context>

✅ Addressed in 694acea

chidian/core.py Outdated
    Value at path or default if not found
Raises:
    ValueError: In strict mode (via mapping_context), if path not found

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

Docstring incorrectly states that strict mode raises ValueError for missing paths even though traverse_path actually propagates KeyError/IndexError/TypeError, which misleads callers about which exceptions to handle.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/core.py, line 31:

<comment>Docstring incorrectly states that strict mode raises ValueError for missing paths even though traverse_path actually propagates KeyError/IndexError/TypeError, which misleads callers about which exceptions to handle.</comment>

<file context>
@@ -1,48 +1,57 @@
         Value at path or default if not found
+
+    Raises:
+        ValueError: In strict mode (via mapping_context), if path not found
+
+    Note:
</file context>
Suggested change
ValueError: In strict mode (via mapping_context), if path not found
KeyError | IndexError | TypeError: In strict mode, traverse_path propagates the underlying missing-path error

✅ Addressed in 694acea
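
The three exception types named in this comment are what plain Python indexing raises when a traversal step fails. The sketch below uses an unguarded `walk` helper (a hypothetical stand-in, not chidian's `traverse_path`) to show which failure produces which exception and how a caller might catch all three.

```python
def walk(data, keys):
    """Index step by step with no guarding, as a strict traversal might."""
    for key in keys:
        data = data[key]
    return data


doc = {"users": [{"name": "Ada", "address": None}]}


def failure_kind(keys):
    """Name the exception raised for a path, or 'ok' if it resolves."""
    try:
        walk(doc, keys)
    except (KeyError, IndexError, TypeError) as exc:
        return type(exc).__name__
    return "ok"


missing_key = failure_kind(["users", 0, "email"])             # dict key absent
bad_index = failure_kind(["users", 3])                        # list index out of range
through_none = failure_kind(["users", 0, "address", "city"])  # indexing into None
```

A caller that only catches `ValueError`, as the original docstring implied, would miss all three of these cases.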

chidian/drop.py Outdated
    raise _DropSignal(0)
else:
    # GRANDPARENT or higher - propagate up
    raise _DropSignal(item.value - 2)

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

DROP.GRANDPARENT and higher applied directly inside a list only remove ancestors one level too shallow because the propagation subtracts two levels.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/drop.py, line 121:

<comment>DROP.GRANDPARENT and higher applied directly inside a list only remove ancestors one level too shallow because the propagation subtracts two levels.</comment>

<file context>
@@ -0,0 +1,137 @@
+                raise _DropSignal(0)
+            else:
+                # GRANDPARENT or higher - propagate up
+                raise _DropSignal(item.value - 2)
+
+        try:
</file context>

✅ Addressed in 694acea

    pass
elif signal.levels == 1:
    # Remove this dict from its parent
    raise _DropSignal(0)

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

DROP.PARENT used directly in a list removes only the list instead of its parent container, contradicting the documented semantics.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/drop.py, line 98:

<comment>DROP.PARENT used directly in a list removes only the list instead of its parent container, contradicting the documented semantics.</comment>

<file context>
@@ -0,0 +1,137 @@
+                pass
+            elif signal.levels == 1:
+                # Remove this dict from its parent
+                raise _DropSignal(0)
+            else:
+                # Propagate further up
</file context>

✅ Addressed in 694acea
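
Both drop.py comments come down to lists and dicts counting levels differently. One way to avoid that class of bug is to treat every container identically: each container that catches a signal either omits the child (count exhausted) or decrements the count and re-raises. The sketch below illustrates that idea with its own numbering, where `Drop(0)` removes only the entry, `Drop(1)` removes the container holding the marker, and `Drop(2)` removes that container's parent. The names and numbering are assumptions and do not mirror chidian's `DROP` enum values.

```python
class Drop:
    """Hypothetical marker: remove `levels` enclosing containers."""

    def __init__(self, levels):
        self.levels = levels


class _DropSignal(Exception):
    def __init__(self, levels):
        super().__init__(levels)
        self.levels = levels


def prune(node):
    if isinstance(node, Drop):
        raise _DropSignal(node.levels)
    if not isinstance(node, (dict, list)):
        return node
    pairs = node.items() if isinstance(node, dict) else enumerate(node)
    kept = []
    for key, child in pairs:
        try:
            kept.append((key, prune(child)))
        except _DropSignal as signal:
            if signal.levels > 0:
                # This container is still on the removal path: pass the
                # signal up with one fewer level, for dicts and lists alike.
                raise _DropSignal(signal.levels - 1) from None
            # levels == 0: omit just this child and carry on.
    return dict(kept) if isinstance(node, dict) else [v for _, v in kept]


def apply_drops(root):
    """Return the pruned structure, or None if the root itself is dropped."""
    try:
        return prune(root)
    except _DropSignal:
        return None


data = {
    "keep": 1,
    "items": [{"id": 1}, {"id": 2, "bad": Drop(1)}],
    "meta": {"tags": ["x", Drop(2)]},
}
pruned = apply_drops(data)
```

In this example the marker placed directly inside the `tags` list removes the list's parent (`meta`), which is the list case the two comments flag as off by one.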

- Update version in pyproject.toml
- Update README.md documentation
  - Remove ambiguous examples
- Modify core.py and drop.py modules
  - KEEP now processes DROP sentinels
- Fixed DROP-level propagation

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 6 files (reviewed changes from recent commits).

Prompt for AI agents (all 1 issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="chidian/core.py">

<violation number="1" location="chidian/core.py:33">
`grab` still raises `ValueError` in strict mode (invalid path syntax or traversing None), but the updated Raises section now claims only KeyError/IndexError/TypeError are possible, which misleads callers about the function’s behavior.</violation>
</file>


Raises:
    KeyError: In strict mode, if a dict key is not found
    IndexError: In strict mode, if a list index is out of range
    TypeError: In strict mode, if a type mismatch occurs during traversal

@cubic-dev-ai cubic-dev-ai bot Nov 26, 2025

grab still raises ValueError in strict mode (invalid path syntax or traversing None), but the updated Raises section now claims only KeyError/IndexError/TypeError are possible, which misleads callers about the function’s behavior.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At chidian/core.py, line 33:

<comment>`grab` still raises `ValueError` in strict mode (invalid path syntax or traversing None), but the updated Raises section now claims only KeyError/IndexError/TypeError are possible, which misleads callers about the function’s behavior.</comment>

<file context>
@@ -28,12 +28,14 @@ def grab(
-        ValueError: In strict mode (via mapping_context), if path not found
+        KeyError: In strict mode, if a dict key is not found
+        IndexError: In strict mode, if a list index is out of range
+        TypeError: In strict mode, if a type mismatch occurs during traversal
 
     Note:
</file context>

@ericpan64 ericpan64 merged commit 4309498 into main Nov 26, 2025
2 checks passed
