@csmangum
Contributor

This commit introduces a new `agents.py` module containing three agent classes: `SimpleAgent`, `MemoryAgent`, and `RandomAgent`, designed for reinforcement learning tasks. The `MemoryAgent` enhances Q-learning with episodic memory using a `MemorySpace`. Additionally, a new `maze.py` module is added, providing a `MazeEnvironment` class that simulates a grid-based maze for agent navigation. The `main_demo.py` file is updated to utilize these new classes, facilitating experiments with memory-enabled and memory-disabled agents. Utility functions for converting NumPy types to Python types are also included in a new `util.py` module for JSON serialization.

@csmangum requested a review from Copilot on May 29, 2025 03:16
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces foundational components for a reinforcement learning setup by adding a utility for JSON-friendly NumPy conversions and a grid-based maze environment.

  • Add `convert_numpy_to_python` in `util.py` to turn NumPy types into native Python types for serialization.
  • Implement `MazeEnvironment` in `maze.py` with reset, observation, and step logic for a grid maze.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| memory/utils/util.py | New helper to convert NumPy scalars and arrays to native Python. |
| maze.py | New MazeEnvironment class with configurable size, obstacles, and reward logic. |

Comments suppressed due to low confidence (1)

memory/utils/util.py:1

  • This utility function introduces new behavior but lacks corresponding unit tests. Consider adding tests for various NumPy types (integer, floating, ndarray, dict, list, tuple) to prevent regressions.
# Helper function to convert NumPy types to native Python types for JSON serialization


def convert_numpy_to_python(obj):
    """Convert NumPy types to standard Python types for JSON serialization."""
    if isinstance(obj, np.integer):

Copilot AI May 29, 2025

The converter currently handles integers and floats but omits NumPy boolean types (e.g., `np.bool_`). Consider adding `elif isinstance(obj, np.bool_): return bool(obj)` to ensure full coverage of common NumPy scalars.
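
A minimal sketch of what the extended converter might look like. Only the function name, its docstring, and the `np.integer` check appear in the diff; the remaining branches are assumptions based on the types this review mentions (floats, arrays, dicts, lists, tuples), and returning tuples as lists is a choice made here for JSON, not something the PR confirms.

```python
import json

import numpy as np


def convert_numpy_to_python(obj):
    """Convert NumPy types to standard Python types for JSON serialization."""
    # np.bool_ is not a subclass of np.integer, so it needs its own branch.
    if isinstance(obj, np.bool_):
        return bool(obj)
    if isinstance(obj, np.integer):
        return int(obj)
    if isinstance(obj, np.floating):
        return float(obj)
    if isinstance(obj, np.ndarray):
        # tolist() also converts the array elements to native Python scalars.
        return obj.tolist()
    if isinstance(obj, dict):
        # Assumption: only values are converted; keys are left untouched.
        return {key: convert_numpy_to_python(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Assumption: tuples come back as lists, which is how JSON stores them anyway.
        return [convert_numpy_to_python(item) for item in obj]
    return obj


# Hypothetical episode record mixing NumPy scalars and an array.
record = {"done": np.bool_(True), "steps": np.int64(3), "path": np.array([1, 2])}
serialized = json.dumps(convert_numpy_to_python(record))
```

Without the `np.bool_` branch, `json.dumps` would reject the `done` value, since the standard encoder does not recognize NumPy booleans.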

Copilot uses AI. Check for mistakes.
reward (float): Reward for the action.
done (bool): Whether the episode has ended.
"""
# Actions: 0=up, 1=right, 2=down, 3=left

Copilot AI May 29, 2025

The `step` method assumes `action` is always 0–3 and will raise an `IndexError` for invalid values. Add explicit input validation (e.g., `if action not in range(4): raise ValueError`) with a clear error message to improve API robustness.
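
One way the suggested check could look, as a standalone sketch. `NUM_ACTIONS` and `check_action` are hypothetical names introduced for illustration; in the PR the check would presumably sit at the top of `MazeEnvironment.step`, whose full body is not shown here.

```python
NUM_ACTIONS = 4  # Actions: 0=up, 1=right, 2=down, 3=left


def check_action(action):
    """Return the action unchanged, or raise ValueError if it is not 0-3."""
    if action not in range(NUM_ACTIONS):
        raise ValueError(
            f"Invalid action {action!r}: expected an integer in 0-3 "
            "(0=up, 1=right, 2=down, 3=left)."
        )
    return action
```

Raising `ValueError` up front means the caller sees the offending value and the accepted range, rather than an index error from deeper inside the method.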

)

# Check if valid move
if (

Copilot AI May 29, 2025

[nitpick] Invalid moves (out of bounds or into obstacles) are silently ignored. If this is intended, clarify this behavior in the docstring (e.g., "Invalid actions result in no position change but still consume a step").
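
To make the documented behavior concrete, here is a small sketch using a hypothetical `TinyMaze` class. It is a stand-in, not the PR's `MazeEnvironment`: the (row, column) direction mapping, the `steps` counter, and the return value are all assumptions chosen only to show the docstring wording next to matching logic.

```python
class TinyMaze:
    """Hypothetical stand-in for MazeEnvironment, used only for illustration."""

    # Assumed mapping as (row, col) deltas: 0=up, 1=right, 2=down, 3=left
    MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}

    def __init__(self, size=3, obstacles=()):
        self.size = size
        self.obstacles = set(obstacles)
        self.position = (0, 0)
        self.steps = 0

    def step(self, action):
        """Move the agent one cell and return its position.

        Invalid moves (out of bounds or into an obstacle) result in no
        position change but still consume a step.
        """
        d_row, d_col = self.MOVES[action]
        row, col = self.position[0] + d_row, self.position[1] + d_col
        in_bounds = 0 <= row < self.size and 0 <= col < self.size
        if in_bounds and (row, col) not in self.obstacles:
            self.position = (row, col)
        self.steps += 1  # counted whether or not the move succeeded
        return self.position
```

With an obstacle at `(0, 1)`, stepping up or right from the start leaves the agent at `(0, 0)` while the step counter still advances.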

@csmangum merged commit 374776c into main on May 29, 2025
0 of 2 checks passed