Skip to content

Conversation

@dividedmind
Copy link
Contributor

This commit ensures consistent UTF-8 character encoding for AppMap data across file system operations and HTTP responses, preventing corruption on systems where the default charset is not UTF-8 (e.g., Windows-1252).

Key changes:

  • RecordingSession.java: Enforce StandardCharsets.UTF_8 when creating writers for AppMap files, replacing reliance on default FileWriter.
  • Recording.java: Explicitly use StandardCharsets.UTF_8 in readFully to correctly decode AppMap content during retrieval.
  • ServletRequest.java: Set Content-Type: application/json; charset=UTF-8 for remote recording responses and calculate Content-Length based on UTF-8 byte size rather than string length.
  • agent/test/encoding/: Add a comprehensive regression test suite (encoding.bats, UnicodeTest.java, ReadFullyTest.java) to verify encoding handling for both reading and writing operations under non-UTF-8 environment settings.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes UTF-8 encoding issues in AppMap I/O operations to prevent data corruption on systems with non-UTF-8 default character sets (like Windows-1252). The fix ensures consistent UTF-8 encoding when writing AppMap files, reading them back, and serving them over HTTP.

Changes:

  • Enforced explicit UTF-8 charset specification for all file I/O operations (readers and writers)
  • Added charset specification to HTTP Content-Type headers and corrected Content-Length calculation
  • Introduced comprehensive regression tests to verify encoding behavior under non-UTF-8 system defaults

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
agent/src/main/java/com/appland/appmap/record/RecordingSession.java Updated to use OutputStreamWriter with UTF-8 instead of FileWriter for writing AppMap files
agent/src/main/java/com/appland/appmap/record/Recording.java Changed to use InputStreamReader with UTF-8 instead of FileReader for reading AppMap files
agent/src/main/java/com/appland/appmap/process/hooks/remoterecording/ServletRequest.java Added charset=UTF-8 to Content-Type headers and fixed Content-Length calculation to use byte count
agent/test/encoding/encoding.bats BATS test suite verifying UTF-8 encoding under Windows-1252 system defaults
agent/test/encoding/UnicodeTest.java Test class that exercises AppMap recording with Unicode content
agent/test/encoding/ReadFullyTest.java Test class validating Recording.readFully() with UTF-8 content
agent/test/encoding/encoding_test.cp1252 Test data file with Windows-1252 encoded content
agent/test/encoding/appmap.yml AppMap configuration for encoding tests

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

dividedmind and others added 2 commits January 13, 2026 14:17
This commit ensures consistent UTF-8 character encoding for AppMap data
across file system operations and HTTP responses, preventing corruption
on systems where the default charset is not UTF-8 (e.g., Windows-1252).

Key changes:
- `RecordingSession.java`: Enforce `StandardCharsets.UTF_8` when creating
  writers for AppMap files, replacing reliance on default `FileWriter`.
- `Recording.java`: Explicitly use `StandardCharsets.UTF_8` in `readFully`
  to correctly decode AppMap content during retrieval.
- `ServletRequest.java`: Set `Content-Type: application/json; charset=UTF-8`
  for remote recording responses and calculate `Content-Length` based on
  UTF-8 byte size rather than string length.
- `agent/test/encoding/`: Add a comprehensive regression test suite
  (`encoding.bats`, `UnicodeTest.java`, `ReadFullyTest.java`) to verify
  encoding handling for both reading and writing operations under
  non-UTF-8 environment settings.
… leaks

The fastjson JSONWriter.close() method does not close the underlying
Writer passed to its constructor - it only closes the internal
SerializeWriter, which has no effect. This caused FileOutputStream and
OutputStreamWriter instances to leak in RecordingSession.

Changes:
- Make AppMapSerializer implement AutoCloseable and track the underlying
  Writer to close it explicitly
- Add try-with-resources for AppMapSerializer in checkpoint() to ensure
  proper cleanup on exceptions
- Add try-finally in stop() to ensure serializer is closed even if
  write() fails
- Consolidate JSON finalization and closing logic into close() method,
  removing the redundant finish() method

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants