@leoyvens leoyvens commented Dec 8, 2025

This is a proposal to simplify the compaction algorithm into essentially:

  • Iterate over the canonical chain in order.
    • If a file that is not under cooldown is found, start a compaction group.
      • Add files to that group until the target would be exceeded.
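
The loop above can be sketched roughly as follows. This is only an illustration of the three steps, not the code in this PR: `FileMeta`, its fields, and `plan_groups` are hypothetical names. It also assumes that later files join an open group regardless of cooldown, that single-file groups are dropped, and that scanning resumes at the file that would have exceeded the target.

```rust
/// Hypothetical file descriptor; names are illustrative, not taken from the PR.
#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    id: u32,
    size_bytes: u64,
    under_cooldown: bool,
}

/// Walk the canonical chain in order and return groups of file ids to compact.
fn plan_groups(chain: &[FileMeta], target_bytes: u64) -> Vec<Vec<u32>> {
    let mut groups: Vec<Vec<u32>> = Vec::new();
    let mut current: Vec<u32> = Vec::new();
    let mut current_size: u64 = 0;

    for file in chain {
        if current.is_empty() {
            // A group may only start at a file that is not under cooldown.
            if !file.under_cooldown {
                current.push(file.id);
                current_size = file.size_bytes;
            }
            continue;
        }

        if current_size.saturating_add(file.size_bytes) > target_bytes {
            // Adding this file would exceed the target: close the group.
            // Assumption: a group with a single file is not worth compacting.
            if current.len() > 1 {
                groups.push(std::mem::take(&mut current));
            } else {
                current.clear();
            }
            current_size = 0;
            // Assumption: the same file may start the next group.
            if !file.under_cooldown {
                current.push(file.id);
                current_size = file.size_bytes;
            }
        } else {
            current.push(file.id);
            current_size += file.size_bytes;
        }
    }

    // Flush the trailing group, if it has more than one file.
    if current.len() > 1 {
        groups.push(current);
    }
    groups
}
```

With a target of 500 bytes and a chain whose first file is under cooldown, the first file is skipped and the remaining files are grouped greedily in chain order.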

Generation-based compaction for files under cooldown is removed. The eager compaction configuration for recent files becomes purely generation-based, replacing the more complex `eager_compaction_limit`.

The predicate for Live files was returning `*size_exceeded`, which meant
files could only join a group when the combined size ALREADY exceeded the
target. Since two small files never add up to more than the 512 MiB
target, no groups were ever formed in "Strict Eager Compaction" mode
(when eager_compaction_limit >= target_partition_size).

Changed to `!*size_exceeded` to match Cold file behavior: files can join
a group as long as combined size does NOT exceed the target. This allows
accumulating many small files until they reach the target size.
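
To make the effect of the inverted predicate concrete, here is a minimal sketch that reuses only the `*size_exceeded` and `!*size_exceeded` expressions from the message. Everything else (`can_join_buggy`, `can_join_fixed`, `group_len`, and the way `size_exceeded` is computed) is a hypothetical stand-in, not the actual code in this PR.

```rust
const MIB: u64 = 1024 * 1024;

/// Old Live-file predicate: join only if the target is already exceeded.
fn can_join_buggy(size_exceeded: &bool) -> bool {
    *size_exceeded
}

/// New predicate: join as long as the target is NOT exceeded.
fn can_join_fixed(size_exceeded: &bool) -> bool {
    !*size_exceeded
}

/// Count how many leading files end up in one group under a given predicate.
/// The first file always opens the group (an assumption of this sketch).
fn group_len(sizes: &[u64], target: u64, predicate: fn(&bool) -> bool) -> usize {
    let mut total: u64 = 0;
    let mut count: usize = 0;
    for &size in sizes {
        if count == 0 {
            total = size;
            count = 1;
            continue;
        }
        // Would the combined size exceed the target if this file joined?
        let size_exceeded = total + size > target;
        if !predicate(&size_exceeded) {
            break;
        }
        total += size;
        count += 1;
    }
    count
}
```

Under these assumptions, three 64 MiB files with a 512 MiB target never form a group with the old predicate (the group stays at one file), while the new predicate accumulates all three, and stops at eight files when given ten.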
@leoyvens leoyvens force-pushed the leo/simplify-compaction-algo branch from f5ab4e8 to f137468 Compare December 8, 2025 16:04