feat(cache): implement O(1) eviction policies and O(k) TTL cleanup #781

asaadbalum · 2025-12-07T13:29:08Z

Summary

Implements O(1) eviction policies and O(k) TTL cleanup for the in-memory cache, addressing Issues 3 and 4 from #162.

Issues Addressed in This PR

Issue 3: Expensive TTL Cleanup ✅

Problem: cleanupExpiredEntries() examined ALL cache entries - O(n) complexity.

Solution: Implemented ExpirationHeap (min-heap ordered by expiration time).

Now O(k) heap pops, where k = number of expired entries, not total entries (each pop costs O(log n) in a binary heap)
Only expired entries are processed; valid entries are untouched

Issue 4: Expensive Eviction ✅

Problem: SelectVictim() scanned ALL entries to find eviction candidate - O(n) complexity.

Solution: Implemented O(1) eviction policies using proper data structures:

FIFO: Doubly-linked list queue
LRU: Doubly-linked list + hashmap (the classic exact-LRU design; Redis, by contrast, approximates LRU by sampling keys)
LFU: Frequency buckets with doubly-linked lists
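
As a rough sketch of the list-plus-hashmap idea behind the LRU policy, the following uses Go's standard `container/list`. It is not the code from `eviction_policy.go`; the type `lruPolicy` and the methods `touch`, `remove`, and `evict` are hypothetical stand-ins for the PR's `UpdateIndex`/`OnAccess`, `OnRemove`, and `Evict`.

```go
package main

import (
	"container/list"
	"fmt"
)

// lruPolicy tracks recency only (hypothetical type, not the PR's).
type lruPolicy struct {
	order *list.List               // front = most recently used, back = least
	index map[string]*list.Element // key -> its node in order
}

func newLRUPolicy() *lruPolicy {
	return &lruPolicy{order: list.New(), index: make(map[string]*list.Element)}
}

// touch records an insert or an access in O(1).
func (p *lruPolicy) touch(key string) {
	if el, ok := p.index[key]; ok {
		p.order.MoveToFront(el)
		return
	}
	p.index[key] = p.order.PushFront(key)
}

// remove forgets a key in O(1), e.g. when the cache deletes it explicitly.
func (p *lruPolicy) remove(key string) {
	if el, ok := p.index[key]; ok {
		p.order.Remove(el)
		delete(p.index, key)
	}
}

// evict removes and returns the least recently used key in O(1).
// The boolean is false when the policy tracks no keys.
func (p *lruPolicy) evict() (string, bool) {
	el := p.order.Back()
	if el == nil {
		return "", false
	}
	key := el.Value.(string)
	p.order.Remove(el)
	delete(p.index, key)
	return key, true
}

func main() {
	p := newLRUPolicy()
	p.touch("a")
	p.touch("b")
	p.touch("c")
	p.touch("a") // "a" becomes the most recently used key

	victim, _ := p.evict()
	fmt.Println(victim) // prints: b
}
```

A FIFO policy can reuse the same structure by skipping the move-to-front on access, so insertion order alone decides the victim.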

Performance Improvement

Policy	Before (O(n))	After (O(1))	Speedup
FIFO SelectVictim (10k entries)	~25,000 ns/op	~4.3 ns/op	~5,800x faster
LRU SelectVictim (10k entries)	~25,000 ns/op	~4.3 ns/op	~5,800x faster
LFU SelectVictim (10k entries)	~8,000 ns/op	~6.2 ns/op	~1,300x faster

Tests Added

TestFIFOPolicyOptimized - Tests FIFO O(1) operations (Evict, UpdateIndex, OnRemove)
TestLRUPolicyOptimized - Tests LRU O(1) operations (Evict, UpdateIndex, OnRemove, OnAccess)
TestLFUPolicyOptimized - Tests LFU O(1) operations (Evict, UpdateIndex, OnRemove, OnAccess)
TestExpirationHeapOperations - Tests all heap operations (Size, PeekNext, UpdateExpiration, PopExpired)
TestInMemoryCacheEviction - Tests eviction integration with InMemoryCache

Issues Already Addressed (Not in This PR)

Issue 1: Unnecessary Sorting in FindSimilar ✅

Problem: sort.Slice(results, ...) added O(n log n) overhead to find best match.

Already Fixed in PR #347: The sorting was removed. Now the best match is tracked during iteration - O(n) single pass, no sorting needed.
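
The single-pass approach can be sketched as below. This is an illustration only, not the `FindSimilar` code from PR #347; the `entry` type, the `findBest` function, and the `threshold` parameter are assumptions made for the example.

```go
package main

import "fmt"

// entry is a hypothetical cache entry holding a query and its embedding.
type entry struct {
	query     string
	embedding []float32
}

// dot returns the dot product over the shared prefix of a and b.
func dot(a, b []float32) float32 {
	var sum float32
	for i := 0; i < len(a) && i < len(b); i++ {
		sum += a[i] * b[i]
	}
	return sum
}

// findBest scans the entries once and keeps only the best match seen so far,
// so no result slice is built and nothing is sorted.
// It returns -1 when no entry reaches the threshold.
func findBest(entries []entry, q []float32, threshold float32) (int, float32) {
	best := -1
	var bestScore float32
	for i, e := range entries {
		score := dot(e.embedding, q)
		if score >= threshold && (best == -1 || score > bestScore) {
			best, bestScore = i, score
		}
	}
	return best, bestScore
}

func main() {
	entries := []entry{
		{"first", []float32{1, 0}},
		{"second", []float32{0, 1}},
		{"third", []float32{1, 1}},
	}
	idx, _ := findBest(entries, []float32{0, 2}, 1)
	fmt.Println(entries[idx].query)
}
```

Tracking the maximum during iteration keeps the work at one similarity computation per entry, which is the O(n) bound stated above.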

Issue 2: Linear Search for Similarity ✅

Problem: Computing dot product with every cache entry - O(n) per query.

Already Fixed in PR #504: HNSW (Hierarchical Navigable Small World) index was added for approximate nearest neighbor search - O(log n) average case.

FIX #162

netlify · 2025-12-07T13:29:13Z

✅ Deploy Preview for vllm-semantic-router ready!

Name	Link
🔨 Latest commit	`b7c8d41`
🔍 Latest deploy log	https://app.netlify.com/projects/vllm-semantic-router/deploys/6936e3a79b1029000732fa20
😎 Deploy Preview	https://deploy-preview-781--vllm-semantic-router.netlify.app
To edit notification comments on pull requests, go to your Netlify project configuration.

github-actions · 2025-12-07T13:29:21Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `src`

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

src/semantic-router/pkg/cache/cache_test.go
src/semantic-router/pkg/cache/eviction_policy.go
src/semantic-router/pkg/cache/inmemory_cache.go

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Addresses Issue vllm-project#162:

Implementation:
- FIFO: Doubly-linked list queue for O(1) oldest entry selection
- LRU: Doubly-linked list + hashmap for O(1) least-recently-used tracking
- LFU: Frequency buckets with doubly-linked lists for O(1) least-frequently-used tracking
- ExpirationHeap: Min-heap for O(k) TTL cleanup (k = expired entries)

Performance (10k entries):
- FIFO SelectVictim: ~4.3 ns/op
- LRU SelectVictim: ~4.3 ns/op
- LFU SelectVictim: ~6.2 ns/op

Note: Issues 1 and 2 from vllm-project#162 were already addressed in PR vllm-project#347 and PR vllm-project#504.

Signed-off-by: Asaad Balum <abalum@abalum-thinkpadp16vgen1.raanaii.csb>

Xunzhuo · 2025-12-08T03:54:50Z

can you fix the CI?

Signed-off-by: Asaad Balum <abalum@abalum-thinkpadp16vgen1.raanaii.csb>

asaadbalum · 2025-12-08T09:41:10Z

can you fix the CI?

@Xunzhuo
CI is fixed, the PR is rebased onto main, and all checks pass.
Ready for re-review.

asaadbalum requested review from Xunzhuo, rootfs and wangchen615 as code owners December 7, 2025 13:29

github-actions bot assigned rootfs, wangchen615 and Xunzhuo Dec 7, 2025

asaadbalum force-pushed the feature/cache-efficiency-improvements-162 branch from 09d63b9 to 1a0a59d Compare December 7, 2025 14:26

Merge branch 'main' into feature/cache-efficiency-improvements-162

24e22bd

Asaad Balum and others added 2 commits December 8, 2025 09:02

fix: remove unused idx variable in test to pass ineffassign lint

e4efc78

Signed-off-by: Asaad Balum <abalum@abalum-thinkpadp16vgen1.raanaii.csb>

Merge branch 'main' into feature/cache-efficiency-improvements-162

98495a9

Merge branch 'main' into feature/cache-efficiency-improvements-162

b7c8d41

rootfs merged commit 2b5c6bc into vllm-project:main Dec 8, 2025
34 checks passed
