@asaadbalum (Contributor) commented Dec 7, 2025

Summary

Implements O(1) eviction policies and O(k) TTL cleanup for the in-memory cache, addressing Issues 3 and 4 from #162.

Issues Addressed in This PR

Issue 3: Expensive TTL Cleanup ✅

Problem: cleanupExpiredEntries() examined ALL cache entries - O(n) complexity.

Solution: Implemented ExpirationHeap (min-heap ordered by expiration time).

  • Cleanup now examines O(k) entries, where k = number of expired entries, instead of all n (each pop from the heap costs O(log n))
  • Only expired entries are processed; valid entries are left untouched

Issue 4: Expensive Eviction ✅

Problem: SelectVictim() scanned ALL entries to find eviction candidate - O(n) complexity.

Solution: Implemented O(1) eviction policies using proper data structures:

  • FIFO: Doubly-linked list queue
  • LRU: Doubly-linked list + hashmap (the classic exact-LRU design)
  • LFU: Frequency buckets with doubly-linked lists
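
To make the list-plus-hashmap approach concrete, here is a minimal sketch of LRU tracking using Go's standard `container/list`. The type `lruTracker` and its methods `touch`, `remove`, and `evict` are hypothetical names chosen for illustration; they only loosely mirror the `OnAccess`, `OnRemove`, and `Evict` operations named in this PR and are not the PR's code.

```go
package main

import (
	"container/list"
	"fmt"
)

// lruTracker keeps keys in recency order (hypothetical type).
// The list front is the most recently used key; the back is the eviction victim.
type lruTracker struct {
	order *list.List               // doubly-linked list of keys
	index map[string]*list.Element // key -> its node in the list
}

func newLRUTracker() *lruTracker {
	return &lruTracker{order: list.New(), index: make(map[string]*list.Element)}
}

// touch records an insert or an access in O(1).
func (t *lruTracker) touch(key string) {
	if el, ok := t.index[key]; ok {
		t.order.MoveToFront(el)
		return
	}
	t.index[key] = t.order.PushFront(key)
}

// remove drops a key from tracking in O(1), if it is present.
func (t *lruTracker) remove(key string) {
	if el, ok := t.index[key]; ok {
		t.order.Remove(el)
		delete(t.index, key)
	}
}

// evict removes and returns the least recently used key in O(1).
// The boolean is false when nothing is tracked.
func (t *lruTracker) evict() (string, bool) {
	el := t.order.Back()
	if el == nil {
		return "", false
	}
	key := el.Value.(string)
	t.order.Remove(el)
	delete(t.index, key)
	return key, true
}

func main() {
	lru := newLRUTracker()
	lru.touch("a")
	lru.touch("b")
	lru.touch("c")
	lru.touch("a") // "a" becomes most recent; "b" is now the oldest

	victim, _ := lru.evict()
	fmt.Println(victim) // b
}
```

A FIFO queue can use the same structure without the `MoveToFront` call on access, since insertion order alone decides the victim.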

Performance Improvement

| Policy (SelectVictim, 10k entries) | Before (O(n)) | After (O(1)) | Speedup |
| --- | --- | --- | --- |
| FIFO | ~25,000 ns/op | ~4.3 ns/op | ~5,800x faster |
| LRU | ~25,000 ns/op | ~4.3 ns/op | ~5,800x faster |
| LFU | ~8,000 ns/op | ~6.2 ns/op | ~1,300x faster |

Tests Added

  • TestFIFOPolicyOptimized - Tests FIFO O(1) operations (Evict, UpdateIndex, OnRemove)
  • TestLRUPolicyOptimized - Tests LRU O(1) operations (Evict, UpdateIndex, OnRemove, OnAccess)
  • TestLFUPolicyOptimized - Tests LFU O(1) operations (Evict, UpdateIndex, OnRemove, OnAccess)
  • TestExpirationHeapOperations - Tests all heap operations (Size, PeekNext, UpdateExpiration, PopExpired)
  • TestInMemoryCacheEviction - Tests eviction integration with InMemoryCache

Issues Already Addressed (Not in This PR)

Issue 1: Unnecessary Sorting in FindSimilar ✅

Problem: sort.Slice(results, ...) added O(n log n) overhead to find best match.

Already Fixed in PR #347: The sorting was removed. Now the best match is tracked during iteration - O(n) single pass, no sorting needed.
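
The single-pass idea can be sketched as follows. This is an illustrative example only: `dot` and `bestMatch` are hypothetical helpers, not the `FindSimilar` code from PR #347, and the sketch assumes all vectors have the same length as the query.

```go
package main

import "fmt"

// dot returns the dot product of two vectors.
// Assumption: b has at least as many elements as a.
func dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// bestMatch scans the entries once and keeps a running maximum instead of
// collecting every score and sorting. It returns the index and score of the
// best entry, or -1 and 0 when entries is empty.
func bestMatch(query []float32, entries [][]float32) (int, float32) {
	bestIdx := -1
	var bestScore float32
	for i, e := range entries {
		if score := dot(query, e); bestIdx == -1 || score > bestScore {
			bestIdx, bestScore = i, score
		}
	}
	return bestIdx, bestScore
}

func main() {
	query := []float32{1, 0}
	entries := [][]float32{{0, 1}, {1, 0}, {0.5, 0.5}}
	idx, score := bestMatch(query, entries)
	fmt.Println(idx, score) // 1 1
}
```

Tracking the maximum during iteration needs O(n) comparisons and no extra slice, whereas sorting all results costs O(n log n) just to read the first element.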

Issue 2: Linear Search for Similarity ✅

Problem: Computing dot product with every cache entry - O(n) per query.

Already Fixed in PR #504: HNSW (Hierarchical Navigable Small World) index was added for approximate nearest neighbor search - O(log n) average case.

FIX #162

netlify bot commented Dec 7, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit b7c8d41
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/6936e3a79b1029000732fa20
😎 Deploy Preview https://deploy-preview-781--vllm-semantic-router.netlify.app

github-actions bot commented Dec 7, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/cache/cache_test.go
  • src/semantic-router/pkg/cache/eviction_policy.go
  • src/semantic-router/pkg/cache/inmemory_cache.go

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Addresses Issue vllm-project#162:

Implementation:
- FIFO: Doubly-linked list queue for O(1) oldest entry selection
- LRU: Doubly-linked list + hashmap for O(1) least-recently-used tracking
- LFU: Frequency buckets with doubly-linked lists for O(1) least-frequently-used tracking
- ExpirationHeap: Min-heap for O(k) TTL cleanup (k = expired entries)

Performance (10k entries):
- FIFO SelectVictim: ~4.3 ns/op
- LRU SelectVictim: ~4.3 ns/op
- LFU SelectVictim: ~6.2 ns/op

Note: Issues 1 and 2 from vllm-project#162 were already addressed in PR vllm-project#347 and PR vllm-project#504.

Signed-off-by: Asaad Balum <abalum@abalum-thinkpadp16vgen1.raanaii.csb>

@asaadbalum force-pushed the feature/cache-efficiency-improvements-162 branch from 09d63b9 to 1a0a59d on December 7, 2025 14:26

@Xunzhuo (Member) commented Dec 8, 2025

can you fix the CI?

Asaad Balum and others added 2 commits December 8, 2025 09:02
@asaadbalum (Contributor, Author) replied:

> can you fix the CI?

@Xunzhuo CI is fixed, the PR is rebased onto main, and all checks pass. Ready for re-review.

@rootfs rootfs merged commit 2b5c6bc into vllm-project:main Dec 8, 2025
34 checks passed
Development

Successfully merging this pull request may close these issues.

Improving the computational efficiency of the in-memory cache
