@keighrim
Member

addresses #243

claude and others added 8 commits November 8, 2025 11:48
- test_issue_243.py: Test script to replicate VRAM duplication issue
- ISSUE_243_ANALYSIS.md: Initial analysis of gunicorn/CUDA issue
- ISSUE_243_REAL_WORLD_ANALYSIS.md: Analysis of whisper-wrapper implementation

These are investigation/documentation artifacts, not SDK changes.

Removed outdated files:
- test_issue_243.py (app-level test, no longer relevant)
- ISSUE_243_ANALYSIS.md (superseded)
- ISSUE_243_REAL_WORLD_ANALYSIS.md (superseded)

New consolidated documentation:
- ISSUE_243_INVESTIGATION.md: Complete investigation with SDK-level solution

Key changes from previous analysis:
- Focus on SDK-level VRAM management (not app-level)
- Runtime VRAM checking via enhanced _profile_cuda_memory decorator
- _get_model_requirements() API for apps to declare memory needs
- Conservative worker count when CUDA detected
- Runtime status via ?includeVRAM=true parameter
- Addresses dynamic VRAM availability (not static calculation)
- Process-safe torch.cuda.empty_cache() usage documented

Updated investigation document with:
- Component 5: Automatic Memory Profiling
  - 80% VRAM requirement for first request (conservative)
  - Historical measurement for subsequent requests
  - Hash-based filenames for race-condition-safe persistence
  - Atomic writes via temp file + rename
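The persistence idea in Component 5 (a hash-based filename plus a temp-file-and-rename write) can be sketched roughly as follows. This is only an illustration of the general technique, not the code from this PR: `profile_path`, `write_profile_atomically`, `read_profile`, the `memory_<hash>.json` naming, and the `peak_bytes` field are all hypothetical names chosen for the example.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def profile_path(cache_dir: Path, app_id: str, params: dict) -> Path:
    """Derive a stable filename from the app id and its parameters (hypothetical scheme)."""
    # sort_keys makes the serialized key independent of dict insertion order
    key = json.dumps({"app": app_id, "params": params}, sort_keys=True)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"memory_{digest}.json"


def write_profile_atomically(path: Path, peak_bytes: int) -> None:
    """Write to a temp file in the target directory, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump({"peak_bytes": peak_bytes}, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the file in one step, so readers never see a partial write.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def read_profile(path: Path) -> Optional[int]:
    """Return the stored measurement, or None if it is missing or unreadable."""
    try:
        return json.loads(path.read_text())["peak_bytes"]
    except (FileNotFoundError, json.JSONDecodeError, KeyError):
        return None
```

With this shape, concurrent workers that measure the same configuration would each write their own temp file and the last rename wins, which is the race-condition property the commit message refers to.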

- Updated request flow to show 3-level priority:
  1. App override (explicit)
  2. Historical measurement
  3. Conservative 80%
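The three-level priority above amounts to a simple fallback chain. A minimal sketch, assuming the 80% figure is taken against total device memory (the commit message does not say whether it is total or currently free VRAM) and using a hypothetical function name and parameters:

```python
from typing import Optional

# From the commit message: 80% for the first request. Applying it to total
# device memory is an assumption made for this sketch.
CONSERVATIVE_PERCENT = 80


def required_vram_bytes(total_vram_bytes: int,
                        app_override: Optional[int] = None,
                        historical_peak: Optional[int] = None) -> int:
    """Pick the VRAM requirement using the 3-level priority (hypothetical helper)."""
    if app_override is not None:
        # 1. An explicit value declared by the app always wins.
        return app_override
    if historical_peak is not None:
        # 2. Otherwise reuse the peak measured on an earlier request.
        return historical_peak
    # 3. With no information at all, reserve a conservative share of the device.
    return total_vram_bytes * CONSERVATIVE_PERCENT // 100
```

Integer arithmetic is used for the fallback so the result stays an exact byte count.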

- Updated implementation checklist with new components
- Revised open questions and conclusion
@clams-bot clams-bot added this to infra Nov 21, 2025
@github-project-automation github-project-automation bot moved this to Todo in infra Nov 21, 2025
@keighrim keighrim force-pushed the claude/investigate-issue-243-011CUvLcJcFferWXmFu4nKu1 branch from e2309be to b7579c2 Compare November 21, 2025 03:35
@keighrim keighrim force-pushed the claude/investigate-issue-243-011CUvLcJcFferWXmFu4nKu1 branch from b7579c2 to 328c4c4 Compare November 21, 2025 11:05
@codecov

codecov bot commented Nov 21, 2025

Codecov Report

❌ Patch coverage is 19.77401% with 142 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@69bdca9). Learn more about missing BASE report.
⚠️ Report is 2 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| clams/app/__init__.py | 18.79% | 108 Missing ⚠️ |
| clams/restify/__init__.py | 6.06% | 31 Missing ⚠️ |
| clams/appmetadata/__init__.py | 72.72% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             develop     #265   +/-   ##
==========================================
  Coverage           ?   59.45%           
==========================================
  Files              ?        6           
  Lines              ?      846           
  Branches           ?        0           
==========================================
  Hits               ?      503           
  Misses             ?      343           
  Partials           ?        0           

| Flag | Coverage Δ |
|---|---|
| unittests | 59.45% <19.77%> (?) |

@keighrim
Member Author

This turned out to be massive over-engineering based on a wrong assumption and outdated information. Closing without merging. I will start a new PR to address the issue more neatly.

@keighrim keighrim closed this Nov 22, 2025
@github-project-automation github-project-automation bot moved this from Todo to Done in infra Nov 22, 2025
@keighrim keighrim deleted the claude/investigate-issue-243-011CUvLcJcFferWXmFu4nKu1 branch November 22, 2025 12:36