[Chore] Optimize MSIX Package Size: Implement On-Demand CUDA DLL Loading #543
This PR implements an on-demand loading mechanism for CUDA DLLs, reducing the x64 MSIX package size by 18 MB (from 160 MB to 142 MB, 11% reduction) while maintaining full functionality for NVIDIA GPU users through automatic background downloads.
Motivation
The current x64 MSIX package includes onnxruntime-genai-cuda.dll (~35 MB) for all users, regardless of their hardware configuration. This significantly increases download size and installation time for users without NVIDIA GPUs, who cannot utilize CUDA acceleration.
Key Statistics:
Dependency Analysis
onnxruntime-genai-cuda.dll is a transitive native dependency introduced by the Microsoft.ML.OnnxRuntimeGenAI.WinML v0.10.1 NuGet package, which provides Windows Machine Learning execution provider support for ONNX Runtime GenAI.
CUDA acceleration benefits 15+ core AI scenarios in the application:
A. Multimodal Models (1 sample)
Microsoft.ML.OnnxRuntimeGenAI with Generator and MultiModalProcessor
B. Language Models (14 samples)
All samples using ModelType.LanguageModels leverage CUDA via OnnxRuntimeGenAIChatClientFactory:
Solution Design
Architecture
Application Startup Flow
```mermaid
flowchart TD
    A[Application Launch] --> B[MainWindow.Activate]
    B --> C[Delay 1 second]
    C --> D{NVIDIA GPU<br/>Present?}
    D -->|No| E[Skip - No Impact]
    D -->|Yes| F{CUDA DLL<br/>Available?}
    F -->|Yes| G[Use Existing DLL]
    F -->|No| H[Download from NuGet<br/>Silent Background]
    H --> I[CUDA Ready for Use]
```
Settings Page UI Components
```mermaid
flowchart LR
    A[Settings Page] --> B[GPU Detection]
    A --> C[Status Display]
    G --> D[Manual Download Button]
    G --> E[Progress Indicator]
    B --> F{NVIDIA GPU?}
    F -->|Yes| G[Show Controls]
    F -->|No| H[Show Info Message]
```
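The startup flow in the first diagram can be sketched in C# roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual CudaDllManager code: the names `CudaStartupAction`, `CudaStartupSketch`, `Decide`, and `EnsureCudaAsync` are hypothetical, and the GPU probe, DLL check, and NuGet download are passed in as placeholder delegates because their real implementations are not shown in this description. The `SemaphoreSlim` guard mirrors the single-concurrent-download behavior listed under Implementation Components.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical outcome names for the startup flow; not taken from the PR's code.
public enum CudaStartupAction { Skip, UseExisting, Download }

public static class CudaStartupSketch
{
    // Allows one download at a time (assumed to mirror the PR's SemaphoreSlim guard).
    private static readonly SemaphoreSlim DownloadLock = new SemaphoreSlim(1, 1);

    // Pure decision step from the flowchart: GPU check first, then DLL check.
    public static CudaStartupAction Decide(bool hasNvidiaGpu, bool cudaDllAvailable)
    {
        if (!hasNvidiaGpu)
        {
            return CudaStartupAction.Skip;
        }

        return cudaDllAvailable ? CudaStartupAction.UseExisting : CudaStartupAction.Download;
    }

    // Runs the flow. The three delegates are placeholders for the real GPU probe,
    // file check, and NuGet download.
    public static async Task<CudaStartupAction> EnsureCudaAsync(
        Func<bool> hasNvidiaGpu,
        Func<bool> cudaDllAvailable,
        Func<Task> download,
        TimeSpan startupDelay)
    {
        // The flowchart waits 1 second after MainWindow.Activate before checking.
        await Task.Delay(startupDelay);

        var action = Decide(hasNvidiaGpu(), cudaDllAvailable());
        if (action != CudaStartupAction.Download)
        {
            return action;
        }

        await DownloadLock.WaitAsync();
        try
        {
            // Re-check after acquiring the lock: another caller may have
            // completed the download while this one was waiting.
            if (cudaDllAvailable())
            {
                return CudaStartupAction.UseExisting;
            }

            await download();
            return CudaStartupAction.Download;
        }
        finally
        {
            DownloadLock.Release();
        }
    }
}
```

A caller could invoke it as `await CudaStartupSketch.EnsureCudaAsync(() => true, () => false, () => Task.CompletedTask, TimeSpan.FromSeconds(1));`. Keeping `Decide` free of I/O makes the three branches of the diagram easy to unit test without a GPU or network access.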
Implementation Components
1. CudaDllManager (AIDevGallery/Utils/CudaDllManager.cs)
DeviceUtils.GetEpDevices() to identify NVIDIA GPUs
SemaphoreSlim ensures single concurrent download
2. MSBuild Exclusion (AIDevGallery/ExcludeExtraLibs.props)
3-stage exclusion ensures CUDA DLL removal:
Platform-Specific: Only applies to win-x64; ARM64 unaffected
3. User Experience
Testing
onnxruntime-genai.dll, etc.)
Benefits
Risk Assessment
Low-Risk Characteristics
Package Size Reduction: 160 MB → 142 MB (-18 MB / -11%)
Breaking Changes: None
Dependencies Added: None