[Chore]Optimize MSIX Package Size: Implement On-Demand CUDA DLL Loading #543

weiyuanyue · 2025-12-31T05:55:35Z

This PR implements an on-demand loading mechanism for CUDA DLLs, reducing the x64 MSIX package size by 18 MB (from 160 MB to 142 MB, 11% reduction) while maintaining full functionality for NVIDIA GPU users through automatic background downloads.

Motivation

The current x64 MSIX package includes onnxruntime-genai-cuda.dll (~35 MB) for all users, regardless of their hardware configuration. This significantly increases download size and installation time for users without NVIDIA GPUs, who cannot utilize CUDA acceleration.

Key Statistics:

- Only a minority of users have access to CUDA-enabled NVIDIA hardware.
- The CUDA DLL accounts for ~15% of the current package size.
- DirectML provides an acceptable performance fallback for non-NVIDIA users.

Dependency Analysis

onnxruntime-genai-cuda.dll is a transitive native dependency introduced by the Microsoft.ML.OnnxRuntimeGenAI.WinML v0.10.1 NuGet package, which provides Windows Machine Learning execution provider support for ONNX Runtime GenAI.

CUDA acceleration benefits 15+ core AI scenarios in the application:

A. Multimodal Models (1 sample)

DescribeImage.xaml.cs
- Direct usage: Microsoft.ML.OnnxRuntimeGenAI with Generator and MultiModalProcessor
- Models: Phi-3.5 Vision and other vision-language models

B. Language Models (14 samples)

All samples using ModelType.LanguageModels leverage CUDA via OnnxRuntimeGenAIChatClientFactory.

Solution Design

Architecture

Application Startup Flow

```mermaid
flowchart TD
    A[Application Launch] --> B[MainWindow.Activate]
    B --> C[Delay 1 second]
    C --> D{NVIDIA GPU<br/>Present?}
    D -->|No| E[Skip - No Impact]
    D -->|Yes| F{CUDA DLL<br/>Available?}
    F -->|Yes| G[Use Existing DLL]
    F -->|No| H[Download from NuGet<br/>Silent Background]
    H --> I[CUDA Ready for Use]
```

Settings Page UI Components

```mermaid
flowchart LR
    A[Settings Page] --> B[GPU Detection]
    A --> C[Status Display]
    B --> F{NVIDIA GPU?}
    F -->|Yes| G[Show Controls]
    F -->|No| H[Show Info Message]
    G --> D[Manual Download Button]
    G --> E[Progress Indicator]
```
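The startup flow above reduces to a small decision step followed by an optional background download. The following is a minimal sketch of that logic, not the code in this PR: `CudaStartupAction`, `CudaStartupFlow`, and the injected delegates are hypothetical names introduced for illustration, and the real GPU probe (`DeviceUtils.GetEpDevices()`) and download are stood in for by `Func` parameters.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical names throughout; this is an illustration, not the PR's CudaDllManager.
public enum CudaStartupAction
{
    Skip,                 // no NVIDIA GPU: nothing to do
    UseExisting,          // NVIDIA GPU present and the DLL is already on disk
    DownloadInBackground  // NVIDIA GPU present but the DLL is missing
}

public static class CudaStartupFlow
{
    // Pure decision step mirroring the two diamonds in the startup flowchart.
    public static CudaStartupAction Decide(bool hasNvidiaGpu, bool cudaDllAvailable)
    {
        if (!hasNvidiaGpu)
        {
            return CudaStartupAction.Skip;
        }

        return cudaDllAvailable
            ? CudaStartupAction.UseExisting
            : CudaStartupAction.DownloadInBackground;
    }

    // Waits for the given delay, then acts on the decision.
    // The probes and the download are injected delegates (assumptions).
    public static async Task<CudaStartupAction> RunAsync(
        Func<bool> hasNvidiaGpu,
        Func<bool> cudaDllAvailable,
        Func<Task> downloadAsync,
        TimeSpan delay)
    {
        await Task.Delay(delay);

        bool gpu = hasNvidiaGpu();
        bool dll = gpu && cudaDllAvailable(); // only probe the disk when a GPU exists
        CudaStartupAction action = Decide(gpu, dll);

        if (action == CudaStartupAction.DownloadInBackground)
        {
            await downloadAsync();
        }

        return action;
    }
}
```

Keeping `Decide` free of I/O makes the branch table easy to unit test; a caller would typically start `RunAsync` without awaiting it on the UI thread so that launch is not blocked.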

Implementation Components

1. CudaDllManager (AIDevGallery/Utils/CudaDllManager.cs)

- GPU Detection: Uses existing DeviceUtils.GetEpDevices() to identify NVIDIA GPUs
- Download Strategy: Pulls official Microsoft.ML.OnnxRuntimeGenAI.Cuda v0.10.1 from NuGet
- Thread Safety: SemaphoreSlim ensures single concurrent download
- Error Handling: All exceptions caught; graceful degradation to DirectML
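The thread-safety and error-handling points can be illustrated with a short sketch. It assumes the download is exposed as an injected delegate; `SingleDownloadGate` and `EnsureDownloadedAsync` are hypothetical names, not members of the PR's `CudaDllManager`. Only `SemaphoreSlim` and the catch-all fallback are taken from the description above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: serializes download attempts and never throws to the caller.
public sealed class SingleDownloadGate
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Func<CancellationToken, Task> _download;
    private bool _completed; // only read or written while holding _gate

    public SingleDownloadGate(Func<CancellationToken, Task> download)
    {
        _download = download;
    }

    // Returns true when the DLL is ready; false means "fall back to DirectML".
    public async Task<bool> EnsureDownloadedAsync(CancellationToken cancellationToken = default)
    {
        try
        {
            await _gate.WaitAsync(cancellationToken);
        }
        catch (OperationCanceledException)
        {
            return false; // cancelled while waiting; the gate was never acquired
        }

        try
        {
            if (_completed)
            {
                return true; // an earlier caller already finished the download
            }

            await _download(cancellationToken);
            _completed = true;
            return true;
        }
        catch (Exception)
        {
            // Network errors, cancellation, I/O failures: degrade instead of crashing.
            return false;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

With this shape, the automatic startup path and the manual Settings button can both call `EnsureDownloadedAsync`; whichever arrives second waits on the semaphore and then returns immediately if the first attempt succeeded.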

2. MSBuild Exclusion (AIDevGallery/ExcludeExtraLibs.props)

A 3-stage exclusion ensures the CUDA DLL is removed:

- ResolvePackageAssets: Remove from NuGet package resolution
- CopyFilesToOutputDirectory: Delete from build output
- _GenerateAppxPackage: Exclude from MSIX payload

Platform-Specific: Only applies to win-x64; ARM64 unaffected

3. User Experience

- Automatic Background Download: Starts 1 second post-launch (non-blocking)
- Settings UI: "GPU Acceleration" card shows status and manual controls
- Progress Feedback: Real-time download progress with cancellation support
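Progress reporting with cancellation is commonly built on `IProgress<T>` and `CancellationToken`. The sketch below shows one way to do that, assuming the package content is available as a `Stream` with a known length; the PR description does not say how the download is actually performed, and `ProgressCopy.CopyWithProgressAsync` is a hypothetical helper.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: copies a stream while reporting fractional progress (0.0 to 1.0).
public static class ProgressCopy
{
    public static async Task CopyWithProgressAsync(
        Stream source,
        Stream destination,
        long totalBytes,
        IProgress<double> progress,
        CancellationToken cancellationToken)
    {
        var buffer = new byte[81920];
        long copied = 0;

        while (true)
        {
            // Stop promptly when the user cancels from the Settings page.
            cancellationToken.ThrowIfCancellationRequested();

            int read = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken);
            if (read == 0)
            {
                break; // end of stream
            }

            await destination.WriteAsync(buffer, 0, read, cancellationToken);
            copied += read;

            if (totalBytes > 0)
            {
                progress?.Report((double)copied / totalBytes);
            }
        }
    }
}
```

In a UI app, passing a `Progress<double>` created on the UI thread delivers the callbacks on that thread's synchronization context, so a progress bar can be updated directly from the handler.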

Testing

- Non-NVIDIA Users: No impact on DirectML execution path
- NVIDIA Users: Automatic CUDA availability after first download
- Offline Scenarios: Graceful degradation; no blocking behaviors
- Existing OnnxRuntime: All required DLLs preserved (onnxruntime-genai.dll, etc.)

Benefits

- 11% smaller downloads: Faster installation, reduced bandwidth usage
- Faster startup: Non-NVIDIA users skip unnecessary DLL loading
- Transparent acceleration: NVIDIA users get CUDA automatically
- User control: Manual download option in Settings

Risk Assessment

Low-Risk Characteristics

- Non-invasive: Zero deletions of existing code
- Optional feature: Download failures don't affect app functionality
- Reversible: Can be safely reverted if issues arise
- Platform-isolated: Only affects x64 Windows builds

Summary

- Package Size Reduction: 160 MB → 142 MB (-18 MB / -11%)
- Breaking Changes: None
- Dependencies Added: None

Milly Wei (from Dev Box) added 3 commits December 31, 2025 12:22

- remove cuda (808fde4)
- add ut (9166021)
- format (7e57f5a)
