@weiyuanyue weiyuanyue commented Dec 31, 2025

This PR implements an on-demand loading mechanism for CUDA DLLs. It reduces the x64 MSIX package size by 18 MB (from 160 MB to 142 MB, an 11% reduction) while keeping full functionality for NVIDIA GPU users through automatic background downloads.

Motivation

The current x64 MSIX package includes onnxruntime-genai-cuda.dll (~35 MB) for all users, regardless of their hardware configuration. This significantly increases download size and installation time for users without NVIDIA GPUs, who cannot utilize CUDA acceleration.

Key Statistics:

  • Only a minority of users have CUDA-enabled NVIDIA hardware.
  • The CUDA DLL accounts for ~15% of the current package size.
  • DirectML provides an acceptable performance fallback for non-NVIDIA users.

Dependency Analysis

onnxruntime-genai-cuda.dll is a transitive native dependency introduced by the Microsoft.ML.OnnxRuntimeGenAI.WinML v0.10.1 NuGet package, which provides Windows Machine Learning execution provider support for ONNX Runtime GenAI.

CUDA acceleration benefits 15+ core AI scenarios in the application:

A. Multimodal Models (1 sample)

  • DescribeImage.xaml.cs
    • Direct usage: Microsoft.ML.OnnxRuntimeGenAI with Generator and MultiModalProcessor
    • Models: Phi-3.5 Vision and other vision-language models

B. Language Models (14 samples)

All samples using ModelType.LanguageModels leverage CUDA via OnnxRuntimeGenAIChatClientFactory.

Solution Design

Architecture

Application Startup Flow

flowchart TD
    A[Application Launch] --> B[MainWindow.Activate]
    B --> C[Delay 1 second]
    C --> D{NVIDIA GPU<br/>Present?}
    D -->|No| E[Skip - No Impact]
    D -->|Yes| F{CUDA DLL<br/>Available?}
    F -->|Yes| G[Use Existing DLL]
    F -->|No| H[Download from NuGet<br/>Silent Background]
    H --> I[CUDA Ready for Use]
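
The decision points in the startup flow above can be sketched as a small pure function. This is only an illustration: `CudaStartupAction` and `CudaStartupPolicy` are hypothetical names, not types taken from the PR, and the two boolean inputs stand in for whatever GPU detection and file checks the actual code performs.

```csharp
// Hypothetical types for illustration; the PR does not show its actual implementation.
public enum CudaStartupAction
{
    Skip,                 // No NVIDIA GPU: nothing to do, DirectML path is unaffected
    UseExisting,          // NVIDIA GPU present and the CUDA DLL is already on disk
    DownloadInBackground  // NVIDIA GPU present but the CUDA DLL is missing
}

public static class CudaStartupPolicy
{
    // Mirrors the two diamonds in the flowchart: GPU present? then DLL available?
    public static CudaStartupAction Decide(bool hasNvidiaGpu, bool cudaDllAvailable)
    {
        if (!hasNvidiaGpu)
        {
            return CudaStartupAction.Skip;
        }

        return cudaDllAvailable
            ? CudaStartupAction.UseExisting
            : CudaStartupAction.DownloadInBackground;
    }
}
```

Keeping the decision separate from the download itself makes each branch easy to unit test without real hardware or network access.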

Settings Page UI Components

flowchart LR
    A[Settings Page] --> B[GPU Detection]
    A --> C[Status Display]
    B --> F{NVIDIA GPU?}
    F -->|Yes| G[Show Controls]
    F -->|No| H[Show Info Message]
    G --> D[Manual Download Button]
    G --> E[Progress Indicator]

Implementation Components

1. CudaDllManager (AIDevGallery/Utils/CudaDllManager.cs)

  • GPU Detection: Uses existing DeviceUtils.GetEpDevices() to identify NVIDIA GPUs
  • Download Strategy: Pulls official Microsoft.ML.OnnxRuntimeGenAI.Cuda v0.10.1 from NuGet
  • Thread Safety: SemaphoreSlim ensures single concurrent download
  • Error Handling: All exceptions caught; graceful degradation to DirectML
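
The thread-safety and error-handling bullets above could look roughly like the following. This is a minimal sketch under stated assumptions: `SingleFlightDownloader` is a hypothetical name, and the actual download step is injected as a delegate because the PR text does not show how `CudaDllManager` performs it. Only `SemaphoreSlim` and the catch-all fallback come from the description.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; not the PR's CudaDllManager.
public sealed class SingleFlightDownloader
{
    // One permit: at most one caller runs the download at a time.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Func<CancellationToken, Task> _download; // injected download step (assumption)
    private bool _ready; // only read or written while holding _gate

    public SingleFlightDownloader(Func<CancellationToken, Task> download)
    {
        _download = download ?? throw new ArgumentNullException(nameof(download));
    }

    // Returns true when the DLL is ready; false tells the caller to stay on DirectML.
    public async Task<bool> EnsureAsync(CancellationToken ct = default)
    {
        try
        {
            await _gate.WaitAsync(ct).ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            return false; // cancelled while waiting for the gate
        }

        try
        {
            if (_ready)
            {
                return true; // an earlier caller already finished the download
            }

            await _download(ct).ConfigureAwait(false);
            _ready = true;
            return true;
        }
        catch (Exception)
        {
            // Any failure (network, disk, cancellation) degrades gracefully.
            return false;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Callers that arrive while a download is in flight wait on the semaphore and then observe `_ready`, so the download delegate runs once unless it fails.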

2. MSBuild Exclusion (AIDevGallery/ExcludeExtraLibs.props)

3-stage exclusion ensures CUDA DLL removal:

  1. ResolvePackageAssets: Remove from NuGet package resolution
  2. CopyFilesToOutputDirectory: Delete from build output
  3. _GenerateAppxPackage: Exclude from MSIX payload

Platform-Specific: Only applies to win-x64; ARM64 unaffected

3. User Experience

  • Automatic Background Download: Starts 1 second post-launch (non-blocking)
  • Settings UI: "GPU Acceleration" card shows status and manual controls
  • Progress Feedback: Real-time download progress with cancellation support
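
Progress reporting with cancellation support, as listed above, is commonly built on `IProgress<T>` and `CancellationToken`. The sketch below assumes the download is exposed as a `Stream` with a known total length; `ProgressCopy` is a hypothetical helper and not code from the PR.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class ProgressCopy
{
    // Copies source to destination in chunks, reporting a fraction in [0, 1]
    // after each chunk and honoring cancellation between reads and writes.
    public static async Task CopyAsync(
        Stream source,
        Stream destination,
        long totalBytes,
        IProgress<double> progress,
        CancellationToken ct)
    {
        var buffer = new byte[81920];
        long copied = 0;
        int read;

        while ((read = await source.ReadAsync(buffer, 0, buffer.Length, ct).ConfigureAwait(false)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read, ct).ConfigureAwait(false);
            copied += read;

            // Skip reporting when the total size is unknown (for example, no Content-Length).
            if (totalBytes > 0)
            {
                progress?.Report((double)copied / totalBytes);
            }
        }
    }
}
```

A Settings page could pass a `Progress<double>` that updates a progress bar and wire a Cancel button to `CancellationTokenSource.Cancel()`; cancelling surfaces as an `OperationCanceledException` that the caller can treat as a non-fatal outcome.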

Testing

  • Non-NVIDIA Users: No impact on DirectML execution path
  • NVIDIA Users: Automatic CUDA availability after first download
  • Offline Scenarios: Graceful degradation; no blocking behaviors
  • Existing OnnxRuntime: All required DLLs preserved (onnxruntime-genai.dll, etc.)

Benefits

  • 11% smaller downloads: Faster installation, reduced bandwidth usage
  • Faster startup: Non-NVIDIA users skip unnecessary DLL loading
  • Transparent acceleration: NVIDIA users get CUDA automatically
  • User control: Manual download option in Settings

Risk Assessment

Low-Risk Characteristics

  • Non-invasive: Zero deletions of existing code
  • Optional feature: Download failures don't affect app functionality
  • Reversible: Can be safely reverted if issues arise
  • Platform-isolated: Only affects x64 Windows builds

Package Size Reduction: 160 MB → 142 MB (-18 MB / -11%)
Breaking Changes: None
Dependencies Added: None

Milly Wei (from Dev Box) added 3 commits December 31, 2025 12:22