EKS karpenter integration issues #10

@pingxiaoaws

Description

Background
The mock-device-plugin is essential for Karpenter integration with HAMi, as it registers GPU resource labels in node capacity (e.g., nvidia.com/gpucores), allowing Karpenter to properly recognize node initialization status and perform disruption operations.
However, we discovered three critical bugs that prevent the plugin from working correctly:

Problem 1: Incorrect parsing method in GetNodeDevices

Location: nvidia/device.go:101-105

Issue Description: The GetNodeDevices method attempts to parse device information from the node annotation hami.io/node-nvidia-register, but it uses the wrong parsing function:

- The current implementation uses `device.UnMarshalNodeDevices()`, which expects JSON format
- The actual annotation value is a comma-separated string
- As a result, `UnMarshalNodeDevices` fails to parse it

Solution: use `device.DecodeNodeDevices()` instead of `device.UnMarshalNodeDevices()`

Reference Implementation:
The correct usage pattern:

```go
// Use DecodeNodeDevices for the comma-separated string format
devices, err := device.DecodeNodeDevices(annotationValue)
if err != nil {
    // handle error
}
```

Problem 2: Race condition between mock-device-plugin and HAMi device plugin
Issue Description:

  1. One-time initialization: The Initialize() function is only called once at startup
  2. External annotation dependency: AddResource() depends on node annotations like hami.io/node-nvidia-register
  3. No retry mechanism: If annotations don't exist during initialization, there's no subsequent retry attempt

Solution: Implement a synchronization mechanism that handles the startup sequence and resource-registration order between the mock-device-plugin and the HAMi device plugin.

Suggested approaches:
- Add a startup delay or readiness check
- Implement a proper locking mechanism
- Use leader election or coordination to avoid conflicts

Problem 3: Static resource count

`p.Count` is fixed at initialization, so even if the node annotations are updated later, `ListAndWatch` keeps reporting the same number of devices.

```go
// In mock/server.go
func (p *MockPlugin) ListAndWatch(...) error {
    devs := make([]*kubeletdevicepluginv1beta1.Device, p.Count) // p.Count is fixed at startup
    // ...
    s.Send(&kubeletdevicepluginv1beta1.ListAndWatchResponse{Devices: devs})
    for {
        time.Sleep(time.Second * 10)
        s.Send(&kubeletdevicepluginv1beta1.ListAndWatchResponse{Devices: devs}) // always sends the same device list
    }
}
```

Root Cause: The mock device plugin lacks a dynamic update mechanism. While kubelet does use updated allocatable and capacity values, it requires the device plugin to actively send updated device lists through the ListAndWatch stream. The current implementation is missing the capability to:

  1. Dynamically monitor node annotation changes
  2. Recalculate resource counts
  3. Send updated device lists via ListAndWatch

Environment
Karpenter version: 1.8.2
Kubernetes version: 1.34
HAMi version: 2.7.1
mock-device-plugin version: latest from main branch
Impact
These bugs prevent the mock-device-plugin from functioning correctly in production environments with Karpenter, blocking critical node scale-down operations.

Thank you for your attention to these issues!
