We currently specify CUDA capabilities per kernel. However, some kernels use different capabilities for different source files. I'm not sure whether we want to support this, since in those cases the kernel could instead be split into two kernels, each with its own entry point.
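
To make the splitting alternative concrete, here is a minimal sketch of how a kernel with per-file capabilities could be broken into several kernels that each have a single capability set. Everything in it is hypothetical: the function name `split_by_capabilities`, the `cuda-capabilities` and `src` keys, the file names, and the capability values are illustrations only, not the actual build configuration format. It also ignores source files that would have to be shared between the resulting kernels.

```python
from collections import defaultdict


def cap_key(cap):
    """Sort key for capability strings such as "8.0" (assumed to be numeric)."""
    return tuple(int(part) for part in cap.split("."))


def split_by_capabilities(name, file_caps):
    """Group source files by capability set and return one kernel per group.

    `file_caps` maps a source file to its list of CUDA capabilities.
    The returned layout is a hypothetical illustration.
    """
    groups = defaultdict(list)
    for src, caps in file_caps.items():
        groups[frozenset(caps)].append(src)

    # Order groups deterministically by their sorted capability lists.
    ordered = sorted(
        groups.items(),
        key=lambda item: sorted(cap_key(c) for c in item[0]),
    )

    # A kernel with uniform capabilities does not need to be split.
    if len(ordered) == 1:
        names = [name]
    else:
        names = [f"{name}_{i}" for i in range(len(ordered))]

    return {
        kernel_name: {
            "cuda-capabilities": sorted(caps, key=cap_key),
            "src": sorted(srcs),
        }
        for kernel_name, (caps, srcs) in zip(names, ordered)
    }


if __name__ == "__main__":
    # Hypothetical kernel where one file needs a narrower capability set.
    kernels = split_by_capabilities(
        "attention",
        {
            "attention.cu": ["8.0", "9.0"],
            "attention_utils.cu": ["8.0", "9.0"],
            "attention_hopper.cu": ["9.0"],
        },
    )
    for kernel_name, config in kernels.items():
        print(kernel_name, config)
```

With this approach each resulting kernel keeps a single capability list, so the per-kernel model stays intact at the cost of an extra entry point per capability set.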