Open
Labels
bug (Something isn't working)
Description
Required prerequisites
- I have read the documentation https://tilelang.com.
- I have searched the Issue Tracker and confirmed that this hasn't already been reported. (Comment there if it has.)
What version of TileLang are you using?
0.1.7+cuda.git5acaab76
System information
Python: 3.12.12
TileLang: 0.1.7+cuda.git5acaab76
torch: 2.9.0+cu128
Problem description
Running the example_mla_decode_persistent.py script fails with torch.AcceleratorError: CUDA error: unspecified launch failure.
After investigating, I found the root cause: execution_backend defaults to tvm_ffi, which turns on enable_host_codegen. The host codegen path has no logic for launching cooperative kernels (it never calls cudaLaunchCooperativeKernel). Kernels that rely on a cooperative launch are therefore not launched cooperatively and fail at runtime with the unspecified launch failure.
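To make the missing step concrete, here is a minimal sketch of the launch-API decision that a host code generator would need. It is not TileLang's actual code: KernelInfo, uses_grid_sync, and select_launch_api are hypothetical names introduced only for illustration. The only real identifiers are the two CUDA runtime functions, cudaLaunchKernel and cudaLaunchCooperativeKernel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelInfo:
    """Hypothetical per-kernel metadata a host code generator might carry."""
    name: str
    uses_grid_sync: bool  # True if the kernel relies on a grid-wide barrier


def select_launch_api(kernel: KernelInfo) -> str:
    """Return the name of the CUDA runtime launch call the host code should emit.

    Kernels that synchronize across the whole grid must be launched
    cooperatively; all other kernels can use the ordinary launch call.
    """
    if kernel.uses_grid_sync:
        return "cudaLaunchCooperativeKernel"
    return "cudaLaunchKernel"


# Illustrative kernels; the names are placeholders, not real TileLang symbols.
persistent = KernelInfo(name="persistent_kernel", uses_grid_sync=True)
regular = KernelInfo(name="regular_kernel", uses_grid_sync=False)

print(select_launch_api(persistent))
print(select_launch_api(regular))
```

Under this assumption, the failure described above corresponds to a host path that always takes the second branch, regardless of whether the kernel needs a cooperative launch.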
Reproducible example code
The Python snippets:
python examples/deepseek_mla/example_mla_decode_persistent.py
Traceback
Expected behavior
No response
Additional context
No response