[Enhancement] Refactor KernelCache to use inheritance-based design #1483

sgjzfzzf · 2025-12-20T09:49:14Z

This PR resolves #1469. It moves most of the if-else checkers for multi-backends in tilelang/cache/kernel_cache.py, and dispatches the different implementations by inheritance in different subclasses.

We retained some specialized implementations for CuteDSL and considered removing them in the future once this backend is ready.

Summary by CodeRabbit

New Features
- Added CuTeDSL backend plus backend-specific cache implementations for TVM FFI, Cython, NVRTC, Torch and CuTeDSL.
- Centralized dispatch routing for selecting the appropriate backend at runtime.
Changes
- execution_backend options updated (removed "auto"; "tvm_ffi" is the default; "cutedsl" added).
- Improved on-disk cache handling, path/IO consistency, and adapter-specific save/load flows.
Tests / CI
- Added CUDA JIT test runs including a cache-disabled pass to catch cache-related regressions.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-20T09:49:23Z

📝 Walkthrough

Walkthrough

Refactors kernel caching from a singleton to a dispatch map selecting backend-specific KernelCache subclasses; splits disk I/O into helper methods on KernelCache; adds backend adapters (TVM FFI, NVRTC, CuTeDSL, Cython, Torch) implementing backend-specific save/load logic; updates cached(...) routing and CI to run cache/no-cache CUDA JIT tests.

Changes

Cohort / File(s)	Summary
Cache dispatcher & public API `tilelang/cache/__init__.py`	Replaced singleton `_kernel_cache_instance` with `_dispatch_map`; `cached(...)` now resolves target/execution_backend, applies env/defaults, logs resolution when verbose, and routes to backend-specific cache; updated `execution_backend` literal to include `cutedsl`.
Base KernelCache refactor `tilelang/cache/kernel_cache.py`	Moved module-level path constants into `KernelCache` instance attributes; added helpers `_save_kernel_source_code_to_disk`, `_save_wrapper_kernel_code_to_disk`, `_save_so_cubin_to_disk`, `_get_required_files`, `_load_kernel_source`, `_set_adapter_cache_path`, `_build_kernel`; `_save_kernel_to_disk`/`_load_kernel_from_disk` now delegate to helpers.
TVM FFI adapter `tilelang/jit/adapter/kernel_cache.py`	Added `TVMFFIKernelCache` subclass with `kernel_lib_path = "executable.so"` and implementations for `_save_wrapper_kernel_code_to_disk` and `_save_so_cubin_to_disk`.
NVRTC adapter `tilelang/jit/adapter/nvrtc/kernel_cache.py`	Added `NVRTCKernelCache` subclass (`kernel_lib_path = "kernel.cubin"`, `kernel_py_path = "kernel.py"`) that writes the Python wrapper prior to saving the binary.
CuTeDSL adapter `tilelang/jit/adapter/cutedsl/kernel_cache.py`	Added `CuTeDSLKernelCache` subclass (paths: `host_kernel_path`, `launcher_lib_path`, `launcher_cpp_path`); no-op device source save; saves launcher lib/cpp; custom `_get_required_files`, `_load_kernel_source` (returns empty), `_set_adapter_cache_path`, and `_build_kernel` that can create `JITKernel` from DB.
Cython adapter `tilelang/jit/adapter/cython/kernel_cache.py`	Added `CythonKernelCache` subclass inheriting `KernelCache` (no overrides).
Torch adapter `tilelang/jit/adapter/torch/kernel_cache.py`	Added `TorchKernelCache` subclass inheriting `KernelCache` (no overrides).
CI updates `.github/workflows/ci.yml`	Added CUDA JIT test steps (cache-enabled and cache-disabled passes) and generalized ignore patterns for CUDA JIT tests.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant Dispatcher as CacheDispatcher (_dispatch_map)
  participant Backend as KernelCache (backend-specific)
  participant Adapter
  participant FS as Filesystem

  Caller->>Dispatcher: cached(func, target, execution_backend, ...)
  alt backend found
    Dispatcher->>Backend: select backend instance
    Dispatcher->>Backend: cached(..., set cache path)
    Backend->>Backend: _set_adapter_cache_path(kernel, cache_path)
    Backend->>Adapter: request sources/lib/params
    Backend->>FS: _get_required_files(cache_path)
    opt files missing
      Backend->>FS: _save_kernel_source_code_to_disk(...)
      Backend->>FS: _save_wrapper_kernel_code_to_disk(...)
      Backend->>FS: _save_so_cubin_to_disk(...)
    end
    Backend->>Backend: _build_kernel(...) -> JITKernel | None
    Backend-->>Caller: return JITKernel or None
  else unknown backend
    Dispatcher-->>Caller: raise ValueError
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat(cutedsl): add CuTeDSL backend #1421 — overlapping CuTeDSL backend implementation and related cache/adapter changes.
[Misc] add env for default target/backend/verbose #1512 — touches cached(...) signature and env/default resolution logic that this change modifies.
[Refactor] Refactor env into a more flexible version #740 — related work on cache environment handling and cache enable/disable semantics.

Suggested reviewers

LeiWang1999

Poem

"I hopped through code with whiskers twitching,
dispatch maps laid out, no more singleton hitching.
Each backend a burrow, paths neat and tidy,
kernels tucked safe, build steps not mighty.
— a rabbit, caching carrots 🥕🐇"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 16.28% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title clearly summarizes the main change: refactoring KernelCache to use an inheritance-based design.
Linked Issues check	✅ Passed	The pull request addresses all objectives from issue #1469: eliminates if-else branching via dispatch-based routing in init.py, introduces inheritance-based subclasses for backend-specific behavior, enables adding new backends without modifying core logic, and preserves correct behavior across existing backends.
Out of Scope Changes check	✅ Passed	All changes are scoped to the kernel cache refactoring objectives. Updates to .github/workflows/ci.yml for kernel cache testing are necessary validation of the refactored system and are in scope.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6195770 and 631861c.

📒 Files selected for processing (2)

tilelang/cache/__init__.py
tilelang/cache/kernel_cache.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-11-14T07:56:11.098Z

Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

tilelang/cache/__init__.py

🪛 Ruff (0.14.10)

tilelang/cache/kernel_cache.py

439-439: Unused method argument: kernel

(ARG002)

439-439: Unused method argument: cache_path

(ARG002)

tilelang/cache/__init__.py

86-86: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)

🔇 Additional comments (8)

tilelang/cache/__init__.py (4)

5-18: LGTM: Clean import structure with proper circular import handling.

The use of TYPE_CHECKING for the base KernelCache import avoids circular dependencies while maintaining type hint accuracy. All backend-specific subclasses are imported and will be used in the dispatch map.

20-27: LGTM: Dispatch map correctly implements backend routing.

The _dispatch_map provides clean separation of backend-specific caching logic. Each backend has its own singleton instance, which is appropriate for the cache pattern.

44-72: Well-structured environment variable handling and backend resolution.

The centralized environment variable processing at lines 44-51 correctly establishes a single source of truth. The target normalization and backend resolution flow (lines 53-59) properly handles the "auto" case via resolve_execution_backend. The conditional logging (lines 60-72) provides useful diagnostics without noise when the user explicitly specified a backend.

73-86: Dispatch logic correctly routes to backend-specific cache implementations.

The dispatch check (line 73) and delegation (lines 74-84) properly route kernel caching to the appropriate backend. The error handling (lines 85-86) provides a clear message when an unsupported backend is requested.

tilelang/cache/kernel_cache.py (4)

39-42: LGTM: Instance path attributes enable backend-specific customization.

The introduction of instance-level path attributes (device_kernel_path, host_kernel_path, kernel_lib_path, params_path) provides a clean extension point for backend-specific subclasses to override file naming conventions (e.g., .cubin for NVRTC vs. .so for TVM FFI).

263-310: LGTM: Clean delegation to extensible helper methods.

The refactoring of _save_kernel_to_disk to delegate to _save_kernel_source_code_to_disk, _save_wrapper_kernel_code_to_disk, and _save_so_cubin_to_disk provides clear extension points for backend-specific subclasses while maintaining proper error handling.

312-376: LGTM: Well-structured load flow with clear separation of concerns.

The refactored _load_kernel_from_disk cleanly separates:

Required file validation via _get_required_files() (line 346)

Optional source loading via _load_kernel_source() (line 352)

Parameter deserialization (lines 354-362)

Kernel reconstruction via _build_kernel() (lines 364-376)

This provides clear extension points for subclasses to customize each phase.

439-441: Static analysis false positive: unused arguments are intentional.

The static analysis warning about unused kernel and cache_path parameters is a false positive. This is a base implementation designed to be overridden by subclasses (e.g., NVRTCKernelCache may set adapter state using these parameters). The parameters must remain for interface consistency.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2025-12-20T09:49:24Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

sgjzfzzf · 2025-12-20T09:49:30Z

I tested them on CUDA with tvm-ffi, cython, and ctypes, and it works very well(except for one bug that existed in the previous version, which leads to an exception on ctypes) on the environment I have. They may need more tests on other backends.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tilelang/cache/__init__.py (1)
39-39: Type hint is inconsistent with dispatch pool keys.

The Literal type is missing "cutedsl" which exists in _dispatch_pool. Also, None is allowed by the type hint but would cause a KeyError at runtime since there's no None key in the pool.
Suggested fix
-    execution_backend: Literal["auto", "tvm_ffi", "ctypes", "cython", "nvrtc", "torch"] | None = "auto",
+    execution_backend: Literal["auto", "tvm_ffi", "ctypes", "cython", "nvrtc", "cutedsl", "torch"] = "auto",

🧹 Nitpick comments (6)

tilelang/cache/torch_kernel_cache.py (1)
4-6: Superfluous *args, **kwargs in __new__ signature.

The base class KernelCache.__new__(cls) does not accept additional arguments. Since these cache classes are instantiated without arguments in the dispatch pool, the *args, **kwargs parameters are unnecessary and could be removed for clarity.

This applies to all similar subclasses in this PR.
Suggested simplification
 class TorchKernelCache(KernelCache):
-    def __new__(cls, *args, **kwargs):
-        return super().__new__(cls, *args, **kwargs)
+    def __new__(cls):
+        return super().__new__(cls)
tilelang/cache/__init__.py (1)
47-57: Consider adding a guard for invalid backend keys.

If an unsupported execution_backend value is passed, _dispatch_pool[execution_backend] will raise a raw KeyError. A more descriptive error could improve debuggability.
Optional: Add validation with clearer error
+    if execution_backend not in _dispatch_pool:
+        raise ValueError(
+            f"Unsupported execution_backend '{execution_backend}'. "
+            f"Allowed: {list(_dispatch_pool.keys())}"
+        )
     return _dispatch_pool[execution_backend].cached(
tilelang/cache/tvm_ffi_kernel_cache.py (1)
10-11: Remove redundant __new__ method.

The __new__ method simply delegates to the parent without any custom logic. Since the parent KernelCache already implements singleton behavior, this override is unnecessary.
🔎 Proposed fix
-    def __new__(cls, *args, **kwargs):
-        return super().__new__(cls, *args, **kwargs)
-
tilelang/cache/nvrtc_kernel_cache.py (1)
11-12: Remove redundant __new__ method.

The __new__ method simply delegates to the parent without any custom logic, making it unnecessary.
🔎 Proposed fix
-    def __new__(cls, *args, **kwargs):
-        return super().__new__(cls, *args, **kwargs)
-
tilelang/cache/cutedsl_kernel_cache.py (2)
16-17: Remove redundant __new__ method.

The __new__ method simply delegates to the parent without any custom logic, making it unnecessary.
🔎 Proposed fix
-    def __new__(cls, *args, **kwargs):
-        return super().__new__(cls, *args, **kwargs)
-
42-43: Consider using unpacking for list concatenation.

As suggested by the static analysis tool, unpacking provides cleaner syntax for combining lists.
🔎 Proposed refactor using unpacking
     def _get_required_files(self, cache_path: str) -> list[str]:
-        return super()._get_required_files(cache_path) + [os.path.join(cache_path, self.launcher_lib_path)]
+        return [*super()._get_required_files(cache_path), os.path.join(cache_path, self.launcher_lib_path)]

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 168aec7 and 334a13d.

📒 Files selected for processing (9)

tilelang/cache/__init__.py (2 hunks)
tilelang/cache/auto_kernel_cache.py (1 hunks)
tilelang/cache/ctypes_kernel_cache.py (1 hunks)
tilelang/cache/cutedsl_kernel_cache.py (1 hunks)
tilelang/cache/cython_kernel_cache.py (1 hunks)
tilelang/cache/kernel_cache.py (4 hunks)
tilelang/cache/nvrtc_kernel_cache.py (1 hunks)
tilelang/cache/torch_kernel_cache.py (1 hunks)
tilelang/cache/tvm_ffi_kernel_cache.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (9)

tilelang/cache/cython_kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/cache/torch_kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/cache/tvm_ffi_kernel_cache.py (3)

tilelang/cache/kernel_cache.py (4)

KernelCache (25-449)

_save_wrapper_kernel_code_to_disk (414-418)

_safe_write_file (256-263)

_save_so_cubin_to_disk (420-425)

tilelang/cache/cutedsl_kernel_cache.py (1)

_save_so_cubin_to_disk (24-39)

tilelang/cache/nvrtc_kernel_cache.py (1)

_save_so_cubin_to_disk (23-30)

tilelang/cache/cutedsl_kernel_cache.py (3)

tilelang/cache/kernel_cache.py (6)

KernelCache (25-449)

_save_kernel_source_code_to_disk (407-412)

_save_so_cubin_to_disk (420-425)

_safe_write_file (256-263)

_load_binary (250-253)

_get_required_files (427-430)

tilelang/cache/nvrtc_kernel_cache.py (2)

_save_kernel_source_code_to_disk (14-21)

_save_so_cubin_to_disk (23-30)

tilelang/cache/tvm_ffi_kernel_cache.py (1)

_save_so_cubin_to_disk (19-24)

tilelang/cache/auto_kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/cache/ctypes_kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/cache/kernel_cache.py (6)

tilelang/cache/cutedsl_kernel_cache.py (3)

_save_kernel_source_code_to_disk (20-21)

_save_so_cubin_to_disk (24-39)

_get_required_files (42-43)

tilelang/cache/tvm_ffi_kernel_cache.py (2)

_save_wrapper_kernel_code_to_disk (13-17)

_save_so_cubin_to_disk (19-24)

tilelang/jit/kernel.py (3)

JITKernel (39-794)

kernel_source (644-645)

get_kernel_source (465-476)

tilelang/jit/adapter/cutedsl/adapter.py (1)

get_kernel_source (206-214)

tilelang/jit/adapter/nvrtc/adapter.py (1)

get_kernel_source (179-190)

tilelang/jit/adapter/ctypes/adapter.py (2)

get_kernel_source (296-302)

libpath (282-284)

tilelang/cache/__init__.py (8)

tilelang/cache/auto_kernel_cache.py (1)

AutoKernelCache (4-6)

tilelang/cache/ctypes_kernel_cache.py (1)

CTypesKernelCache (4-6)

tilelang/cache/cutedsl_kernel_cache.py (1)

CuTeDSLKernelCache (10-49)

tilelang/cache/cython_kernel_cache.py (1)

CythonKernelCache (4-6)

tilelang/cache/nvrtc_kernel_cache.py (1)

NVRTCKernelCache (7-30)

tilelang/cache/torch_kernel_cache.py (1)

TorchKernelCache (4-6)

tilelang/cache/tvm_ffi_kernel_cache.py (1)

TVMFFIKernelCache (7-24)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/cache/nvrtc_kernel_cache.py (2)

tilelang/cache/kernel_cache.py (4)

_save_kernel_source_code_to_disk (407-412)

_safe_write_file (256-263)

_load_binary (250-253)

_save_so_cubin_to_disk (420-425)

tilelang/cache/tvm_ffi_kernel_cache.py (1)

_save_so_cubin_to_disk (19-24)

🪛 Ruff (0.14.8)

tilelang/cache/cutedsl_kernel_cache.py

43-43: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)

🔇 Additional comments (7)

tilelang/cache/auto_kernel_cache.py (1)

4-6: LGTM, same note as TorchKernelCache.

The same optional simplification of removing *args, **kwargs applies here.

tilelang/cache/ctypes_kernel_cache.py (1)

4-6: LGTM, same note as TorchKernelCache.

The same optional simplification of removing *args, **kwargs applies here.

tilelang/cache/cython_kernel_cache.py (1)

4-6: LGTM, same note as TorchKernelCache.

The same optional simplification of removing *args, **kwargs applies here.

tilelang/cache/__init__.py (2)

22-30: Dispatch pool design looks good for the multi-backend refactor.

The dictionary-based dispatch cleanly replaces the if-else pattern as intended by issue #1469. Each backend gets its own singleton cache instance.

75-76: Verify this module-level behavior is intentional.

If TILELANG_CLEAR_CACHE env var is set, this will raise RuntimeError during import, preventing any use of the module. This seems intentional as a safeguard, but confirm this is the desired UX.

tilelang/cache/kernel_cache.py (2)

39-45: Good refactoring: centralized path configuration.

Moving from module-level path constants to per-class attributes enables backend-specific path customization without modifying the base class. This aligns well with the PR's goal of eliminating conditional logic.

407-449: Well-structured I/O abstraction.

The new helper methods (_save_kernel_source_code_to_disk, _save_wrapper_kernel_code_to_disk, _save_so_cubin_to_disk, _get_required_files, _load_kernel_source) provide clean extension points for subclasses while centralizing common disk I/O patterns. This significantly improves maintainability.

tilelang/jit/adapter/cutedsl/kernel_cache.py

tilelang/jit/adapter/nvrtc/kernel_cache.py

tilelang/jit/adapter/kernel_cache.py

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tilelang/cache/__init__.py (1)
47-58: Consider defensive error handling for invalid backend keys.

The direct dictionary lookup on line 48 will raise a KeyError if an invalid backend_key is provided. While the Literal type hint provides compile-time protection, runtime validation would offer better error messages when type checking is bypassed or when backend names are dynamically constructed.
🔎 Proposed enhancement
 backend_key: str = execution_backend if execution_backend is not None else "auto"
+if backend_key not in _dispatch_pool:
+    valid_backends = ", ".join(_dispatch_pool.keys())
+    raise ValueError(
+        f"Invalid execution_backend '{backend_key}'. "
+        f"Valid options are: {valid_backends}"
+    )
 return _dispatch_pool[backend_key].cached(

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 334a13d and 676b4d0.

📒 Files selected for processing (2)

examples/gemm/example_gemm.py (1 hunks)
tilelang/cache/__init__.py (2 hunks)

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2025-11-14T07:56:11.098Z

Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

examples/gemm/example_gemm.py

📚 Learning: 2025-12-18T04:50:00.512Z

Learnt from: silentCoder-dev
Repo: tile-ai/tilelang PR: 1464
File: testing/python/language/test_tilelang_language_rand.py:14-14
Timestamp: 2025-12-18T04:50:00.512Z
Learning: In `testing/python/language/test_tilelang_language_rand.py`, the TileLang kernel uses `blk_M = M` (single block) and calls `rng_rand()` four times per element to align results with the Triton implementation, which uses `blk_M = 128` (multiple blocks) and calls the RNG once per element. These differences compensate for internal RNG behavior differences between TileLang and Triton.

Applied to files:

examples/gemm/example_gemm.py

🧬 Code graph analysis (1)

examples/gemm/example_gemm.py (1)

tilelang/jit/kernel.py (1)

out_idx (636-637)

🔇 Additional comments (2)

tilelang/cache/__init__.py (1)

5-20: Clean TYPE_CHECKING usage to avoid circular imports.

The use of TYPE_CHECKING to import KernelCache only for type hints is a good pattern that avoids potential circular import issues at runtime while maintaining type safety.

examples/gemm/example_gemm.py (1)

5-5: Good example demonstrating the new backend selection API.

The addition of execution_backend="ctypes" effectively demonstrates the new multi-backend dispatch mechanism. Based on the PR description, this has been tested successfully on CUDA.

tilelang/cache/__init__.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

tilelang/cache/cutedsl_kernel_cache.py (2)
16-17: Remove unnecessary __new__ override.

The __new__ method simply delegates to the parent without adding any custom logic. Python automatically uses the parent's __new__ if not overridden, making this dead code.
🔎 Proposed fix
-    def __new__(cls, *args, **kwargs):
-        return super().__new__(cls, *args, **kwargs)
-
41-43: Consider using iterable unpacking for list extension.

The list concatenation works correctly, but iterable unpacking is more idiomatic and slightly more efficient.
🔎 Proposed fix
     @override
     def _get_required_files(self, cache_path: str) -> list[str]:
-        return super()._get_required_files(cache_path) + [os.path.join(cache_path, self.launcher_lib_path)]
+        return [*super()._get_required_files(cache_path), os.path.join(cache_path, self.launcher_lib_path)]

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 676b4d0 and e28dba1.

📒 Files selected for processing (1)

tilelang/cache/cutedsl_kernel_cache.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tilelang/cache/cutedsl_kernel_cache.py (1)

tilelang/cache/kernel_cache.py (7)

KernelCache (25-449)

_save_kernel_source_code_to_disk (407-412)

_save_so_cubin_to_disk (420-425)

_safe_write_file (256-263)

_load_binary (250-253)

_get_required_files (427-430)

_load_kernel_source (432-449)

🪛 Ruff (0.14.8)

tilelang/cache/cutedsl_kernel_cache.py

43-43: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)

🔇 Additional comments (3)

tilelang/cache/cutedsl_kernel_cache.py (3)

19-21: Verify no-op implementation is intentional.

The method intentionally skips saving kernel source code, deviating from the base class behavior. Confirm this is the correct design for CuTeDSL—does the backend not generate saveable kernel source, or is source code unnecessary for cache restoration?

23-39: LGTM! Launcher artifact handling is well-implemented.

The method correctly saves CuTeDSL-specific launcher artifacts (library and optional C++ source) using defensive attribute checks and the parent class's safe write utilities.

45-47: Verify empty string return values are correct.

The method returns empty strings rather than None. The base class can return None to indicate loading failure. Since _save_kernel_source_code_to_disk is a no-op and no source is saved, confirm that returning empty strings (rather than None) is the expected behavior for downstream code.

Note: The previous review issue regarding method naming (_load_kernel_from_cache vs _load_kernel_source) has been correctly resolved.

LeiWang1999

Thanks, would it be possible for us to move cache into adapter class itself in https://github.com/tile-ai/tilelang/tree/main/tilelang/jit/adapter ?

examples/gemm/example_gemm.py

sgjzfzzf · 2025-12-20T19:58:02Z

Thanks, would it be possible for us to move cache into adapter class itself in https://github.com/tile-ai/tilelang/tree/main/tilelang/jit/adapter ?

For sure I could do that.

tilelang/cache/auto_kernel_cache.py

tilelang/cache/__init__.py

sgjzfzzf · 2025-12-21T07:26:41Z

Thanks, would it be possible for us to move cache into adapter class itself in https://github.com/tile-ai/tilelang/tree/main/tilelang/jit/adapter ?

@LeiWang1999 Hi Lei! I tried to do that, but a horrible point is that the Adapter construction depends on the KernelCache. It'll lead to a cycle reference if we put KernelCache inside Adapter. Do you want me to refactor the whole cached part on the kernel construction, or just move KernelCache codes into the Adapter directory?

silentCoder-dev · 2025-12-21T07:45:19Z

Thanks, would it be possible for us to move cache into adapter class itself in https://github.com/tile-ai/tilelang/tree/main/tilelang/jit/adapter ?

@LeiWang1999 Hi Lei! I tried to do that, but a horrible point is that the Adapter construction depends on the KernelCache. It'll lead to a cycle reference if we put KernelCache inside Adapter. Do you want me to refactor the whole cached part on the kernel construction, or just move KernelCache codes into the Adapter directory?

It seems sufficient to move this into adapter to better support maintaining multiple backends.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

tilelang/jit/adapter/nvrtc/kernel_cache.py (1)

14-30: Code duplication between methods.

Lines 15-20 and 24-29 contain identical logic for saving the Python wrapper file. This was previously flagged.
tilelang/jit/adapter/kernel_cache.py (1)
19-24: Fix logging to display file path instead of object.

Line 23 logs the executable object instead of kernel_lib_path. The debug message should show where the file is being saved (the path), not the object being saved.
🔎 Proposed fix
     def _save_so_cubin_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False):
         kernel_lib_path = os.path.join(cache_path, self.kernel_lib_path)
         executable = kernel.adapter.executable
         if verbose:
-            self.logger.debug(f"Saving kernel executable to file: {executable}")
+            self.logger.debug(f"Saving kernel executable to file: {kernel_lib_path}")
         KernelCache._safe_write_executable(executable, kernel_lib_path)

🧹 Nitpick comments (3)

tilelang/jit/adapter/cutedsl/kernel_cache.py (2)
41-43: Consider iterable unpacking for clarity.

Per static analysis (RUF005), prefer unpacking over concatenation for combining lists.
🔎 Suggested refactor
     @override
     def _get_required_files(self, cache_path: str) -> list[str]:
-        return super()._get_required_files(cache_path) + [os.path.join(cache_path, self.launcher_lib_path)]
+        return [*super()._get_required_files(cache_path), os.path.join(cache_path, self.launcher_lib_path)]
45-47: Returning empty strings works but is subtly different from None.

The base class's _load_kernel_from_disk condition if ((host_kernel_source and device_kernel_source) or self.execution_backend == "cutedsl") treats empty strings as falsy, so this works due to the cutedsl backend check. However, returning (None, None) would be more explicit and consistent with the base class's error-handling pattern that returns None on failure.
🔎 Suggested alternative
     @override
     def _load_kernel_source(self, device_kernel_path: str, host_kernel_path: str, verbose: bool = False) -> tuple[str | None, str | None]:
-        return "", ""
+        return None, None
tilelang/jit/adapter/kernel_cache.py (1)

10-11: Consider whether new override is necessary.

The __new__ method simply delegates to the superclass without additional logic. While this pattern is consistent across all backend implementations in the codebase (CTypes, Cython, Torch, NVRTC), it appears redundant if no customization is needed.

If this pattern serves a specific purpose (e.g., enforcing singleton behavior or future extensibility), consider adding a comment explaining the rationale. Otherwise, the method could be removed to reduce boilerplate.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e28dba1 and 81dd6eb.

📒 Files selected for processing (7)

tilelang/cache/__init__.py (2 hunks)
tilelang/jit/adapter/ctypes/kernel_cache.py (1 hunks)
tilelang/jit/adapter/cutedsl/kernel_cache.py (1 hunks)
tilelang/jit/adapter/cython/kernel_cache.py (1 hunks)
tilelang/jit/adapter/kernel_cache.py (1 hunks)
tilelang/jit/adapter/nvrtc/kernel_cache.py (1 hunks)
tilelang/jit/adapter/torch/kernel_cache.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (5)

tilelang/jit/adapter/cython/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/jit/adapter/ctypes/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/jit/adapter/cutedsl/kernel_cache.py (3)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/jit/adapter/nvrtc/kernel_cache.py (2)

_save_kernel_source_code_to_disk (14-21)

_save_so_cubin_to_disk (23-30)

tilelang/jit/adapter/kernel_cache.py (1)

_save_so_cubin_to_disk (19-24)

tilelang/jit/adapter/torch/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-449)

tilelang/cache/__init__.py (5)

tilelang/jit/adapter/ctypes/kernel_cache.py (1)

CTypesKernelCache (4-6)

tilelang/jit/adapter/cython/kernel_cache.py (1)

CythonKernelCache (4-6)

tilelang/jit/adapter/nvrtc/kernel_cache.py (1)

NVRTCKernelCache (7-30)

tilelang/jit/adapter/torch/kernel_cache.py (1)

TorchKernelCache (4-6)

tilelang/jit/adapter/kernel_cache.py (1)

TVMFFIKernelCache (7-24)

🪛 Ruff (0.14.8)

tilelang/jit/adapter/cutedsl/kernel_cache.py

43-43: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

tilelang/cache/__init__.py

61-61: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
GitHub Check: Test for Python 3.12 with Metal (on macos-latest)

🔇 Additional comments (13)

tilelang/jit/adapter/cython/kernel_cache.py (1)

4-6: Minimal adapter follows established pattern.

The subclass correctly delegates to the base KernelCache. Note that the singleton implementation in the base class stores the instance on KernelCache._instance directly, meaning all backend-specific subclasses share the same singleton—this appears intentional for a unified cache but is worth being aware of.

tilelang/jit/adapter/torch/kernel_cache.py (1)

4-6: LGTM!

The TorchKernelCache subclass correctly delegates to the base class and follows the consistent adapter pattern established across backends.

tilelang/jit/adapter/nvrtc/kernel_cache.py (1)

7-12: Class attributes and new are correctly defined.

The kernel_lib_path = "kernel.cubin" appropriately overrides the base class default for the NVRTC backend.

tilelang/jit/adapter/cutedsl/kernel_cache.py (3)

10-17: Class attributes and structure are well-defined.

The CuTeDSL-specific paths (launcher_lib.so, launcher.cpp) are appropriately declared as class attributes.

19-21: No-op override is intentional for CuTeDSL.

The CuTeDSL backend doesn't persist kernel source in the traditional manner, relying on launcher artifacts instead.

23-39: Defensive attribute access is appropriate.

Good use of getattr and hasattr to safely check for optional lib_generator and launcher_cpp_code attributes before attempting to save.

tilelang/jit/adapter/ctypes/kernel_cache.py (1)

4-6: LGTM!

The CTypesKernelCache subclass follows the consistent minimal adapter pattern.

tilelang/cache/__init__.py (4)

5-18: LGTM! Clean TYPE_CHECKING usage prevents circular imports.

The TYPE_CHECKING conditional import pattern properly isolates the KernelCache type reference for type checking while avoiding runtime circular dependencies. All backend implementations are imported and ready for the dispatch pool.

20-28: Well-structured dispatch pool with singleton instances.

The dispatch pool architecture cleanly maps backend names to their singleton KernelCache implementations. This eliminates the conditional branching that was present in the previous design.

45-59: Dispatch logic correctly routes to backend implementations.

The normalization of "auto"/None to "tvm_ffi" provides a sensible default, and all parameters are properly forwarded to the selected backend's cached() method.

60-61: Exception handling addresses previous feedback.

The ValueError for unsupported backends implements the behavior requested in past review comments. The inline error message is clear and actionable.

Note: Ruff suggests moving the message to an exception subclass (TRY003), but this inline approach is acceptable for a simple validation error.

The static analysis hint is optional; if you prefer to follow strict style guidelines, consider defining a custom exception class, but the current implementation is functionally correct.

tilelang/jit/adapter/kernel_cache.py (2)

7-8: LGTM! Backend-specific library path is properly defined.

The TVMFFIKernelCache subclass appropriately sets kernel_lib_path = "executable.so" for the TVM FFI backend, following the pattern used by other backend implementations (e.g., NVRTC uses "kernel.cubin").

13-17: LGTM! Wrapper kernel saving logic is correct.

The method properly constructs the host kernel path, logs the correct path variable when verbose, and safely writes the host source using the parent class helper method.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tilelang/cache/__init__.py (1)
37-37: Missing "cutedsl" in execution_backend type hint.

The _dispatch_pool includes "cutedsl" (line 26), but the type hint for execution_backend on line 37 omits it. This creates a type-checking inconsistency where the backend is available at runtime but not reflected in the type signature.
-    execution_backend: Literal["tvm_ffi", "ctypes", "cython", "nvrtc", "torch"] | None = None,
+    execution_backend: Literal["tvm_ffi", "ctypes", "cython", "nvrtc", "torch", "cutedsl"] | None = None,
Note: This issue was already flagged in a previous review comment.

🧹 Nitpick comments (1)

tilelang/cache/__init__.py (1)
58-58: Optional: Consider Ruff hint about exception message format.

Ruff suggests avoiding long messages directly in exception raises (TRY003). For this simple case, the current format is clear and maintainable, so this is purely a linting preference.

If you want to strictly follow the linter suggestion:
+_UNSUPPORTED_BACKEND_MSG = 'Cannot find support for execution backend "{}"'
+
 def cached(
     ...
     else:
-        raise ValueError(f'Cannot find support for execution backend "{execution_backend}"')
+        raise ValueError(_UNSUPPORTED_BACKEND_MSG.format(execution_backend))

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81dd6eb and d256f9b.

📒 Files selected for processing (1)

tilelang/cache/__init__.py (2 hunks)

🧰 Additional context used

🪛 Ruff (0.14.8)

tilelang/cache/__init__.py

58-58: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (2)

tilelang/cache/__init__.py (2)

20-28: Good refactoring: dispatch pool centralizes backend selection.

The dispatch pool cleanly maps backend keys to singleton instances, eliminating the if-else branching mentioned in the PR objectives. This design makes adding new backends straightforward.

10-15: All imported KernelCache implementations verified and exist. The refactored dispatch system has all six required backend-specific classes properly defined in their respective modules. No issues found.

tilelang/cache/__init__.py

tilelang/jit/adapter/nvrtc/kernel_cache.py

tilelang/cache/__init__.py

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

tilelang/cache/__init__.py (1)
37-37: Critical: Default execution_backend=None will always fail.

The parameter defaults to None, but None is not a key in _dispatch_pool (lines 21-28). Every call to cached() without an explicit execution_backend will raise ValueError at line 58, breaking the public API.

Based on past review comments, the old implementation supported "auto" mode for backend detection. This refactor removed that logic but didn't update the default value.
🔎 Recommended fixes

Option 1: Restore auto-detection for None (preserves backward compatibility)
+    # Auto-detect backend if not specified
+    if execution_backend is None:
+        # Determine the best available backend based on target
+        if isinstance(target, str) and target == "auto":
+            execution_backend = "tvm_ffi"  # or implement detection logic
+        else:
+            # Parse target to determine appropriate backend
+            execution_backend = "tvm_ffi"  # placeholder
+    
     if execution_backend in _dispatch_pool:
Option 2: Use a valid default backend
-    execution_backend: Literal["tvm_ffi", "ctypes", "cython", "nvrtc", "torch", "cutedsl"] | None = None,
+    execution_backend: Literal["tvm_ffi", "ctypes", "cython", "nvrtc", "torch", "cutedsl"] = "tvm_ffi",
Option 3: Make execution_backend required
-    execution_backend: Literal["tvm_ffi", "ctypes", "cython", "nvrtc", "torch", "cutedsl"] | None = None,
+    execution_backend: Literal["tvm_ffi", "ctypes", "cython", "nvrtc", "torch", "cutedsl"],

🧹 Nitpick comments (2)

tilelang/cache/__init__.py (2)

20-28: Consider lazy initialization of backend instances.

All backend instances are created at module import time, even if only one backend will be used. For better startup performance, consider lazy initialization:

🔎 Optional refactor for lazy initialization

-_dispatch_pool: dict[str, KernelCache] = {
-    "tvm_ffi": TVMFFIKernelCache(),
-    "ctypes": CTypesKernelCache(),
-    "cython": CythonKernelCache(),
-    "nvrtc": NVRTCKernelCache(),
-    "cutedsl": CuTeDSLKernelCache(),
-    "torch": TorchKernelCache(),
-}
+_dispatch_pool: dict[str, KernelCache] = {}
+
+def _get_backend_cache(backend: str) -> KernelCache:
+    """Get or create a backend cache instance."""
+    if backend not in _dispatch_pool:
+        backend_classes = {
+            "tvm_ffi": TVMFFIKernelCache,
+            "ctypes": CTypesKernelCache,
+            "cython": CythonKernelCache,
+            "nvrtc": NVRTCKernelCache,
+            "cutedsl": CuTeDSLKernelCache,
+            "torch": TorchKernelCache,
+        }
+        if backend in backend_classes:
+            _dispatch_pool[backend] = backend_classes[backend]()
+    return _dispatch_pool.get(backend)

Then update line 46 to use _get_backend_cache(execution_backend).

57-58: Consider extracting error message to improve maintainability.

Static analysis flags this as TRY003 (long message outside exception class). While the current message is clear and acceptable, extracting it can improve consistency if you add more validation errors later.

🔎 Optional refactor

+_UNSUPPORTED_BACKEND_ERROR = (
+    'Cannot find support for execution backend "{}". '
+    f'Supported backends: {", ".join(_dispatch_pool.keys())}'
+)
+
 def cached(
     ...
 ) -> JITKernel:
     ...
     if execution_backend in _dispatch_pool:
         ...
     else:
-        raise ValueError(f'Cannot find support for execution backend "{execution_backend}"')
+        raise ValueError(_UNSUPPORTED_BACKEND_ERROR.format(execution_backend))

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d256f9b and 2e3be15.

📒 Files selected for processing (2)

tilelang/cache/__init__.py
tilelang/jit/adapter/nvrtc/kernel_cache.py

🚧 Files skipped from review as they are similar to previous changes (1)

tilelang/jit/adapter/nvrtc/kernel_cache.py

🧰 Additional context used

🧬 Code graph analysis (1)

tilelang/cache/__init__.py (1)

tilelang/cache/kernel_cache.py (2)

KernelCache (25-449)

cached (113-227)

🪛 Ruff (0.14.8)

tilelang/cache/__init__.py

58-58: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)

🔇 Additional comments (1)

tilelang/cache/__init__.py (1)

5-18: LGTM! Clean import structure with proper TYPE_CHECKING usage.

The TYPE_CHECKING pattern correctly avoids circular import issues while maintaining type safety. The backend-specific imports are well-organized.

SiriusNEO

Overall LGTM, just some comments to go

tilelang/cache/__init__.py

tilelang/cache/kernel_cache.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

tilelang/jit/adapter/ctypes/kernel_cache.py (1)
1-4: Consider adding a docstring to document the placeholder implementation.

The empty class body is valid but would benefit from a docstring explaining that CTypesKernelCache currently uses all default KernelCache behavior and may be extended in the future with ctypes-specific logic.
🔎 Suggested documentation
+"""CTypes backend kernel cache implementation."""
 from tilelang.cache.kernel_cache import KernelCache


-class CTypesKernelCache(KernelCache): ...
+class CTypesKernelCache(KernelCache):
+    """
+    Kernel cache for the ctypes execution backend.
+    
+    Currently uses default KernelCache behavior. Backend-specific
+    customizations can be added here as needed.
+    """
+    ...
tilelang/jit/adapter/cython/kernel_cache.py (1)
1-4: Consider adding a docstring to document the placeholder implementation.

Similar to CTypesKernelCache, this empty class would benefit from documentation explaining it's a placeholder for Cython-specific cache behavior.
🔎 Suggested documentation
+"""Cython backend kernel cache implementation."""
 from tilelang.cache.kernel_cache import KernelCache


-class CythonKernelCache(KernelCache): ...
+class CythonKernelCache(KernelCache):
+    """
+    Kernel cache for the Cython execution backend.
+    
+    Currently uses default KernelCache behavior. Backend-specific
+    customizations can be added here as needed.
+    """
+    ...
tilelang/jit/adapter/cutedsl/kernel_cache.py (1)
41-43: Consider iterable unpacking for concatenation.

The list concatenation works correctly but could use iterable unpacking for slightly better performance.
🔎 Optional optimization
     def _get_required_files(self, cache_path: str) -> list[str]:
-        return super()._get_required_files(cache_path) + [os.path.join(cache_path, self.launcher_lib_path)]
+        return [*super()._get_required_files(cache_path), os.path.join(cache_path, self.launcher_lib_path)]

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e3be15 and 1c0c714.

📒 Files selected for processing (8)

tilelang/cache/__init__.py
tilelang/cache/kernel_cache.py
tilelang/jit/adapter/ctypes/kernel_cache.py
tilelang/jit/adapter/cutedsl/kernel_cache.py
tilelang/jit/adapter/cython/kernel_cache.py
tilelang/jit/adapter/kernel_cache.py
tilelang/jit/adapter/nvrtc/kernel_cache.py
tilelang/jit/adapter/torch/kernel_cache.py

🚧 Files skipped from review as they are similar to previous changes (2)

tilelang/jit/adapter/torch/kernel_cache.py
tilelang/jit/adapter/kernel_cache.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-11-14T07:56:11.098Z

Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

tilelang/cache/__init__.py

🧬 Code graph analysis (4)

tilelang/jit/adapter/cython/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-475)

tilelang/jit/adapter/ctypes/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-475)

tilelang/jit/adapter/nvrtc/kernel_cache.py (2)

tilelang/cache/kernel_cache.py (4)

KernelCache (25-475)

_save_so_cubin_to_disk (412-417)

_safe_write_file (253-260)

_load_binary (247-250)

tilelang/jit/adapter/kernel_cache.py (1)

_save_so_cubin_to_disk (16-21)

tilelang/cache/kernel_cache.py (1)

tilelang/jit/adapter/cutedsl/kernel_cache.py (6)

_set_adapter_cache_path (50-52)

_save_kernel_source_code_to_disk (20-21)

_save_so_cubin_to_disk (24-39)

_get_required_files (42-43)

_load_kernel_source (46-47)

_build_kernel (55-83)

🪛 Ruff (0.14.10)

tilelang/cache/__init__.py

58-58: Avoid specifying long messages outside the exception class

(TRY003)

tilelang/jit/adapter/cutedsl/kernel_cache.py

43-43: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

tilelang/cache/kernel_cache.py

443-443: Unused method argument: kernel

(ARG002)

443-443: Unused method argument: cache_path

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)

🔇 Additional comments (14)

tilelang/jit/adapter/nvrtc/kernel_cache.py (1)

7-18: LGTM! Clean NVRTC-specific implementation.

The class correctly customizes paths for NVRTC's cubin output and implements the necessary logic to persist the Python wrapper alongside the binary. The previous duplicate logic concern has been addressed.

tilelang/cache/__init__.py (3)

21-28: Clean dispatch mechanism successfully replaces singleton pattern.

The backend dispatch map provides a clear, extensible way to route kernel cache operations to backend-specific implementations. This aligns with the PR objective to eliminate if-else branching.

37-37: Good choice of default execution backend.

Setting the default to "tvm_ffi" addresses previous review concerns about None causing errors. This provides a sensible fallback while still allowing explicit backend selection.

Based on past review comments, this follows the suggestion to use "tvm_ffi" as the default for internal APIs.

45-58: Dispatch logic is straightforward and correct.

The if-else routing to backend-specific cache instances is clear. The ValueError provides a helpful message for unsupported backends.

Note: The static analysis hint (TRY003) about the long error message is a stylistic suggestion. The current inline message is acceptable and clear for this use case.

tilelang/jit/adapter/cutedsl/kernel_cache.py (4)

13-21: CuTeDSL-specific paths and no-op kernel source save are appropriate.

The custom paths for launcher artifacts align with CuTeDSL's C++ launcher design. The early return in _save_kernel_source_code_to_disk indicates CuTeDSL handles kernel sources differently than other backends.

23-39: Defensive attribute checking for launcher artifacts is good practice.

The use of getattr() and hasattr() ensures the code handles cases where the adapter or lib_generator may not have the expected launcher attributes. The verbose logging aids debugging.

45-52: Appropriate overrides for CuTeDSL's kernel loading behavior.

Returning empty strings from _load_kernel_source and propagating the cache path to the adapter reflect CuTeDSL's different approach to managing kernel sources and artifacts.

54-83: Correct kernel reconstruction from cached parameters.

The _build_kernel implementation properly delegates to JITKernel.from_database when parameters are available, which is the expected behavior for loading cached kernels.

tilelang/cache/kernel_cache.py (6)

39-43: Excellent use of class attributes for path customization.

Moving path constants from module-level to class attributes allows backend-specific subclasses to override paths as needed (e.g., NVRTC using "kernel.cubin" instead of "kernel_lib.so"). This is a key enabler of the inheritance-based design.

290-303: Well-structured delegation to specialized save methods.

Breaking _save_kernel_to_disk into separate calls to _save_kernel_source_code_to_disk, _save_wrapper_kernel_code_to_disk, and _save_so_cubin_to_disk provides clear override points for backend-specific behavior while maintaining error handling at the orchestration level.

351-380: Clean refactor of disk loading logic with extensible helper methods.

The extraction of _get_required_files, _load_kernel_source, and _build_kernel successfully eliminates conditional logic while enabling backends like CuTeDSL to customize file requirements and loading behavior.

399-441: New helper methods provide excellent override points for subclasses.

These methods establish the template method pattern, allowing each backend to customize specific aspects:

_save_kernel_source_code_to_disk: Device kernel persistence

_save_wrapper_kernel_code_to_disk: Host wrapper persistence

_save_so_cubin_to_disk: Library/binary persistence

_get_required_files: File existence checks

_load_kernel_source: Source code loading

This design successfully achieves the PR objective of eliminating if-else branching in favor of polymorphism.

443-444: Hook method with unused parameters is intentional design.

The _set_adapter_cache_path method serves as an extension point for subclasses (e.g., CuTeDSLKernelCache overrides this at line 50-52 of tilelang/jit/adapter/cutedsl/kernel_cache.py to propagate the cache path). The unused parameters in the base implementation are expected for hook methods.

The static analysis warnings for unused arguments (ARG002) are false positives in this context.

446-475: Flexible kernel reconstruction logic supports diverse backends.

The _build_kernel method provides a sensible default that works for most backends while allowing specialized subclasses (like CuTeDSL) to implement custom reconstruction logic based on available artifacts.

tilelang/cache/kernel_cache.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (3)

tilelang/jit/adapter/kernel_cache.py (1)
16-21: Fix incorrect variable in log message.

Line 20 logs the Executable object instead of the destination path kernel_lib_path. The log should indicate where the file is being saved.
🔎 Proposed fix
     executable = kernel.adapter.executable
     if verbose:
-        self.logger.debug(f"Saving kernel executable to file: {executable}")
+        self.logger.debug(f"Saving kernel executable to file: {kernel_lib_path}")
     KernelCache._safe_write_executable(executable, kernel_lib_path)
tilelang/cache/kernel_cache.py (2)
406-410: Add null check for adapter's kernel source.

Some adapters (e.g., CuTeDSL, NVRTC) may return None from get_kernel_source(). Passing None to file.write() will raise a TypeError, which gets caught and logged with a generic error message that doesn't indicate the root cause.
🔎 Proposed fix
 def _save_wrapper_kernel_code_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False):
     host_kernel_path = os.path.join(cache_path, self.host_kernel_path)
+    source = kernel.adapter.get_kernel_source()
+    if source is None:
+        if verbose:
+            self.logger.debug("Skipping host kernel source save: adapter returned None")
+        return
     if verbose:
         self.logger.debug(f"Saving wrapped kernel source code to file: {host_kernel_path}")
-    KernelCache._safe_write_file(host_kernel_path, "w", lambda file: file.write(kernel.adapter.get_kernel_source()))
+    KernelCache._safe_write_file(host_kernel_path, "w", lambda file: file.write(source))
446-476: Use explicit None checks instead of truthiness for source validation.

Line 460 uses truthiness checks that conflate two distinct cases:

None: Source file not found (error condition)

"": Source file exists but is empty (valid for backends like CuTeDSL)

The CuTeDSL adapter confirms this is problematic—it intentionally returns "", "" from _load_kernel_source but must override _build_kernel to only check kernel_params because the base class truthiness check incorrectly rejects empty strings.
🔎 Recommended fix
-    if host_kernel_source and device_kernel_source and kernel_params:
+    if host_kernel_source is not None and device_kernel_source is not None and kernel_params is not None:
         return JITKernel.from_database(
This allows backends that don't use kernel source files to return empty strings while still detecting actual load failures (None).

🧹 Nitpick comments (1)

tilelang/jit/adapter/cutedsl/kernel_cache.py (1)
41-43: Consider using iterable unpacking for cleaner syntax.

The list concatenation works correctly but could be more idiomatic.
🔎 Proposed refactor
 def _get_required_files(self, cache_path: str) -> list[str]:
-    return super()._get_required_files(cache_path) + [os.path.join(cache_path, self.launcher_lib_path)]
+    return [*super()._get_required_files(cache_path), os.path.join(cache_path, self.launcher_lib_path)]
Based on static analysis hint.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c7e1fa and aba3d29.

📒 Files selected for processing (8)

tilelang/cache/__init__.py
tilelang/cache/kernel_cache.py
tilelang/jit/adapter/ctypes/kernel_cache.py
tilelang/jit/adapter/cutedsl/kernel_cache.py
tilelang/jit/adapter/cython/kernel_cache.py
tilelang/jit/adapter/kernel_cache.py
tilelang/jit/adapter/nvrtc/kernel_cache.py
tilelang/jit/adapter/torch/kernel_cache.py

🚧 Files skipped from review as they are similar to previous changes (2)

tilelang/jit/adapter/torch/kernel_cache.py
tilelang/jit/adapter/ctypes/kernel_cache.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-11-14T07:56:11.098Z

Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

tilelang/cache/__init__.py

🧬 Code graph analysis (4)

tilelang/jit/adapter/cython/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (1)

KernelCache (25-476)

tilelang/jit/adapter/cutedsl/kernel_cache.py (5)

tilelang/language/ast/ir.py (1)

target (1677-1707)

tilelang/cache/kernel_cache.py (7)

KernelCache (25-476)

_save_kernel_source_code_to_disk (399-404)

_save_so_cubin_to_disk (412-417)

_safe_write_file (252-259)

_load_binary (246-249)

_load_kernel_source (424-441)

_set_adapter_cache_path (443-444)

tilelang/engine/param.py (1)

KernelParam (13-141)

tilelang/jit/adapter/kernel_cache.py (1)

_save_so_cubin_to_disk (16-21)

tilelang/jit/adapter/nvrtc/kernel_cache.py (1)

_save_so_cubin_to_disk (11-18)

tilelang/jit/adapter/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (2)

_safe_write_file (252-259)

_safe_write_executable (262-265)

tilelang/cache/kernel_cache.py (1)

tilelang/jit/adapter/cutedsl/kernel_cache.py (6)

_set_adapter_cache_path (50-52)

_save_kernel_source_code_to_disk (20-21)

_save_so_cubin_to_disk (24-39)

_get_required_files (42-43)

_load_kernel_source (46-47)

_build_kernel (55-84)

🪛 Ruff (0.14.10)

tilelang/jit/adapter/cutedsl/kernel_cache.py

43-43: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

tilelang/cache/__init__.py

58-58: Avoid specifying long messages outside the exception class

(TRY003)

tilelang/cache/kernel_cache.py

443-443: Unused method argument: kernel

(ARG002)

443-443: Unused method argument: cache_path

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)

🔇 Additional comments (15)

tilelang/jit/adapter/cython/kernel_cache.py (1)

1-4: LGTM! Valid placeholder implementation.

The minimal CythonKernelCache subclass correctly inherits from KernelCache and can be extended with Cython-specific behavior as needed.

tilelang/jit/adapter/kernel_cache.py (1)

10-14: LGTM! TVM-specific wrapper save logic is correct.

The method properly saves the host source code using the TVM adapter's get_host_source() method.

tilelang/jit/adapter/cutedsl/kernel_cache.py (4)

13-17: LGTM! CuTeDSL-specific path configuration.

The custom paths for launcher library and C++ source are appropriate for the CuTeDSL backend's architecture.

19-21: LGTM! Intentional no-op for CuTeDSL.

CuTeDSL doesn't require device kernel source persistence, so the no-op implementation is correct for this backend.

23-39: LGTM! Robust launcher artifact persistence.

The method correctly handles optional launcher library and C++ source with defensive existence checks.

45-83: LGTM! CuTeDSL-specific load and build logic.

The implementation correctly:

Returns empty strings for kernel source (not needed for CuTeDSL)

Propagates cache path to the adapter

Builds kernel based solely on kernel_params (since source strings are always empty)

This aligns with CuTeDSL's architecture where the launcher library is the primary artifact.

tilelang/cache/__init__.py (2)

5-28: LGTM! Clean dispatch registry implementation.

The dispatch map architecture successfully replaces the if-else branching mentioned in the PR objectives. The TYPE_CHECKING guard and backend instantiation are well-structured.

31-58: LGTM! Dispatch logic successfully eliminates if-else branching.

The function properly routes to backend-specific implementations via the dispatch map, achieving the refactoring goal stated in the PR objectives. The default backend and error handling are appropriate.

tilelang/cache/kernel_cache.py (7)

39-42: LGTM! Class-level path attributes enable inheritance-based dispatch.

These attributes form the foundation of the inheritance-based refactoring, allowing subclasses to customize paths without modifying base class logic. This successfully addresses the PR objective of eliminating if-else branching.

219-219: LGTM! Extracted cache path setting enables subclass customization.

The helper method allows backends like CuTeDSL to customize cache path propagation while keeping the base implementation simple.

287-314: LGTM! Helper method extraction enables backend-specific behavior.

The refactored save logic successfully moves backend-specific handling to overridable methods, eliminating the need for if-else branching in the base class.

345-380: LGTM! Load logic properly delegates to customizable helpers.

The refactored loading flow uses class-level paths and overridable helper methods, allowing backends to customize required files, source loading, and kernel building without modifying the base flow.

399-404: LGTM! Device kernel source save with proper null check.

The method correctly checks for None before writing, and the override mechanism allows backends like CuTeDSL to skip this step entirely.

412-441: LGTM! I/O helper implementations are robust.

The save/load helper methods properly:

Use class-level path attributes

Handle exceptions gracefully

Return None on load failures

Enable subclass customization

This design cleanly separates concerns and eliminates backend-specific branching.

443-444: Intentional no-op base implementation.

The empty method with unused parameters is a valid extension point for subclasses. CuTeDSL overrides this to propagate the cache path to its adapter. The static analysis warning can be safely ignored in this case.

coderabbitai · 2025-12-23T07:12:55Z

tilelang/jit/adapter/nvrtc/kernel_cache.py

+    def _save_so_cubin_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False):
+        src_lib_path = kernel.adapter.libpath
+        kernel_py_path = os.path.join(cache_path, self.kernel_py_path)
+        src_lib_path = src_lib_path.replace(".cubin", ".py")
+        if verbose:
+            self.logger.debug(f"Saving kernel nvrtc python code to file: {kernel_py_path}")
+        KernelCache._safe_write_file(kernel_py_path, "wb", lambda file: file.write(KernelCache._load_binary(src_lib_path)))
+        super()._save_so_cubin_to_disk(kernel, cache_path, verbose)


⚠️ Potential issue | 🟡 Minor

Validate file extension assumption before string replacement.

Line 14 assumes kernel.adapter.libpath ends with .cubin, but performs a blind replace(".cubin", ".py"). If the libpath doesn't contain .cubin, this silently fails and could lead to incorrect paths.

🔎 Proposed fix to validate the extension

def _save_so_cubin_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False): src_lib_path = kernel.adapter.libpath kernel_py_path = os.path.join(cache_path, self.kernel_py_path) - src_lib_path = src_lib_path.replace(".cubin", ".py") + if not src_lib_path.endswith(".cubin"): + self.logger.warning(f"Expected .cubin extension but got: {src_lib_path}") + src_py_path = src_lib_path.replace(".cubin", ".py") if verbose: self.logger.debug(f"Saving kernel nvrtc python code to file: {kernel_py_path}") - KernelCache._safe_write_file(kernel_py_path, "wb", lambda file: file.write(KernelCache._load_binary(src_lib_path))) + KernelCache._safe_write_file(kernel_py_path, "wb", lambda file: file.write(KernelCache._load_binary(src_py_path))) super()._save_so_cubin_to_disk(kernel, cache_path, verbose)

🤖 Prompt for AI Agents

In tilelang/jit/adapter/nvrtc/kernel_cache.py around lines 11 to 18, the code blindly replaces ".cubin" with ".py" on kernel.adapter.libpath which can silently produce incorrect paths if the string doesn't contain that extension; validate the extension first (e.g., if libpath.endswith(".cubin") then replace, otherwise either compute the .py path using os.path.splitext(libpath)[0] + ".py" or log and raise a clear error) and update the call to KernelCache._load_binary to use the validated/constructed path so you don't silently operate on a wrong filename.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

tilelang/jit/adapter/kernel_cache.py (1)
16-21: Fix incorrect variable in log message.

Line 20 logs executable (the Executable object) instead of kernel_lib_path (the destination path). The log message should indicate where the file is being saved, not the object being saved.
🔎 Proposed fix
     def _save_so_cubin_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False):
         kernel_lib_path = os.path.join(cache_path, self.kernel_lib_path)
         executable = kernel.adapter.executable
         if verbose:
-            self.logger.debug(f"Saving kernel executable to file: {executable}")
+            self.logger.debug(f"Saving kernel executable to file: {kernel_lib_path}")
         KernelCache._safe_write_executable(executable, kernel_lib_path)
tilelang/cache/kernel_cache.py (2)
406-410: Add null check for get_kernel_source() return value.

Some adapters (e.g., CuTeDSLKernelAdapter, NVRTCKernelAdapter) can return None from get_kernel_source(). The current implementation passes the result directly to file.write(), which will raise a TypeError if None is returned. The exception gets caught but logged with a misleading generic message.
🔎 Proposed fix
     def _save_wrapper_kernel_code_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False):
         host_kernel_path = os.path.join(cache_path, self.host_kernel_path)
         if verbose:
             self.logger.debug(f"Saving wrapped kernel source code to file: {host_kernel_path}")
-        KernelCache._safe_write_file(host_kernel_path, "w", lambda file: file.write(kernel.adapter.get_kernel_source()))
+        source = kernel.adapter.get_kernel_source()
+        if source is None:
+            self.logger.debug("Skipping host kernel source save: adapter returned None")
+            return
+        KernelCache._safe_write_file(host_kernel_path, "w", lambda file: file.write(source))
460-476: Use explicit None checks instead of truthiness for kernel source validation.

The condition if host_kernel_source and device_kernel_source and kernel_params: uses truthiness checks that conflate two distinct cases:

None: Source file not found (error)

"": Source file exists but is intentionally empty (valid)

The CuTeDSLKernelCache works around this by overriding _build_kernel to only check kernel_params, but other backends that inherit this base implementation could fail if they need to return empty source strings.
🔎 Proposed fix
     def _build_kernel(
         self,
         func: Callable | None,
         host_kernel_source: str,
         device_kernel_source: str,
         kernel_lib_path: str,
         kernel_params: list[KernelParam] | None,
         target: str | Target,
         target_host: str | Target | None,
         out_idx: list[int] | None,
         execution_backend: Literal["tvm_ffi", "cython", "nvrtc", "torch", "cutedsl"],
         pass_configs: dict | None,
         compile_flags: list[str] | str | None,
     ) -> JITKernel | None:
-        if host_kernel_source and device_kernel_source and kernel_params:
+        if host_kernel_source is not None and device_kernel_source is not None and kernel_params is not None:
             return JITKernel.from_database(

🧹 Nitpick comments (1)

tilelang/jit/adapter/cutedsl/kernel_cache.py (1)
41-43: Consider using iterable unpacking for list concatenation.

Static analysis suggests replacing list concatenation with iterable unpacking for slightly better performance and idiomatic Python.
🔎 Proposed fix
     @override
     def _get_required_files(self, cache_path: str) -> list[str]:
-        return super()._get_required_files(cache_path) + [os.path.join(cache_path, self.launcher_lib_path)]
+        return [*super()._get_required_files(cache_path), os.path.join(cache_path, self.launcher_lib_path)]

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aba3d29 and 96bf90c.

📒 Files selected for processing (7)

tilelang/cache/__init__.py
tilelang/cache/kernel_cache.py
tilelang/jit/adapter/cutedsl/kernel_cache.py
tilelang/jit/adapter/cython/kernel_cache.py
tilelang/jit/adapter/kernel_cache.py
tilelang/jit/adapter/nvrtc/kernel_cache.py
tilelang/jit/adapter/torch/kernel_cache.py

🚧 Files skipped from review as they are similar to previous changes (3)

tilelang/jit/adapter/cython/kernel_cache.py
tilelang/jit/adapter/torch/kernel_cache.py
tilelang/jit/adapter/nvrtc/kernel_cache.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-11-14T07:56:11.098Z

Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

tilelang/cache/__init__.py

🧬 Code graph analysis (1)

tilelang/cache/__init__.py (6)

tilelang/jit/adapter/cutedsl/kernel_cache.py (1)

CuTeDSLKernelCache (13-84)

tilelang/jit/adapter/cython/kernel_cache.py (1)

CythonKernelCache (4-4)

tilelang/jit/adapter/nvrtc/kernel_cache.py (1)

NVRTCKernelCache (7-18)

tilelang/jit/adapter/torch/kernel_cache.py (1)

TorchKernelCache (4-4)

tilelang/jit/adapter/kernel_cache.py (1)

TVMFFIKernelCache (7-21)

tilelang/cache/kernel_cache.py (2)

KernelCache (25-476)

cached (110-223)

🪛 Ruff (0.14.10)

tilelang/cache/__init__.py

56-56: Avoid specifying long messages outside the exception class

(TRY003)

tilelang/cache/kernel_cache.py

443-443: Unused method argument: kernel

(ARG002)

443-443: Unused method argument: cache_path

(ARG002)

tilelang/jit/adapter/cutedsl/kernel_cache.py

43-43: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)

🔇 Additional comments (12)

tilelang/jit/adapter/kernel_cache.py (1)

7-14: LGTM!

The TVMFFIKernelCache class correctly inherits from KernelCache and overrides the appropriate methods for TVM FFI-specific behavior. The host_kernel_path attribute is properly inherited from the base class.

tilelang/jit/adapter/cutedsl/kernel_cache.py (3)

23-39: LGTM!

The _save_so_cubin_to_disk override correctly handles CuTeDSL-specific launcher library and C++ source saving with proper defensive checks.

54-84: LGTM!

The _build_kernel override correctly checks only kernel_params since CuTeDSL's _load_kernel_source intentionally returns empty strings. This properly handles the CuTeDSL-specific loading behavior.

49-52: LGTM!

The _set_adapter_cache_path override correctly propagates the cache path to the CuTeDSL adapter.
tilelang/cache/__init__.py (3)
19-26: LGTM!

The dispatch map correctly maps execution backends to their respective KernelCache subclass instances, replacing the previous singleton pattern with a clean inheritance-based dispatch mechanism.

29-56: LGTM!

The cached() function correctly dispatches to the appropriate backend cache via the dispatch map. The default execution_backend="tvm_ffi" properly addresses previous review feedback about the None default causing errors.

74-75: Module-level clear_cache() call will crash on import when env var is set.

When TILELANG_CLEAR_CACHE is set to a truthy value, calling clear_cache() at module import time will raise a RuntimeError, crashing the application. This may be the intended behavior to force users to acknowledge the risk, but it could be unexpected for users setting this environment variable.

Consider whether this is the intended UX. If users are expected to manually delete the cache, you might want to just log a warning instead of raising an exception at import time:
if env.TILELANG_CLEAR_CACHE.lower() in ("1", "true", "yes", "on"):
    import warnings
    warnings.warn(
        f"TILELANG_CLEAR_CACHE is set but automatic cache clearing is disabled. "
        f"To clear the cache, manually run: rm -rf '{env.TILELANG_CACHE_DIR}'"
    )
tilelang/cache/kernel_cache.py (5)

39-42: LGTM!

Moving path constants to class-level attributes is a clean refactor that enables backend-specific subclasses to override paths as needed.

443-444: No-op hook method is appropriate for inheritance pattern.

The empty _set_adapter_cache_path method serves as a hook that subclasses (like CuTeDSLKernelCache) can override. The static analysis warnings about unused arguments are false positives for this design pattern.

424-441: LGTM!

The _load_kernel_source method correctly handles file loading with proper exception handling and logging. It returns None for failed loads, allowing callers to handle missing sources appropriately.

419-422: LGTM!

The _get_required_files method correctly returns only the essential files (kernel library and parameters). Source files are optional, and subclasses can extend this list as needed.

399-404: LGTM!

The _save_kernel_source_code_to_disk method correctly checks for None before writing, avoiding potential TypeError exceptions.

silentCoder-dev

LGTM.

tilelang/cache/__init__.py

lucifer1004 · 2025-12-24T03:31:48Z

Could you also update CI tests to cover kernel cache?

sgjzfzzf · 2025-12-24T03:37:38Z

Could you also update CI tests to cover kernel cache?

Sorry I'm not familiar with this CI😭. Which part should I start with? In .github/workflow.yaml?

lucifer1004 · 2025-12-24T03:39:22Z

You should modify https://github.com/tile-ai/tilelang/blob/main/.github/workflows/ci.yml

lucifer1004 · 2025-12-24T04:26:58Z

I think running https://github.com/tile-ai/tilelang/tree/main/testing/python/jit twice should be enough.
There are some caveats, though:

For postproc tests, cache needs to be disabled explicitly since they will fail if run twice.
For cutedsl, a separate run is required since it relies on env var TILELANG_USE_GEMM_V1=1.

SiriusNEO

Looks good to me. After adding the test we are good to go

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

tilelang/jit/adapter/kernel_cache.py (1)
16-21: Fix incorrect variable in log message.

Line 20 logs executable (the Executable object) instead of kernel_lib_path (the destination path). The log message should indicate where the file is being saved.
🔎 Proposed fix
     def _save_so_cubin_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False):
         kernel_lib_path = os.path.join(cache_path, self.kernel_lib_path)
         executable = kernel.adapter.executable
         if verbose:
-            self.logger.debug(f"Saving kernel executable to file: {executable}")
+            self.logger.debug(f"Saving kernel executable to file: {kernel_lib_path}")
         KernelCache._safe_write_executable(executable, kernel_lib_path)
tilelang/cache/kernel_cache.py (2)
429-433: Add null check for get_kernel_source() return value.

Some adapters (e.g., CuTeDSLKernelAdapter, NVRTCKernelAdapter) can return None from get_kernel_source(). Passing None to file.write() will cause a TypeError that gets caught with a misleading generic message.
🔎 Proposed fix
 def _save_wrapper_kernel_code_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False):
     host_kernel_path = os.path.join(cache_path, self.host_kernel_path)
     if verbose:
         self.logger.debug(f"Saving wrapped kernel source code to file: {host_kernel_path}")
-    KernelCache._safe_write_file(host_kernel_path, "w", lambda file: file.write(kernel.adapter.get_kernel_source()))
+    source = kernel.adapter.get_kernel_source()
+    if source is None:
+        self.logger.debug("Skipping host kernel source save: adapter returned None")
+        return
+    KernelCache._safe_write_file(host_kernel_path, "w", lambda file: file.write(source))
483-499: Use explicit None checks instead of truthiness for kernel source validation.

The condition if host_kernel_source and device_kernel_source and kernel_params: uses truthiness checks that conflate None (source not found) with "" (source exists but is empty). The CuTeDSLKernelCache._load_kernel_source intentionally returns "", "" which would fail this check.
🔎 Proposed fix
-        if host_kernel_source and device_kernel_source and kernel_params:
+        if host_kernel_source is not None and device_kernel_source is not None and kernel_params is not None:
             return JITKernel.from_database(

🧹 Nitpick comments (1)

tilelang/jit/adapter/cutedsl/kernel_cache.py (1)
41-43: Consider using iterable unpacking instead of concatenation.

Per Ruff RUF005, iterable unpacking is slightly more idiomatic:
🔎 Proposed fix
     @override
     def _get_required_files(self, cache_path: str) -> list[str]:
-        return super()._get_required_files(cache_path) + [os.path.join(cache_path, self.launcher_lib_path)]
+        return [*super()._get_required_files(cache_path), os.path.join(cache_path, self.launcher_lib_path)]

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96bf90c and 6195770.

📒 Files selected for processing (8)

.github/workflows/ci.yml
tilelang/cache/__init__.py
tilelang/cache/kernel_cache.py
tilelang/jit/adapter/cutedsl/kernel_cache.py
tilelang/jit/adapter/cython/kernel_cache.py
tilelang/jit/adapter/kernel_cache.py
tilelang/jit/adapter/nvrtc/kernel_cache.py
tilelang/jit/adapter/torch/kernel_cache.py

🚧 Files skipped from review as they are similar to previous changes (3)

tilelang/jit/adapter/torch/kernel_cache.py
tilelang/jit/adapter/cython/kernel_cache.py
tilelang/jit/adapter/nvrtc/kernel_cache.py

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2025-10-10T13:29:29.347Z

Learnt from: XuehaiPan
Repo: tile-ai/tilelang PR: 973
File: .github/workflows/ci.yml:13-15
Timestamp: 2025-10-10T13:29:29.347Z
Learning: In .github/workflows/ci.yml for tilelang (GitHub Actions), actions/cachev4 and setup-python’s cache feature require GITHUB_TOKEN with actions: write to save caches; with a permissions block that only sets contents: read, unspecified actions permission becomes none, so caches will restore but not save.

Applied to files:

.github/workflows/ci.yml

📚 Learning: 2025-11-14T07:56:11.098Z

Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

.github/workflows/ci.yml
tilelang/cache/__init__.py

🧬 Code graph analysis (2)

tilelang/jit/adapter/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (4)

KernelCache (25-499)

_safe_write_file (275-282)

_save_so_cubin_to_disk (435-440)

_safe_write_executable (285-288)

tilelang/jit/adapter/cutedsl/kernel_cache.py (1)

tilelang/cache/kernel_cache.py (6)

KernelCache (25-499)

_save_so_cubin_to_disk (435-440)

_safe_write_file (275-282)

_load_binary (269-272)

_get_required_files (442-445)

_load_kernel_source (447-464)

🪛 Ruff (0.14.10)

tilelang/cache/kernel_cache.py

466-466: Unused method argument: kernel

(ARG002)

466-466: Unused method argument: cache_path

(ARG002)

tilelang/jit/adapter/cutedsl/kernel_cache.py

43-43: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)

tilelang/cache/__init__.py

56-56: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
GitHub Check: Test for Python 3.12 with Metal (on macos-latest)

🔇 Additional comments (10)

.github/workflows/ci.yml (1)

376-404: LGTM! CI steps align with the requested two-pass JIT testing strategy.

The implementation correctly:

Runs JIT tests twice to catch cache-related regressions (per lucifer1004's suggestion)

Disables cache on pass 2 via TILELANG_DISABLE_CACHE=1 to avoid postproc conflicts

Keeps CuTeDSL tests separate (run with TILELANG_USE_GEMM_V1=1 in a dedicated step)

tilelang/jit/adapter/kernel_cache.py (1)

7-14: LGTM!

The TVMFFIKernelCache class correctly overrides the kernel_lib_path class attribute and provides a specialized implementation for saving the wrapper kernel source.

tilelang/cache/__init__.py (2)

19-26: LGTM! Dispatch map correctly routes to backend-specific caches.

The singleton instances for each backend are properly instantiated, and the naming follows the suggestion from previous reviews (_dispatch_map instead of _dispatch_pool).

43-56: LGTM!

The dispatch logic correctly routes to the appropriate backend cache and raises a clear error for unsupported backends. Past review concerns (missing "cutedsl", None default causing errors) have been addressed.

tilelang/jit/adapter/cutedsl/kernel_cache.py (3)

13-21: LGTM!

The class correctly defines CuTeDSL-specific paths and intentionally makes _save_kernel_source_code_to_disk a no-op since CuTeDSL uses a different code generation path.

54-84: LGTM!

The _build_kernel implementation correctly uses JITKernel.from_database when kernel_params exist and returns None otherwise, aligning with the inheritance-based design.

49-52: The _cache_path attribute is already handled defensively in the adapter.

The adapter code uses getattr(self, "_cache_path", None) with a default value, properly handling the case where the attribute may not be set. No additional defensive checks are needed in kernel_cache.py. The interface is implicitly documented via the method's usage context.

tilelang/cache/kernel_cache.py (3)

39-42: LGTM! Class-level path attributes enable proper inheritance.

Moving path constants to class attributes allows subclasses to override them (e.g., TVMFFIKernelCache.kernel_lib_path = "executable.so"), which is the core of the inheritance-based refactoring.

466-467: Intentional no-op hook for subclass override.

The unused arguments (kernel, cache_path) are expected for this hook method pattern. Subclasses like CuTeDSLKernelCache override this to propagate the cache path to adapters. The Ruff ARG002 warnings can be safely ignored.

310-328: LGTM! Clean delegation to helper methods.

The refactored _save_kernel_to_disk now properly delegates to specialized helper methods (_save_kernel_source_code_to_disk, _save_wrapper_kernel_code_to_disk, _save_so_cubin_to_disk), enabling subclasses to override individual behaviors.

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tilelang/cache/kernel_cache.py (1)

125-148: Docstring is outdated after refactoring.

The docstring claims "This is the ONLY place where environment variable processing...should happen" (line 125) and documents environment variable behavior (lines 140-148), but the actual env variable processing now occurs in tilelang/cache/__init__.py:cached() (lines 44-51). This method receives already-resolved parameters.

Update the docstring to reflect that this is an internal method receiving normalized parameters, and remove the environment variable documentation section.

🔎 Proposed docstring update

     def cached(
         self,
         func: PrimFunc = None,
         out_idx: list[int] = None,
         *args,
         target: str | Target,
         target_host: str | Target | None = None,
         execution_backend: Literal["tvm_ffi", "cython", "nvrtc", "torch", "cutedsl"] = "tvm_ffi",
         verbose: bool,
         pass_configs: dict | None = None,
         compile_flags: list[str] | str | None = None,
     ) -> JITKernel:
         """
-        Caches and reuses compiled kernels to avoid redundant compilation.
-
-        This is the ONLY place where environment variable processing, target normalization,
-        and execution backend resolution should happen. All compilation paths go through here.
+        Internal caching method for backend-specific KernelCache subclasses.
+        
+        This method receives already-normalized parameters from the public 
+        tilelang.cache.cached() API. Environment variable defaults and backend 
+        resolution have already been applied.
 
         Args:
             func: Function to be compiled or a prepared PrimFunc
             out_idx: Indices specifying which outputs to return
-            target: Compilation target platform (None = read from TILELANG_TARGET env var)
+            target: Normalized compilation target platform (required)
             target_host: Host target platform
-            execution_backend: Execution backend (None = read from TILELANG_EXECUTION_BACKEND)
-            verbose: Enable verbose output (None = read from TILELANG_VERBOSE)
+            execution_backend: Resolved execution backend (required)
+            verbose: Enable verbose output (required)
             *args: Arguments passed to func
 
         Returns:
             JITKernel: The compiled kernel, either freshly compiled or from cache
-
-        Environment Variables
-        ---------------------
-        TILELANG_TARGET : str
-            Default compilation target (e.g., "cuda", "llvm"). Defaults to "auto".
-        TILELANG_EXECUTION_BACKEND : str
-            Default execution backend. Defaults to "auto".
-        TILELANG_VERBOSE : str
-            Set to "1", "true", "yes", or "on" to enable verbose compilation by default.
         """

♻️ Duplicate comments (2)

tilelang/cache/kernel_cache.py (2)
402-406: Add None guard for get_kernel_source() return value.

According to past review analysis, CuTeDSLKernelAdapter and NVRTCKernelAdapter can return None from get_kernel_source(). Line 406 directly passes this to file.write(), which will raise TypeError: a bytes-like object is required, not 'NoneType' if the adapter returns None. The exception will be caught generically with a misleading message.

Add an explicit check:
🔎 Proposed fix
 def _save_wrapper_kernel_code_to_disk(self, kernel: JITKernel, cache_path: str, verbose: bool = False):
     host_kernel_path = os.path.join(cache_path, self.host_kernel_path)
+    source = kernel.adapter.get_kernel_source()
+    if source is None:
+        if verbose:
+            self.logger.debug(f"Skipping host kernel source save: adapter returned None")
+        return
     if verbose:
         self.logger.debug(f"Saving wrapped kernel source code to file: {host_kernel_path}")
-    KernelCache._safe_write_file(host_kernel_path, "w", lambda file: file.write(kernel.adapter.get_kernel_source()))
+    KernelCache._safe_write_file(host_kernel_path, "w", lambda file: file.write(source))
Based on past review comments, this issue was flagged but not yet addressed.

442-472: Use explicit None checks instead of truthiness for source validation.

Line 456 uses if host_kernel_source and device_kernel_source and kernel_params: which conflates two distinct cases:

None: file not loaded (error condition)

"": file loaded but empty (valid, intentional)

According to past review analysis, the CuteDSL adapter intentionally returns "", "" from _load_kernel_source and must override _build_kernel to work around this truthiness check. Empty source strings are valid; only None should indicate a loading failure.
🔎 Proposed fix
-    if host_kernel_source and device_kernel_source and kernel_params:
+    if host_kernel_source is not None and device_kernel_source is not None and kernel_params is not None:
         return JITKernel.from_database(
This clearly differentiates "not loaded" (None) from "loaded as empty" ("") for each value.
Based on past review comments, this issue was flagged but not yet addressed.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6195770 and 631861c.

📒 Files selected for processing (2)

tilelang/cache/__init__.py
tilelang/cache/kernel_cache.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-11-14T07:56:11.098Z

Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

tilelang/cache/__init__.py

🪛 Ruff (0.14.10)

tilelang/cache/kernel_cache.py

439-439: Unused method argument: kernel

(ARG002)

439-439: Unused method argument: cache_path

(ARG002)

tilelang/cache/__init__.py

86-86: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)

🔇 Additional comments (8)

tilelang/cache/__init__.py (4)

5-18: LGTM: Clean import structure with proper circular import handling.

The use of TYPE_CHECKING for the base KernelCache import avoids circular dependencies while maintaining type hint accuracy. All backend-specific subclasses are imported and will be used in the dispatch map.

20-27: LGTM: Dispatch map correctly implements backend routing.

The _dispatch_map provides clean separation of backend-specific caching logic. Each backend has its own singleton instance, which is appropriate for the cache pattern.

44-72: Well-structured environment variable handling and backend resolution.

The centralized environment variable processing at lines 44-51 correctly establishes a single source of truth. The target normalization and backend resolution flow (lines 53-59) properly handles the "auto" case via resolve_execution_backend. The conditional logging (lines 60-72) provides useful diagnostics without noise when the user explicitly specified a backend.

73-86: Dispatch logic correctly routes to backend-specific cache implementations.

The dispatch check (line 73) and delegation (lines 74-84) properly route kernel caching to the appropriate backend. The error handling (lines 85-86) provides a clear message when an unsupported backend is requested.

tilelang/cache/kernel_cache.py (4)

39-42: LGTM: Instance path attributes enable backend-specific customization.

The introduction of instance-level path attributes (device_kernel_path, host_kernel_path, kernel_lib_path, params_path) provides a clean extension point for backend-specific subclasses to override file naming conventions (e.g., .cubin for NVRTC vs. .so for TVM FFI).

263-310: LGTM: Clean delegation to extensible helper methods.

The refactoring of _save_kernel_to_disk to delegate to _save_kernel_source_code_to_disk, _save_wrapper_kernel_code_to_disk, and _save_so_cubin_to_disk provides clear extension points for backend-specific subclasses while maintaining proper error handling.

312-376: LGTM: Well-structured load flow with clear separation of concerns.

The refactored _load_kernel_from_disk cleanly separates:

Required file validation via _get_required_files() (line 346)

Optional source loading via _load_kernel_source() (line 352)

Parameter deserialization (lines 354-362)

Kernel reconstruction via _build_kernel() (lines 364-376)

This provides clear extension points for subclasses to customize each phase.

439-441: Static analysis false positive: unused arguments are intentional.

The static analysis warning about unused kernel and cache_path parameters is a false positive. This is a base implementation designed to be overridden by subclasses (e.g., NVRTCKernelCache may set adapter state using these parameters). The parameters must remain for interface consistency.

coderabbitai bot reviewed Dec 20, 2025

View reviewed changes

tilelang/jit/adapter/cutedsl/kernel_cache.py Show resolved Hide resolved

tilelang/jit/adapter/nvrtc/kernel_cache.py Show resolved Hide resolved

tilelang/jit/adapter/kernel_cache.py Show resolved Hide resolved

coderabbitai bot reviewed Dec 20, 2025

View reviewed changes

tilelang/cache/__init__.py Outdated Show resolved Hide resolved

coderabbitai bot reviewed Dec 20, 2025

View reviewed changes

LeiWang1999 requested review from LeiWang1999 and silentCoder-dev December 20, 2025 16:10

LeiWang1999 reviewed Dec 20, 2025

View reviewed changes

examples/gemm/example_gemm.py Outdated Show resolved Hide resolved

silentCoder-dev reviewed Dec 21, 2025

View reviewed changes

tilelang/cache/auto_kernel_cache.py Outdated Show resolved Hide resolved

silentCoder-dev reviewed Dec 21, 2025

View reviewed changes

tilelang/cache/__init__.py Outdated Show resolved Hide resolved

coderabbitai bot reviewed Dec 21, 2025

View reviewed changes

tilelang/cache/__init__.py Outdated Show resolved Hide resolved

sgjzfzzf requested review from LeiWang1999 and silentCoder-dev December 21, 2025 16:06

silentCoder-dev reviewed Dec 22, 2025

View reviewed changes

tilelang/jit/adapter/nvrtc/kernel_cache.py Show resolved Hide resolved

silentCoder-dev reviewed Dec 22, 2025

View reviewed changes

tilelang/jit/adapter/nvrtc/kernel_cache.py Show resolved Hide resolved

silentCoder-dev reviewed Dec 22, 2025

View reviewed changes

tilelang/cache/__init__.py Outdated Show resolved Hide resolved

coderabbitai bot reviewed Dec 22, 2025

View reviewed changes

SiriusNEO self-requested a review December 22, 2025 04:47

SiriusNEO requested changes Dec 22, 2025

View reviewed changes

tilelang/cache/__init__.py Outdated Show resolved Hide resolved

tilelang/cache/__init__.py Outdated Show resolved Hide resolved

tilelang/cache/kernel_cache.py Show resolved Hide resolved

tilelang/cache/kernel_cache.py Outdated Show resolved Hide resolved

SiriusNEO changed the title ~~[Enhancement] refactor kernel_cache~~ [Enhancement] Refactor KernelCache to use inheritance-based design Dec 22, 2025

sgjzfzzf force-pushed the kernel_cache branch from 2e3be15 to 1c0c714 Compare December 22, 2025 19:04

coderabbitai bot reviewed Dec 22, 2025

View reviewed changes

sgjzfzzf requested review from SiriusNEO and silentCoder-dev December 23, 2025 02:34

silentCoder-dev reviewed Dec 23, 2025

View reviewed changes

tilelang/cache/kernel_cache.py Outdated Show resolved Hide resolved

sgjzfzzf force-pushed the kernel_cache branch from be091a6 to aba3d29 Compare December 23, 2025 07:07

coderabbitai bot reviewed Dec 23, 2025

View reviewed changes

sgjzfzzf force-pushed the kernel_cache branch from aba3d29 to 96bf90c Compare December 23, 2025 12:56

coderabbitai bot reviewed Dec 23, 2025

View reviewed changes

sgjzfzzf requested review from LeiWang1999, SiriusNEO and silentCoder-dev December 23, 2025 16:03

silentCoder-dev reviewed Dec 24, 2025

View reviewed changes

lucifer1004 reviewed Dec 24, 2025

View reviewed changes

tilelang/cache/__init__.py Outdated Show resolved Hide resolved

SiriusNEO previously approved these changes Dec 24, 2025

View reviewed changes

sgjzfzzf added 11 commits December 25, 2025 01:16

refactor kernel_cache

906b4d0

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

set key to auto when execution_backend is None

69ae303

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

fix type error on _load_kernel_source override

48f7d4f

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

put kernel cache subclasses in adapter

7473a7c

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

remove auto in execution_backend

31be413

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

remove _save_kernel_source_code_to_disk for NVRTCKernelCache

7aeac5f

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

refactor all cutedsl if statements

14af6c5

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

remove unused kernel_cubin_path

a808a87

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

add func arguments

e3da822

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

remove ctypes

4d263c5

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

update ci.yaml to test kernel cache

6195770

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

sgjzfzzf dismissed SiriusNEO’s stale review via 6195770 December 24, 2025 17:54

sgjzfzzf force-pushed the kernel_cache branch from 96bf90c to 6195770 Compare December 24, 2025 17:54

coderabbitai bot reviewed Dec 24, 2025

View reviewed changes

fix bugs after tile-ai#1512 merged

631861c

Signed-off-by: Jinjie Liu <jjliu@baai.ac.cn>

coderabbitai bot reviewed Dec 24, 2025

View reviewed changes

[Enhancement] Refactor KernelCache to use inheritance-based design #1483

Are you sure you want to change the base?

[Enhancement] Refactor KernelCache to use inheritance-based design #1483

Conversation

sgjzfzzf commented Dec 20, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

github-actions bot commented Dec 20, 2025

Uh oh!

sgjzfzzf commented Dec 20, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

LeiWang1999 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sgjzfzzf commented Dec 20, 2025

Uh oh!

Uh oh!

Uh oh!

sgjzfzzf commented Dec 21, 2025

Uh oh!

silentCoder-dev commented Dec 21, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

SiriusNEO left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

sgjzfzzf commented Dec 20, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 20, 2025 •

edited

Loading