[Enhancement] Reduce CI Test Time #1492

@LeiWang1999

Description

We can shorten CI time by optimizing the test cases: reduce the test shapes and optimize the critical path.

CI test time for the examples:

=========================================================== fixture duration top ============================================================
total          name                                                                                         num med            min           
       0:00:00 grand total                                                                                    0        0:00:00        0:00:00
========================================================== test call duration top ===========================================================
total          name                                                                                         num med            min           
0:02:29.488783 gemm_sp/test_example_gemm_sp.py::test_example_gemm_sp                                          1 0:02:29.488783 0:02:29.488783
0:01:45.305204 sparse_tensorcore/test_example_sparse_tensorcore.py::test_tilelang_example_sparse_tensorcore   1 0:01:45.305204 0:01:45.305204
0:01:18.967367 dequantize_gemm/test_example_dequantize_gemm.py::test_example_dequant_gemm_fp4_hopper          1 0:01:18.967367 0:01:18.967367
0:01:00.995997 gemm/test_example_gemm.py::test_example_gemm_schedule                                          1 0:01:00.995997 0:01:00.995997
0:00:28.190835 attention_sink/test_example_attention_sink.py::test_example_gqa_sink_bwd_bhsd                  1 0:00:28.190835 0:00:28.190835
0:00:26.493459 deepseek_v32/test_tilelang_example_deepseek_v32.py::test_example_sparse_mla_bwd                1 0:00:26.493459 0:00:26.493459
0:00:25.231914 attention_sink/test_example_attention_sink.py::test_example_mha_sink_bwd_bhsd                  1 0:00:25.231914 0:00:25.231914
0:00:24.066198 flash_attention/test_example_flash_attention.py::test_example_gqa_bwd_tma_reduce_varlen        1 0:00:24.066198 0:00:24.066198
0:00:23.609148 flash_attention/test_example_flash_attention.py::test_example_gqa_bwd                          1 0:00:23.609148 0:00:23.609148
0:00:23.046067 flash_attention/test_example_flash_attention.py::test_example_mha_bwd                          1 0:00:23.046067 0:00:23.046067
0:00:22.868651 gemv/test_example_gemv.py::test_example_gemv                                                   1 0:00:22.868651 0:00:22.868651
0:00:22.039937 flash_attention/test_example_flash_attention.py::test_example_mha_bwd_bhsd                     1 0:00:22.039937 0:00:22.039937
0:00:20.711386 flash_attention/test_example_flash_attention.py::test_example_gqa_bwd_wgmma_pipelined          1 0:00:20.711386 0:00:20.711386
0:00:18.213203 flash_attention/test_example_flash_attention.py::test_example_mha_bwd_wgmma_pipelined          1 0:00:18.213203 0:00:18.213203
0:00:16.536672 minference/test_vs_sparse_attn.py::test_vs_sparse_attn                                         1 0:00:16.536672 0:00:16.536672
0:00:16.185672 gdn/test_example_gdn_compilation.py::test_example_wy_fast_bwd_split_compilation                1 0:00:16.185672 0:00:16.185672
0:00:15.578780 elementwise/test_example_elementwise.py::test_example_elementwise_add_autotune                 1 0:00:15.578780 0:00:15.578780
0:00:15.572629 attention_sink/test_example_attention_sink.py::test_example_gqa_sink_bwd_bhsd_sliding_window   1 0:00:15.572629 0:00:15.572629
0:00:15.499994 linear_attention/test_linear_attn.py::test_example_linear_attn_bwd                             1 0:00:15.499994 0:00:15.499994
0:00:14.530373 attention_sink/test_example_attention_sink.py::test_example_mha_sink_bwd_bhsd_sliding_window   1 0:00:14.530373 0:00:14.530373
0:00:13.514484 fusedmoe/test_example_fusedmoe.py::test_example_fusedmoe_tilelang                              1 0:00:13.514484 0:00:13.514484
0:00:13.412966 gemm/test_example_gemm.py::test_example_gemm_autotune                                          1 0:00:13.412966 0:00:13.412966
0:00:13.003403 gdn/test_example_gdn_compilation.py::test_example_chunk_o_bwd_compilation                      1 0:00:13.003403 0:00:13.003403
0:00:11.722232 gemm_fp8/test_example_gemm_fp8.py::test_example_tilelang_gemm_fp8_intrinsic                    1 0:00:11.722232 0:00:11.722232
0:00:11.672637 linear_attention/test_linear_attn.py::test_example_linear_attn_fwd                             1 0:00:11.672637 0:00:11.672637
0:00:11.554361 seer_attention/test_block_sparse_attn_tilelang.py::test_block_sparse_attn_tilelang             1 0:00:11.554361 0:00:11.554361
0:00:11.096757 gemm_sp/test_example_gemm_sp.py::test_example_custom_compress                                  1 0:00:11.096757 0:00:11.096757
0:00:10.975390 deepseek_v32/test_tilelang_example_deepseek_v32.py::test_example_sparse_mla_fwd_pipelined      1 0:00:10.975390 0:00:10.975390
0:00:10.391669 deepseek_v32/test_tilelang_example_deepseek_v32.py::test_example_fp8_lighting_indexer          1 0:00:10.391669 0:00:10.391669
0:00:10.137347 gemm_fp8/test_example_gemm_fp8.py::test_example_tilelang_gemm_fp8                              1 0:00:10.137347 0:00:10.137347
0:19:03.438066 grand total                                                                                   84 0:00:07.041250 0:00:00.025089
========================================================== test setup duration top ==========================================================
total          name                                                                                         num med            min           
0:00:00.009047 grand total                                                                                   88 0:00:00.000092 0:00:00.000063
======================================================== test teardown duration top =========================================================
total          name                                                                                         num med            min           
0:00:00.009374 grand total                                                                                   88 0:00:00.000099 0:00:00.000054
======================================== slowest durations =========================================
149.49s call     gemm_sp/test_example_gemm_sp.py::test_example_gemm_sp
105.31s call     sparse_tensorcore/test_example_sparse_tensorcore.py::test_tilelang_example_sparse_tensorcore
78.97s call     dequantize_gemm/test_example_dequantize_gemm.py::test_example_dequant_gemm_fp4_hopper
61.00s call     gemm/test_example_gemm.py::test_example_gemm_schedule
28.19s call     attention_sink/test_example_attention_sink.py::test_example_gqa_sink_bwd_bhsd
26.49s call     deepseek_v32/test_tilelang_example_deepseek_v32.py::test_example_sparse_mla_bwd
25.23s call     attention_sink/test_example_attention_sink.py::test_example_mha_sink_bwd_bhsd
24.07s call     flash_attention/test_example_flash_attention.py::test_example_gqa_bwd_tma_reduce_varlen
23.61s call     flash_attention/test_example_flash_attention.py::test_example_gqa_bwd
23.05s call     flash_attention/test_example_flash_attention.py::test_example_mha_bwd
22.87s call     gemv/test_example_gemv.py::test_example_gemv
22.04s call     flash_attention/test_example_flash_attention.py::test_example_mha_bwd_bhsd
20.71s call     flash_attention/test_example_flash_attention.py::test_example_gqa_bwd_wgmma_pipelined
18.21s call     flash_attention/test_example_flash_attention.py::test_example_mha_bwd_wgmma_pipelined
16.54s call     minference/test_vs_sparse_attn.py::test_vs_sparse_attn
16.19s call     gdn/test_example_gdn_compilation.py::test_example_wy_fast_bwd_split_compilation
15.58s call     elementwise/test_example_elementwise.py::test_example_elementwise_add_autotune
15.57s call     attention_sink/test_example_attention_sink.py::test_example_gqa_sink_bwd_bhsd_sliding_window
15.50s call     linear_attention/test_linear_attn.py::test_example_linear_attn_bwd
14.53s call     attention_sink/test_example_attention_sink.py::test_example_mha_sink_bwd_bhsd_sliding_window
13.51s call     fusedmoe/test_example_fusedmoe.py::test_example_fusedmoe_tilelang
13.41s call     gemm/test_example_gemm.py::test_example_gemm_autotune
13.00s call     gdn/test_example_gdn_compilation.py::test_example_chunk_o_bwd_compilation
11.72s call     gemm_fp8/test_example_gemm_fp8.py::test_example_tilelang_gemm_fp8_intrinsic
11.67s call     linear_attention/test_linear_attn.py::test_example_linear_attn_fwd
11.55s call     seer_attention/test_block_sparse_attn_tilelang.py::test_block_sparse_attn_tilelang
11.10s call     gemm_sp/test_example_gemm_sp.py::test_example_custom_compress
10.98s call     deepseek_v32/test_tilelang_example_deepseek_v32.py::test_example_sparse_mla_fwd_pipelined
10.39s call     deepseek_v32/test_tilelang_example_deepseek_v32.py::test_example_fp8_lighting_indexer
10.14s call     gemm_fp8/test_example_gemm_fp8.py::test_example_tilelang_gemm_fp8
9.24s call     gemm_fp8/test_example_gemm_fp8.py::test_example_tilelang_gemm_fp8_2xAcc
8.81s call     deepseek_v32/test_tilelang_example_deepseek_v32.py::test_example_sparse_mla_fwd
7.91s call     flash_attention/test_example_flash_attention.py::test_example_mha_fwd_varlen
7.90s call     flash_decoding/test_example_flash_decoding.py::test_example_example_mha_inference
7.87s call     blocksparse_attention/test_example_blocksparse_attention.py::test_example_tilelang_sparse_gqa_decode_varlen_mask
7.60s call     deepseek_nsa/test_example_tilelang_nsa.py::test_example_tilelang_nsa_fwd
7.55s call     attention_sink/test_example_attention_sink.py::test_example_mha_sink_fwd_bhsd_sliding_window
7.37s call     dequantize_gemm/test_example_dequantize_gemm.py::test_example_dequant_gemm_bf16_mxfp4_hopper_tma
7.35s call     dequantize_gemm/test_example_dequantize_gemm.py::test_example_dequant_gemm_w4a8
7.29s call     attention_sink/test_example_attention_sink.py::test_example_mha_sink_fwd_bhsd_full_attn
7.19s call     gdn/test_example_gdn_compilation.py::test_example_chunk_delta_h_compilation
7.16s call     attention_sink/test_example_attention_sink.py::test_example_mha_sink_fwd_bhsd_wgmma_pipelined_sliding_window
6.93s call     gdn/test_example_gdn_compilation.py::test_example_chunk_delta_bwd_compilation
6.87s call     attention_sink/test_example_attention_sink.py::test_example_gqa_sink_fwd_bhsd_wgmma_pipelined_sliding_window
6.78s call     blocksparse_attention/test_example_blocksparse_attention.py::test_example_tilelang_sparse_gqa_decode_varlen_indice
6.73s call     deepseek_mla/test_example_mla_decode.py::test_example_mla_decode
6.70s call     attention_sink/test_example_attention_sink.py::test_example_gqa_sink_fwd_bhsd_wgmma_pipelined_full_attn
6.65s call     attention_sink/test_example_attention_sink.py::test_example_mha_sink_fwd_bhsd_wgmma_pipelined_full_attn
6.48s call     gdn/test_example_gdn_compilation.py::test_example_chunk_o_compilation
6.39s call     blocksparse_attention/test_example_blocksparse_attention.py::test_example_tilelang_block_sparse_attn
6.32s call     flash_attention/test_example_flash_attention.py::test_example_gqa_fwd_bshd_wgmma_pipelined
6.14s call     dequantize_gemm/test_example_dequantize_gemm.py::test_example_dequant_groupedgemm_bf16_mxfp4_hopper
6.07s call     deepseek_v32/test_tilelang_example_deepseek_v32.py::test_example_topk_selector
5.93s call     flash_attention/test_example_flash_attention.py::test_example_mha_fwd_bhsd_wgmma_pipelined
5.78s call     flash_attention/test_example_flash_attention.py::test_example_mha_fwd_bshd
5.64s call     gdn/test_example_gdn_compilation.py::test_example_wy_fast_compilation
5.62s call     flash_attention/test_example_flash_attention.py::test_example_mha_fwd_bshd_wgmma_pipelined
5.54s call     flash_attention/test_example_flash_attention.py::test_example_gqa_fwd_bshd
5.47s call     cast/test_example_cast.py::test_example_group_per_split_token_cast_to_fp8
5.44s call     flash_attention/test_example_flash_attention.py::test_example_mha_fwd_bhsd
5.32s call     deepseek_nsa/test_example_tilelang_nsa.py::test_example_tilelang_nsa_fwd_decode
5.26s call     dequantize_gemm/test_example_dequantize_gemm.py::test_example_dequant_gemm_bf16_mxfp4_hopper
5.08s call     warp_specialize/test_example_warp_specialize.py::test_example_warp_specialize_gemm_copy_1_gemm_0
5.07s call     gemm_splitk/test_example_gemm_splitk.py::test_example_tilelang_gemm_splitk
5.06s call     deepseek_deepgemm/test_example_deepgemm_fp8_2xAcc.py::test_deepgemm_fp8_2xAcc
5.02s call     gdn/test_example_gdn_compilation.py::test_example_chunk_scaled_dot_kkt_compilation
5.01s call     gemm/test_example_gemm.py::test_example_gemm
4.98s call     warp_specialize/test_example_warp_specialize.py::test_example_warp_specialize_gemm_barrierpipe_stage2
4.82s call     gemm_splitk/test_example_gemm_splitk.py::test_example_tilelang_gemm_splitk_vectorize_atomicadd
4.72s call     cast/test_example_cast.py::test_example_per_token_cast_to_fp8
4.71s call     warp_specialize/test_example_warp_specialize.py::test_example_warp_specialize_gemm_copy_0_gemm_1
4.65s call     warp_specialize/test_example_warp_specialize.py::test_example_warp_specialize_gemm_softpipe_stage2
4.62s call     blocksparse_gemm/test_example_blocksparse_gemm.py::test_example_blocksparse_gemm
4.61s call     topk/test_topk_tilelang.py::test_topk_tilelang
4.30s call     blocksparse_attention/test_example_blocksparse_attention.py::test_example_triton_sparse_gqa_decode_varlen_mask
4.23s call     gemm/test_example_gemm.py::test_example_gemm_intrinsics
4.14s call     elementwise/test_example_elementwise.py::test_example_elementwise_add
4.14s call     norm/test_rms_norm.py::test_rms_norm
4.02s call     blocksparse_attention/test_example_blocksparse_attention.py::test_example_triton_sparse_gqa_decode_varlen_indice
4.01s call     dequantize_gemm/test_example_dequantize_gemm.py::test_example_dequant_gemv_fp16xint4
3.72s call     gdn/test_example_gdn_compilation.py::test_example_cumsum_compilation
0.61s call     blocksparse_attention/test_example_blocksparse_attention.py::test_block_sparse_attn_triton
0.03s call     analyze/test_example_analyze.py::test_example_conv_analyze
0.03s call     analyze/test_example_analyze.py::test_example_gemm_analyze
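To decide where shape reductions would pay off most, it can help to aggregate the slowest-durations report above per test file. The following is a minimal sketch, not part of the CI setup: it assumes the report lines have the `<seconds>s <phase> <node id>` layout shown above, and the helper names `parse_durations` and `total_by_file` are hypothetical. Only a few lines of the report are embedded as sample input.

```python
import re
from collections import defaultdict

# A few lines copied from the "slowest durations" report above.
REPORT = """\
149.49s call     gemm_sp/test_example_gemm_sp.py::test_example_gemm_sp
105.31s call     sparse_tensorcore/test_example_sparse_tensorcore.py::test_tilelang_example_sparse_tensorcore
78.97s call     dequantize_gemm/test_example_dequantize_gemm.py::test_example_dequant_gemm_fp4_hopper
61.00s call     gemm/test_example_gemm.py::test_example_gemm_schedule
11.10s call     gemm_sp/test_example_gemm_sp.py::test_example_custom_compress
"""

# Assumed line layout: "<seconds>s <phase> <node id>".
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)s\s+(setup|call|teardown)\s+(\S+)$")


def parse_durations(text):
    """Return (seconds, phase, node_id) tuples for every matching report line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rows.append((float(match.group(1)), match.group(2), match.group(3)))
    return rows


def total_by_file(rows):
    """Sum durations per test file and sort from slowest to fastest."""
    totals = defaultdict(float)
    for seconds, _phase, node_id in rows:
        file_path = node_id.split("::", 1)[0]
        totals[file_path] += seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


rows = parse_durations(REPORT)
ranking = total_by_file(rows)
for path, seconds in ranking:
    print(f"{seconds:8.2f}s  {path}")
```

Feeding the full report into `parse_durations` instead of the embedded sample would give a per-file ranking for the whole run.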

CI test time for the testing/python suite:

================================================================ fixture duration top ================================================================
total          name                                                                                        num  med            min                    
0:00:00.006439 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr                   208 0:00:00.000028          0:00:00.000020
0:00:00.005086 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr                   182 0:00:00.000025          0:00:00.000019
0:00:00.038944 grand total                                                                                 1280 0:00:00.000029          0:00:00.000018
=============================================================== test call duration top ===============================================================
total          name                                                                                        num  med            min                    
0:02:18.910773 testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast       16 0:00:08.538184          0:00:08.029851
0:01:46.428779 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90            12 0:00:09.515094          0:00:00.018699
0:01:29.868598 testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath           21 0:00:04.321056          0:00:03.693514
0:01:27.243296 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr                    16 0:00:05.474963          0:00:04.412709
0:01:15.992348 testing/python/debug/test_tilelang_debug_print.py::test_debug_print_buffer                     1 0:01:15.992348          0:01:15.992348
0:01:14.331675 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr              12 0:00:06.135772          0:00:05.781362
0:01:13.141573 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr                    14 0:00:05.253100          0:00:04.204602
0:01:06.130100 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs              10 0:00:06.671352          0:00:05.074628
0:01:04.966100 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss              11 0:00:05.631887          0:00:04.633975
0:01:02.123817 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr              10 0:00:06.325301          0:00:04.823512
0:00:37.038636 testing/python/jit/test_tilelang_jit_gemm_cython.py::test_cython_dynamic_shape                 1 0:00:37.038636          0:00:37.038636
0:00:35.672505 testing/python/language/test_tilelang_language_reduce.py::test_reduce_sum_clear                1 0:00:35.672505          0:00:35.672505
0:00:32.988014 testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions                       8 0:00:04.070887          0:00:03.906435
0:00:32.091503 testing/python/language/test_tilelang_language_lazy_jit.py::test_jit2_return                   1 0:00:32.091503          0:00:32.091503
0:00:28.219543 testing/python/math/test_math_bitwise_reduce.py::test_bitwise_reduce_ops                       1 0:00:28.219543          0:00:28.219543
0:00:26.006352 testing/python/autotune/test_tilelang_autotune.py::test_autotune_matmul                        1 0:00:26.006352          0:00:26.006352
0:00:25.474265 testing/python/language/test_tilelang_language_vectorize.py::test_vectorize_invariant_index    1 0:00:25.474265          0:00:25.474265
0:00:24.857759 testing/python/language/test_tilelang_language_lazy_jit.py::test_jit2_many_annot               1 0:00:24.857759          0:00:24.857759
0:00:24.716346 testing/python/language/test_tilelang_language_rand.py::test_rand_1d                           3 0:00:08.257283          0:00:08.135095
0:00:21.285440 testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_region_1d                1 0:00:21.285440          0:00:21.285440
0:00:18.875960 testing/python/language/test_tilelang_language_pipeline.py::test_pipeline_order_stage          1 0:00:18.875960          0:00:18.875960
0:00:18.711309 testing/python/fastmath/test_mathops_fastmath.py::test_two_arg_mathops_fastmath                2 0:00:09.355655          0:00:08.956302
0:00:17.558687 testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_smem                     1 0:00:17.558687          0:00:17.558687
0:00:17.340705 testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_region_2d                1 0:00:17.340705          0:00:17.340705
0:00:17.252213 testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_fragment                 1 0:00:17.252213          0:00:17.252213
0:00:17.227597 testing/python/math/test_math_ieee_math.py::test_ieee_sub_all_rounding_modes                   1 0:00:17.227597          0:00:17.227597
0:00:17.205029 testing/python/math/test_math_ieee_math.py::test_ieee_mul_all_rounding_modes                   1 0:00:17.205029          0:00:17.205029
0:00:17.176580 testing/python/language/test_tilelang_language_vectorize.py::test_vectorize                    1 0:00:17.176580          0:00:17.176580
0:00:17.060915 testing/python/math/test_math_ieee_math.py::test_ieee_fdiv_all_rounding_modes                  1 0:00:17.060915          0:00:17.060915
0:00:17.003212 testing/python/language/test_tilelang_language_view.py::test_reshape_view                      1 0:00:17.003212          0:00:17.003212
0:41:22.062133 grand total                                                                                  527 0:00:03.975783          0:00:00.000128
============================================================== test setup duration top ===============================================================
total          name                                                                                        num  med            min                    
0:00:00.016333 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr                    16 0:00:00.001036          0:00:00.000817
0:00:00.013137 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr                    14 0:00:00.000928          0:00:00.000803
0:00:00.012077 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss              11 0:00:00.000874          0:00:00.000822
0:00:00.010821 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr              12 0:00:00.000838          0:00:00.000811
0:00:00.010257 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs              10 0:00:00.001038          0:00:00.000864
0:00:00.010016 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90            12 0:00:00.000828          0:00:00.000763
0:00:00.008982 testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr              10 0:00:00.000879          0:00:00.000818
0:00:00.008018 testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath           21 0:00:00.000345          0:00:00.000323
0:00:00.007218 testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast       16 0:00:00.000460          0:00:00.000379
0:00:00.135999 grand total                                                                                  613 0:00:00.000092 -1 day, 23:59:59.999441
============================================================= test teardown duration top =============================================================
total          name                                                                                        num  med            min                    
0:00:00.074007 grand total                                                                                  613 0:00:00.000099          0:00:00.000053
======================================== slowest durations =========================================
75.99s call     testing/python/debug/test_tilelang_debug_print.py::test_debug_print_buffer
37.04s call     testing/python/jit/test_tilelang_jit_gemm_cython.py::test_cython_dynamic_shape
35.67s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_sum_clear
32.09s call     testing/python/language/test_tilelang_language_lazy_jit.py::test_jit2_return
28.22s call     testing/python/math/test_math_bitwise_reduce.py::test_bitwise_reduce_ops
26.01s call     testing/python/autotune/test_tilelang_autotune.py::test_autotune_matmul
25.47s call     testing/python/language/test_tilelang_language_vectorize.py::test_vectorize_invariant_index
24.86s call     testing/python/language/test_tilelang_language_lazy_jit.py::test_jit2_many_annot
21.29s call     testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_region_1d
18.88s call     testing/python/language/test_tilelang_language_pipeline.py::test_pipeline_order_stage
17.56s call     testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_smem
17.34s call     testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_region_2d
17.25s call     testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_fragment
17.23s call     testing/python/math/test_math_ieee_math.py::test_ieee_sub_all_rounding_modes
17.21s call     testing/python/math/test_math_ieee_math.py::test_ieee_mul_all_rounding_modes
17.18s call     testing/python/language/test_tilelang_language_vectorize.py::test_vectorize
17.06s call     testing/python/math/test_math_ieee_math.py::test_ieee_fdiv_all_rounding_modes
17.00s call     testing/python/language/test_tilelang_language_view.py::test_reshape_view
16.61s call     testing/python/math/test_math_ieee_math.py::test_ieee_frcp_all_rounding_modes
16.58s call     testing/python/math/test_math_ieee_math.py::test_ieee_add_all_rounding_modes
16.56s call     testing/python/language/test_tilelang_language_clamp.py::test_clamp
16.51s call     testing/python/math/test_math_ieee_math.py::test_ieee_fmaf_all_rounding_modes
16.41s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_sum
16.20s call     testing/python/jit/test_tilelang_jit_tvm_ffi.py::test_tvm_ffi_dynamic_shape
16.18s call     testing/python/math/test_math_ieee_math.py::test_ieee_fsqrt_all_rounding_modes
16.17s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_max
16.13s call     testing/python/language/test_tilelang_language_infinity.py::test_infinity
15.18s call     testing/python/language/test_tilelang_language_pipeline.py::test_blocksparse_matmul
13.90s call     testing/python/kernel/test_tilelang_kernel_gemm_simt.py::test_assert_tl_matmul
13.70s call     testing/python/kernel/test_tilelang_kernel_gemm_mma_intrinsic.py::test_assert_tl_matmul
13.67s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_max_clear
13.05s call     testing/python/language/test_tilelang_language_ptr.py::test_matmul
13.03s call     testing/python/jit/test_tilelang_jit_gemm_cython.py::test_cython_dynamic_shape_with_out_idx
12.82s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_different_memory_orders
12.73s call     testing/python/jit/test_tilelang_jit_gemm_cython.py::test_cython_kernel_multi_stream
12.57s call     testing/python/language/test_tilelang_language_copy.py::test_tilelang_copy
12.37s call     testing/python/analysis/test_tilelang_fragment_loop_checker.py::test_valid_loop
12.25s call     testing/python/analysis/test_tilelang_nested_loop_checker.py::test_mixed_sp
12.24s call     testing/python/language/test_tilelang_language_composable_index.py::test_tilelang_copy
12.19s call     testing/python/jit/test_tilelang_jit_gemm_cython.py::test_cython_kernel_do_bench
12.10s call     testing/python/jit/test_tilelang_jit_gemm_cython.py::test_gemm_f16f16f16_nn
12.05s call     testing/python/jit/test_tilelang_jit_gemm_cython.py::test_matmul_int_variable
11.95s call     testing/python/language/test_tilelang_language_ceildiv.py::test_ceildiv
11.62s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-128-128-128-0-128-False-False]
11.56s call     testing/python/jit/test_tilelang_jit_gemm_cython.py::test_matmul_float_variable
10.88s call     testing/python/jit/test_tilelang_jit_nvrtc.py::test_nvrtc_dynamic_shape
10.41s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-128-128-128-2-128-False-False]
10.33s call     testing/python/components/test_tilelang_pass_config_disable_warp_specialized.py::test_gemm_f16f16f16_nn
10.33s call     testing/python/kernel/test_tilelang_kernel_int4_gemm_mma.py::test_assert_tl_matmul_correctness
10.23s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-64-128-256-2-128-False-False]
10.01s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-64-128-256-0-128-False-False]
9.93s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float8_e4m3fn-float32-__tl_cvt_fp8x2_to_float2-2]
9.76s call     testing/python/fastmath/test_mathops_fastmath.py::test_two_arg_mathops_fastmath[pow-pow]
9.74s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-64-64-32-2-128-False-False]
9.64s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float8_e5m2-float32-__tl_cvt_fp8x2_to_float2-4]
9.58s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-64-64-64-2-128-False-False]
9.46s call     testing/python/kernel/test_tilelang_kernel_fp8_gemm_mma.py::test_assert_tl_matmul
9.45s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-64-64-32-0-256-False-False]
9.40s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float8_e5m2-float32-__tl_cvt_fp8x2_to_float2-2]
9.33s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-64-64-64-0-128-False-False0]
9.30s call     testing/python/kernel/test_tilelang_kernel_fp8_gemm.py::test_assert_matmul
9.27s call     testing/python/language/test_tilelang_language_reshape.py::test_reshape_smem_2d_2_1d
9.21s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float16-float32-__half22float2-4]
9.15s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float16-float32-float32-64-64-64-0-128-False-True]
9.06s call     testing/python/language/test_tilelang_language_copy.py::test_tilelang_copy_with_stride
8.98s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float32-float8_e5m2-__nv_cvt_float2_to_fp8x2-4]
8.97s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float32-float16-__float22half2_rn-4]
8.96s call     testing/python/fastmath/test_mathops_fastmath.py::test_two_arg_mathops_fastmath[fmod-fmod]
8.93s call     testing/python/language/test_tilelang_language_reshape.py::test_reshape_smem
8.84s call     testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_fragment_1d
8.81s call     testing/python/analysis/test_tilelang_nested_loop_checker.py::test_tiled_op_with_parallel
8.69s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float32-float8_e4m3fn-__nv_cvt_float2_to_fp8x2-2]
8.69s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-int8-int32-int32-64-64-64-2-128-False-True]
8.67s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[128-128-128-True-True-float8_e5m2-float8_e5m2-float32-128-128-64-2-128]
8.66s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float32-bfloat16-__float22bfloat162_rn-4]
8.66s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[128-128-128-True-True-float8_e5m2-float8_e5m2-float32-128-128-64-2-128]
8.55s call     testing/python/language/test_tilelang_language_reshape.py::test_reshape_fragment
8.51s call     testing/python/language/test_tilelang_language_reshape.py::test_reduce_after_reshape
8.50s call     testing/python/analysis/test_tilelang_nested_loop_checker.py::test_nested_parallels
8.50s call     testing/python/analysis/test_tilelang_nested_loop_checker.py::test_nested_serials
8.41s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[bfloat16-float32-__bfloat1622float2-4]
8.41s call     testing/python/language/test_tilelang_language_reshape.py::test_reshape_layout_transform_shared
8.36s call     testing/python/jit/test_tilelang_jit_parcompile.py::test_par_compile
8.32s call     testing/python/language/test_tilelang_language_rand.py::test_rand_1d[128-0]
8.32s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float32-bfloat16-__float22bfloat162_rn-2]
8.27s call     testing/python/issue/test_tilelang_issue_830.py::test_empty_kernel_with_binding_variants
8.27s call     testing/python/language/test_tilelang_language_reshape.py::test_reshape_smem_1d_2_2d
8.26s call     testing/python/language/test_tilelang_language_rand.py::test_rand_1d[512-123]
8.23s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float32-float8_e5m2-__nv_cvt_float2_to_fp8x2-2]
8.21s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[bfloat16-float32-__bfloat1622float2-2]
8.20s call     testing/python/kernel/test_tilelang_kernel_gemv_simt.py::test_gemv_simt
8.20s call     testing/python/components/test_storage_rewrite_detect_inplace.py::test_storage_rewrite_detect_inplace_toggle
8.19s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py::test_gemm_sp_sm90[512-1024-768-float8_e4m3fn-float16-float16-64-64-64-2-128-False-True]
8.14s call     testing/python/language/test_tilelang_language_rand.py::test_rand_1d[1024-42]
8.10s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float32-float16-__float22half2_rn-2]
8.09s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float8_e4m3fn-float32-__tl_cvt_fp8x2_to_float2-4]
8.09s call     testing/python/language/test_tilelang_language_frontend_v2.py::test_serial_for_with_step
8.04s call     testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_smem_1d
8.04s call     testing/python/kernel/test_tilelang_kernel_fp8_gemv_simt.py::test_gemv_simt
8.03s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float16-float32-__half22float2-2]
8.03s call     testing/python/language/test_tilelang_language_vectorized_cast.py::test_vectorized_cast[float32-float8_e4m3fn-__nv_cvt_float2_to_fp8x2-4]
7.87s call     testing/python/issue/test_tilelang_issue_1115.py::test_int64_address
7.54s call     testing/python/kernel/test_tilelang_kernel_gemv_simt.py::test_gemv_simt_fp8
7.53s call     testing/python/language/test_tilelang_language_frontend_v2.py::test_swap_logic
7.09s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[512-1024-768-True-False-float16-float16-float32-128-256-32-2-128]
7.05s call     testing/python/cpu/test_tilelang_cpu_gemm.py::test_matmul_compile
6.95s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[128-128-128-True-True-float8_e5m2-float8_e5m2-float32-128-128-64-2-128]
6.87s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[128-128-128-False-True-int8-int8-int32-128-128-64-2-128]
6.78s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[512-1024-768-False-True-float16-float16-float32-128-256-32-2-128]
6.77s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[128-128-128-True-True-float8_e5m2-float8_e5m2-float32-128-128-64-2-128]
6.76s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[128-128-128-False-True-float8_e5m2-float8_e5m2-float32-128-128-64-2-128]
6.76s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[128-128-128-False-False-int8-int8-int32-128-128-64-2-128]
6.62s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[128-128-128-False-False-int8-int8-int32-128-128-128-2-128]
6.58s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[128-128-128-True-True-int8-int8-int32-128-128-64-2-128]
6.55s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[512-1024-768-True-False-float16-float16-float32-128-256-32-2-128]
6.55s call     testing/python/language/test_tilelang_language_atomic_add.py::test_tile_atomic_add
6.52s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[512-1024-768-False-True-float16-float16-float32-128-256-32-2-128]
6.50s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[128-128-128-True-False-int8-int8-int32-128-128-64-2-128]
6.49s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_add
6.45s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[512-1024-768-True-True-float16-float16-float32-128-256-32-2-128]
6.42s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[128-128-128-False-False-int8-int8-int32-128-128-64-2-128]
6.33s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[128-128-128-True-False-int8-int8-int32-128-128-64-2-128]
6.30s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[512-1024-768-True-True-float16-float16-float32-128-256-32-2-128]
6.29s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[512-1024-768-False-True-bfloat16-bfloat16-float32-128-256-32-2-128]
6.20s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[512-1024-768-False-False-float16-float16-float32-128-256-32-2-128]
6.18s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[128-128-128-True-True-int8-int8-int32-128-128-64-2-128]
6.16s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-128-True-True-float32-float32-float32-128-128-32-2-128]
6.14s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[128-128-128-False-True-int8-int8-int32-128-128-64-2-128]
6.13s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[512-1024-768-True-False-float16-float16-float32-128-256-32-2-128]
6.12s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-128-False-False-float32-float32-float32-128-128-32-2-128]
6.12s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[128-8-128-False-True-int8-int8-int32-128-8-64-2-128]
6.11s call     testing/python/autotune/test_tilelang_autotune_with_inputs.py::test_autotune_matmul_symbolic_m
6.09s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[512-1024-768-False-False-float16-float16-float32-128-256-32-2-128]
6.07s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[512-1024-768-False-True-float16-float16-float32-128-256-32-2-128]
6.03s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-False-True-float32-float32-float32-128-128-32-2-128]
6.00s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[128-8-128-False-True-float16-float16-float32-128-8-32-2-128]
5.97s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[512-1024-768-True-True-float16-float16-float16-128-256-32-2-128]
5.94s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[512-1024-768-True-True-float16-float16-float32-128-256-32-2-128]
5.94s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[512-1024-768-False-True-bfloat16-bfloat16-float32-128-256-32-2-128]
5.93s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[128-128-128-True-False-int8-int8-int32-128-128-64-2-128]
5.91s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[128-128-128-True-False-int8-int8-int32-128-128-64-2-128]
5.90s call     testing/python/profiler/test_tilelang_profiler.py::test_profiler
5.88s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[512-1024-768-False-False-float16-float16-float32-128-256-32-2-128]
5.88s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_pad_f16f16f16_nn
5.86s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[512-1024-768-False-True-float16-float16-float16-128-256-32-2-128]
5.82s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[512-1024-768-False-True-float16-float16-float16-128-256-32-2-128]
5.80s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[128-128-128-False-True-int8-int32-int32-128-128-64-2-128]
5.78s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[128-128-128-False-True-int8-int8-int32-128-128-128-2-128]
5.78s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rr[128-128-128-True-True-int8-int8-int32-128-128-64-2-128]
5.71s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-False-True-int8-int8-int32-128-128-32-2-128]
5.69s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[128-128-128-False-False-int8-int8-int32-128-128-64-2-128]
5.67s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-False-False-float32-float32-float32-128-128-32-2-128]
5.66s call     testing/python/transform/test_tilelang_transform_config_index_bitwidth.py::test_sta_attention
5.63s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[128-128-128-True-True-int8-int8-int32-128-128-64-2-128]
5.62s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-128-True-True-float8_e5m2-float8_e5m2-float32-128-128-32-2-128]
5.62s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[512-1024-768-True-False-float16-float16-float32-128-128-32-2-128]
5.60s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[512-1024-768-True-True-float16-float16-float32-128-128-32-2-128]
5.50s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-False-False-int8-int8-int32-128-128-32-2-128]
5.50s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_pad_aligned_f16f16f16_nn
5.50s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-True-False-int8-int8-int32-128-128-32-2-128]
5.47s call     testing/python/language/test_tilelang_language_parallel.py::test_parallel_dynamic_extent
5.46s call     testing/python/language/test_tilelang_language_any_of.py::test_block_sparse_matmul_local
5.45s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[512-1024-768-False-False-float16-float16-float32-128-128-32-2-128]
5.45s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-True-True-float8_e5m2-float8_e5m2-float32-128-128-32-2-128]
5.41s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-True-False-float32-float32-float32-128-128-32-2-128]
5.40s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[512-1024-768-True-True-float16-float16-float16-128-256-32-2-128]
5.39s call     testing/python/analysis/test_tilelang_nested_loop_checker.py::test_mixed_pp
5.38s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[512-1024-768-True-False-float16-float16-float16-128-256-32-2-128]
5.35s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-128-False-True-float32-float32-float32-128-128-32-2-128]
5.33s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-True-True-float32-float32-float32-128-128-32-2-128]
5.31s call     testing/python/language/test_tilelang_language_all_of.py::test_block_sparse_matmul_global
5.31s call     testing/python/analysis/test_tilelang_nested_loop_checker.py::test_nested_pipelines
5.30s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-128-True-False-float32-float32-float32-128-128-32-2-128]
5.28s call     testing/python/language/test_tilelang_language_any_of.py::test_block_sparse_matmul_shared
5.25s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[512-1024-768-False-False-float16-float16-float16-128-256-32-2-128]
5.21s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-32-True-True-int8-int8-int32-128-128-32-2-128]
5.19s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[512-1024-768-False-False-float16-float16-float16-128-256-32-2-128]
5.19s call     testing/python/language/test_tilelang_language_all_of.py::test_block_sparse_matmul_local
5.19s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[512-1024-768-False-True-float16-float16-float32-128-128-32-2-128]
5.13s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[512-1024-768-True-False-float16-float16-float16-128-256-32-2-128]
5.11s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-128-128-True-True-int8-int8-int32-128-128-32-2-128]
5.11s call     testing/python/jit/test_tilelang_jit_callback.py::test_gemm_jit_kernel
5.08s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_f16f16f32_nn
5.07s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_rs[128-8-64-False-True-float16-float16-float32-128-8-32-0-128]
5.03s call     testing/python/language/test_tilelang_language_all_of.py::test_block_sparse_matmul_shared
4.99s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_f16f16f16_nn
4.98s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_max
4.96s call     testing/python/language/test_tilelang_language_alias.py::test_matmul
4.95s call     testing/python/jit/test_tilelang_jit_tvm_ffi.py::test_tvm_ffi_kernel_do_bench
4.93s call     testing/python/issue/test_tilelang_issue_96.py::test_pipeline_small_matrix
4.92s call     testing/python/language/test_tilelang_language_warp_reduce.py::test_warp_reduce_bitand
4.92s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_max_shared
4.90s call     testing/python/language/test_tilelang_language_let_layout.py::test_blocksparse_copy_cp_async
4.87s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_f16f16f16_tn
4.83s call     testing/python/transform/test_tilelang_transform_simplify.py::test_matmul
4.82s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_sr[128-8-64-False-True-float16-float16-float32-128-8-32-0-128]
4.81s call     testing/python/language/test_tilelang_language_lazy_jit.py::test_jit2_gemm_annot
4.77s call     testing/python/debug/test_tilelang_debug_print.py::test_debug_print_buffer_conditional
4.76s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-8-128-False-True-float16-float16-float16-128-8-32-2-128]
4.76s call     testing/python/jit/test_tilelang_jit_gemm.py::test_gemm_f16f16f16_nn_kernel_jit
4.75s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_memory_order
4.74s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[tan-tan]
4.71s call     testing/python/fastmath/test_mathops_fastmath.py::test_abs_maps_to_fabs
4.71s call     testing/python/jit/test_tilelang_jit_nullptr.py::test_nullptr
4.70s call     testing/python/issue/test_tilelang_issue_96.py::test_pipeline_large_matrix
4.69s call     testing/python/language/test_tilelang_language_any_of.py::test_block_sparse_matmul_global
4.67s call     testing/python/language/test_tilelang_language_annotate_safe_value.py::test_tilelang_copy
4.67s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-32-True-False-int8-int8-int32-128-128-32-2-128]
4.66s call     testing/python/issue/test_tilelang_issue_1008.py::test_fill_with_dynamic_region_kernel
4.64s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[sin-sin]
4.64s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_f32f32f32_nt
4.64s call     testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions[__exp-__exp]
4.63s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp_v2.py::test_gemm_ss[128-8-64-False-True-float16-float16-float32-128-8-32-0-128]
4.63s call     testing/python/language/test_tilelang_language_lazy_jit.py::test_jit2_gemm_ptr
4.63s call     testing/python/layout/test_tilelang_layout_fused_replicate.py::test_layout_infer_compiles_and_runs
4.61s call     testing/python/language/test_tilelang_language_clear.py::test_matmul
4.61s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_shuffle_elect_block_leader
4.59s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_pad_f16f16f32_nn
4.58s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[sinh-sinh]
4.57s call     testing/python/language/test_tilelang_language_let_layout.py::test_blocksparse_copy_tma
4.57s call     testing/python/language/test_tilelang_language_parallel.py::test_parallel_static_extent
4.56s call     testing/python/issue/test_tilelang_issue_830.py::test_empty_kernel_lowering
4.56s call     testing/python/language/test_tilelang_language_if_range.py::test_tilelang_if_range
4.55s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_i8i8i32_tn
4.55s call     testing/python/debug/test_tilelang_debug_print.py::test_debug_print_value_conditional
4.55s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_bf16bf16f32_nn
4.55s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_f32f32f32_nn
4.54s call     testing/python/language/test_tilelang_language_alloc.py::test_alloc_var
4.54s call     testing/python/kernel/test_tilelang_kernel_gemm_mma_intrinsic.py::test_assert_tl_matmul_bfloat16
4.54s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_min
4.53s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[log2-log2]
4.53s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[exp2-exp2]
4.51s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[sqrt-sqrt]
4.51s call     testing/python/language/test_tilelang_language_annot.py::test_tensor_annot_mul_add
4.50s call     testing/python/language/test_tilelang_language_mask_op.py::test_tilelang_copy_mask_copy_range
4.49s call     testing/python/language/test_tilelang_language_warp_reduce.py::test_warp_reduce_min
4.49s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[ceil-ceil]
4.48s call     testing/python/debug/test_tilelang_debug_print.py::test_debug_print_msg
4.48s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-32-False-False-int8-int8-int32-128-128-32-2-128]
4.47s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[tanh-tanh]
4.47s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_absmax_shared
4.47s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_f16f16f16_nt
4.47s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_f32f32f32_tn
4.45s call     testing/python/language/test_tilelang_language_warp_reduce.py::test_warp_reduce_bitor
4.45s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-128-32-False-True-int8-int8-int32-128-128-32-2-128]
4.44s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[nearbyint-nearbyint]
4.43s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[erf-erf]
4.43s call     testing/python/jit/test_tilelang_jit_tvm_ffi.py::test_tvm_ffi_im2col_tma_desc
4.42s call     testing/python/debug/test_tilelang_debug_print.py::test_debug_print_register_files
4.42s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_min_shared
4.41s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_rr[128-8-128-False-True-int8-int8-int32-128-8-32-2-128]
4.41s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_get_warp_group_idx_custom
4.39s call     testing/python/kernel/test_tilelang_kernel_gemm_with_stride.py::test_tilelang_kernel_gemm_with_stride
4.39s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_i8i8i32_nt
4.34s call     testing/python/language/test_tilelang_language_ternary.py::test_tilelang_ternary
4.32s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[atan-atan]
4.30s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[log10-log10]
4.29s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[log-log]
4.29s call     testing/python/kernel/test_tilelang_kernel_gemm.py::test_gemm_f64f64f64_nt
4.28s call     testing/python/kernel/test_tilelang_kernel_element_wise_add.py::test_elementwise_add_i32
4.27s call     testing/python/issue/test_tilelang_issue_830.py::test_empty_with_dead_code_kernel
4.27s call     testing/python/kernel/test_tilelang_kernel_element_wise_add.py::test_elementwise_add_f32f16
4.25s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[cos-cos]
4.24s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_sum_shared
4.24s call     testing/python/language/test_tilelang_language_unroll.py::test_unroll_with_step
4.24s call     testing/python/kernel/test_tilelang_kernel_element_wise_add.py::test_elementwise_add_f32
4.23s call     testing/python/language/test_tilelang_language_reduce.py::test_reduce_abssum_shared
4.23s call     testing/python/language/test_tilelang_language_warp_reduce.py::test_warp_reduce_max
4.22s call     testing/python/language/test_tilelang_language_int64.py::test_fill_symbolic
4.20s call     testing/python/tilelibrary/test_tilelang_tilelibrary_gemm.py::test_gemm_sr[128-8-32-False-True-float16-float16-float16-128-8-32-0-128]
4.20s call     testing/python/debug/test_device_assert.py::test_device_assert_no_trigger
4.20s call     testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions[__exp10-__exp10]
4.17s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_load_store
4.17s call     testing/python/language/test_tilelang_language_warp_reduce.py::test_warp_reduce_sum
4.16s call     testing/python/language/test_tilelang_language_copy.py::test_tilelang_copy_buffer_load_with_parallel
4.16s call     testing/python/language/test_tilelang_language_mask_op.py::test_tilelang_copy_mask_copy
4.14s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[rsqrt-rsqrt]
4.13s call     testing/python/issue/test_tilelang_issue_814.py::test_issue_814
4.12s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_addx4
4.12s call     testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions[__cos-__cos]
4.12s call     testing/python/language/test_tilelang_language_mask_op.py::test_tilelang_copy_mask_parallel
4.12s call     testing/python/jit/test_tilelang_jit_callback.py::test_gemm_f16f16f16_nn
4.12s call     testing/python/language/test_tilelang_language_annot.py::test_tensor_annot_mul
4.12s call     testing/python/issue/test_tilelang_issue_1008.py::test_fill_with_static_region_kernel
4.11s call     testing/python/kernel/test_tilelang_kernel_element_wise_add.py::test_elementwise_add_f16
4.08s call     testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions[__tan-__tan]
4.06s call     testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions[__log-__log]
4.05s call     testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions[__log2-__log2]
4.05s call     testing/python/language/test_tilelang_language_annot.py::test_tensor_annot_add
4.04s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_get_lane_idx_default
4.04s call     testing/python/issue/test_tilelang_issue_1001.py::test_cumsum_view_infer_layout
4.04s call     testing/python/runtime/test_tilelang_runtime_dynamic_shared_memory.py::test_dynamic_shared_memory_varies_across_calls
4.02s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_get_warp_group_idx_default
4.02s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_return_prev
4.01s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[cosh-cosh]
4.01s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[trunc-trunc]
3.99s call     testing/python/transform/test_nullable_buffer_params.py::test_nullable_shared_shape
3.98s call     testing/python/math/test_math_ieee_math.py::test_ieee_frsqrt_rn_only
3.98s call     testing/python/language/test_tilelang_language_mask_op.py::test_tilelang_copy_mask_parallel_range
3.98s call     testing/python/language/test_tilelang_language_ceildiv.py::test_ceildiv_dyn
3.97s call     testing/python/language/test_tilelang_language_atomic_add.py::test_atomic_addx2
3.93s call     testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions[__sin-__sin]
3.93s call     testing/python/jit/test_tilelang_jit_tvm_ffi.py::test_tvm_ffi_l2_persistent_map
3.93s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_get_lane_idx_custom
3.92s call     testing/python/language/test_tilelang_language_alloc.py::test_alloc_var_add
3.91s call     testing/python/fastmath/test_mathops_fastmath.py::test_fastmath_versions[__log10-__log10]
3.89s call     testing/python/language/test_tilelang_language_assume.py::test_assume_enable_vectorization
3.89s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[round-round]
3.86s call     testing/python/language/test_tilelang_capture.py::test_tilelang_capture
3.85s call     testing/python/language/test_tilelang_language_var_init.py::test_var_assign
3.85s call     testing/python/language/test_tilelang_language_chain_equal.py::test_chain_equal
3.85s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[exp10-exp10]
3.83s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_get_warp_idx_sync_default
3.82s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_shuffle_elect_default
3.81s call     testing/python/language/test_tilelang_language_unroll.py::test_unroll_with_unroll_factor
3.80s call     testing/python/language/test_tilelang_language_alloc.py::test_alloc_var_with_initializer
3.80s call     testing/python/language/test_tilelang_language_copy.py::test_tilelang_copy_bufferload
3.80s call     testing/python/language/test_tilelang_language_int64.py::test_fill_static
3.79s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_get_warp_idx_custom
3.78s call     testing/python/language/test_tilelang_language_frontend_v2.py::test_while_loop
3.77s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[exp-exp]
3.77s call     testing/python/language/test_tilelang_language_assume.py::test_assume_complex_indexing
3.76s call     testing/python/language/test_tilelang_language_frontend_v2.py::test_var_assign
3.76s call     testing/python/jit/test_tilelang_jit_nvrtc.py::test_gemm_f16f16f16_nn
3.76s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_get_warp_idx_default
3.76s call     testing/python/language/test_tilelang_language_get_warp_info.py::test_get_warp_idx_sync_custom
3.75s call     testing/python/issue/test_tilelang_issue_1210.py::test_make_packed_api_no_free_loop_var
3.69s call     testing/python/fastmath/test_mathops_fastmath.py::test_mathops_generate_no_fastmath[floor-floor]
3.67s call     testing/python/language/test_tilelang_language_alloc.py::test_alloc_multi_vars_with_initializer
3.67s call     testing/python/language/test_tilelang_language_assume.py::test_assume_remove_boundary_check
3.62s call     testing/python/language/test_tilelang_language_frontend_v2.py::test_frame_inside_macro
3.61s call     testing/python/language/test_tilelang_language_intrinsics_codegen.py::test_language_ldg_codegen
3.27s call     testing/python/jit/test_tilelang_jit_nvrtc.py::test_nvrtc_kernel_do_bench
3.04s call     testing/python/jit/test_tilelang_jit_nvrtc.py::test_nvrtc_im2col_tma_desc
2.75s call     testing/python/jit/test_tilelang_jit_nvrtc.py::test_nvrtc_l2_persistent_map
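
To find where the time actually goes, it may help to sum these per-test durations by test file, since many entries above are parametrized variants of the same test. The snippet below is a minimal sketch, not part of the repository: the function name `total_seconds_by_file` and the `SAMPLE` text are hypothetical, and it assumes each line follows the `<seconds>s <phase> <path>::<test id>` layout shown in the log.

```python
import re
from collections import defaultdict

# Matches lines such as "8.84s call     path/to/test_file.py::test_name[params]".
# Assumption: the report uses the "<seconds>s <phase> <nodeid>" layout seen above.
DURATION_LINE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)s\s+(call|setup|teardown)\s+(\S+?)::(.+)$"
)


def total_seconds_by_file(report_text):
    """Sum reported durations per test file, slowest file first."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = DURATION_LINE.match(line)
        if match is None:
            continue  # skip headers, blank lines, and anything unrecognized
        seconds, _phase, path, _test_id = match.groups()
        totals[path] += float(seconds)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Three lines copied from the log above, used only as sample input.
SAMPLE = """\
8.84s call     testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_fragment_1d
8.36s call     testing/python/jit/test_tilelang_jit_parcompile.py::test_par_compile
8.04s call     testing/python/language/test_tilelang_language_cumsum.py::test_cumsum_smem_1d
"""

if __name__ == "__main__":
    for path, seconds in total_seconds_by_file(SAMPLE):
        print(f"{seconds:8.2f}s  {path}")
```

Feeding the full log into `total_seconds_by_file` should show which files dominate (for example the parametrized `gemm_sp_v2` cases), which are the natural candidates for smaller shapes or fewer parameter combinations.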
