Commit 5faa29c

hiworldwzj, wangzaijun, and WANDY666 authored
diverse mode fast gen decode kernel. (#1123)
Co-authored-by: wangzaijun <wangzaijun@sensetime.com>
Co-authored-by: WANDY666 <1060304770@qq.com>
1 parent fbd13bb commit 5faa29c

53 files changed: +1907 -47 lines changed

@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
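Each added file in this commit is a two-level JSON table whose leaves hold the kernel launch parameters `BLOCK_N`, `num_warps`, and `num_stages`. The diff does not say what the two levels of integer keys represent, nor how the project picks an entry, so the following is only a minimal sketch of one plausible lookup: it treats both key levels as opaque integers and selects the nearest tuned key. The helper names `nearest_key` and `lookup` are hypothetical and do not come from the commit; the table is an excerpt of the first added file.

```python
import json

# Excerpt of the first added config in this commit (values copied from the diff).
RAW = """
{"4096": {"8":   {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1},
          "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}},
 "8192": {"8":   {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1},
          "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
"""


def nearest_key(keys, value):
    """Return the string key whose integer value is closest to `value`."""
    return min(keys, key=lambda k: abs(int(k) - value))


def lookup(config, outer, inner):
    """Hypothetical lookup: pick the nearest tuned entry on both key levels.

    The meaning of `outer` and `inner` is an assumption; the diff only
    shows that both levels are keyed by integers stored as strings.
    """
    outer_key = nearest_key(config.keys(), outer)
    inner_key = nearest_key(config[outer_key].keys(), inner)
    return config[outer_key][inner_key]


config = json.loads(RAW)
params = lookup(config, 5000, 100)
print(params)
```

In a Triton launch, parameters like these are typically passed as keyword arguments, with `BLOCK_N` as a `tl.constexpr` kernel argument and `num_warps` / `num_stages` as launch options. Whether this commit's kernel does exactly that is not visible in the portion of the diff shown here.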
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 10}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 10}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 10}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}}}
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 10}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 10}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 10}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}}}
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}}
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}}
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}}
@@ -0,0 +1 @@
1+
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}}
