git.djapps.eu Git - pkg/ggml/sources/ggml/commit
opencl: add q6_K gemm and gemv kernels for Adreno (llama/20089)
author lhez <redacted>
Mon, 23 Mar 2026 19:44:18 +0000 (12:44 -0700)
committer Georgi Gerganov <redacted>
Sat, 28 Mar 2026 11:39:09 +0000 (13:39 +0200)
commit 6b53cddbae903a38eb88ebd00406356dd32b5600
tree 6172511ee9f97595f1cc7b5290d984dc18a0156a
parent 35b156c996764bc9c98c851fa676541c29f7e8d1

* opencl: add q6_K noshuffle kernels, initial q6_K gemv, some host code

* opencl: add q6_K transpose

* opencl: fix cvt kernel name

* opencl: add call to q6_K gemv

* opencl: fix q6_K scale transpose

* opencl: fix loading for gemv q6_K, refactor

* opencl: fix transpose_8_buf kernel assignment, refactor

* opencl: refactor q6_K transpose

* opencl: add gemm_noshuffle_q6_k_f32

* opencl: fix qh loading

* opencl: refactor q6_K gemv host side, release bufs and imgs

* opencl: refactor

* opencl: fix q6_K dequant and scale selection

* opencl: workaround compiler bug, fix dump_tensor

* opencl: refactor q6_K convert kernels

* opencl: unpack transformed q6_K in get_tensor

* opencl: refactor, handle non-uniform workgroups

* opencl: support non-vector subgroup bcast
src/ggml-opencl/CMakeLists.txt
src/ggml-opencl/ggml-opencl.cpp
src/ggml-opencl/kernels/cvt.cl
src/ggml-opencl/kernels/gemm_noshuffle_q6_k_f32.cl [new file with mode: 0644]
src/ggml-opencl/kernels/gemv_noshuffle_q6_k_f32.cl [new file with mode: 0644]