git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
opencl: add flattened q6_K mv (#19054)
author lhez <redacted>
Tue, 27 Jan 2026 03:36:24 +0000 (19:36 -0800)
committer GitHub <redacted>
Tue, 27 Jan 2026 03:36:24 +0000 (19:36 -0800)
commit 94eeb5967c129365f50ca8462a7595ea319430d9
tree 954979d2c16993ab984bcba069879d5bee3f14cf
parent b0311c16d2f650a8bd5af652549075b458bd713a

* opencl: flatten `q6_K` and add `kernel_mul_mv_q6_K_f32_flat`

* opencl: clean up

* opencl: refactor q6_K mv - put loop body in `block_q_6_K_dot_y_flat`

* opencl: tweak the workgroup size a bit

* opencl: output 4 values per subgroup for `kernel_mul_mv_q6_K_f32_flat`

* opencl: proper alignment for q6_K

* opencl: boundary handling for flattened q6_K mv

* opencl: rename q6_K mv kernel file

* opencl: put flattened q6_K mv in its own file

* opencl: use lower k in file name

* opencl: use K in variable names
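
The commit message does not spell out what "flattened" means for `q6_K`. As a rough illustration only, the sketch below assumes it refers to splitting the interleaved per-block fields into one contiguous array per field (array of structs to struct of arrays). The `block_q6_K` layout is modelled on ggml's definition; `q6_K_flat` and `flatten_q6_K` are hypothetical names for this example and are not taken from the commit, whose actual conversion lives in the OpenCL kernels.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr int QK_K = 256; // values per super-block, as in ggml

// Interleaved layout, modelled on ggml's block_q6_K.
struct block_q6_K {
    uint8_t  ql[QK_K / 2];      // lower 4 bits of each quant
    uint8_t  qh[QK_K / 4];      // upper 2 bits of each quant
    int8_t   scales[QK_K / 16]; // 8-bit sub-block scales
    uint16_t d;                 // fp16 super-block scale, kept as raw bits
};

// Hypothetical flattened form: one contiguous array per field.
struct q6_K_flat {
    std::vector<uint8_t>  ql;
    std::vector<uint8_t>  qh;
    std::vector<int8_t>   scales;
    std::vector<uint16_t> d;
};

// Copy each field of every block into its own contiguous array.
q6_K_flat flatten_q6_K(const std::vector<block_q6_K> & blocks) {
    const size_t n = blocks.size();
    q6_K_flat out;
    out.ql.resize(n * (QK_K / 2));
    out.qh.resize(n * (QK_K / 4));
    out.scales.resize(n * (QK_K / 16));
    out.d.resize(n);
    for (size_t i = 0; i < n; ++i) {
        std::memcpy(out.ql.data()     + i * (QK_K / 2),  blocks[i].ql,     QK_K / 2);
        std::memcpy(out.qh.data()     + i * (QK_K / 4),  blocks[i].qh,     QK_K / 4);
        std::memcpy(out.scales.data() + i * (QK_K / 16), blocks[i].scales, QK_K / 16);
        out.d[i] = blocks[i].d;
    }
    return out;
}
```

With a layout like this, each field of consecutive blocks sits in adjacent memory, which is the usual motivation for a struct-of-arrays layout on GPUs. Whether the commit's kernels use exactly these four arrays is an assumption here.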
ggml/src/ggml-opencl/CMakeLists.txt
ggml/src/ggml-opencl/ggml-opencl.cpp
ggml/src/ggml-opencl/kernels/cvt.cl
ggml/src/ggml-opencl/kernels/mul_mv_q6_k.cl [deleted file]
ggml/src/ggml-opencl/kernels/mul_mv_q6_k_f32.cl [new file with mode: 0644]
ggml/src/ggml-opencl/kernels/mul_mv_q6_k_f32_flat.cl [new file with mode: 0644]