git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	lhez <redacted>
	Tue, 27 Jan 2026 03:36:24 +0000 (19:36 -0800)
committer	GitHub <redacted>
	Tue, 27 Jan 2026 03:36:24 +0000 (19:36 -0800)
commit	94eeb5967c129365f50ca8462a7595ea319430d9
tree	954979d2c16993ab984bcba069879d5bee3f14cf
parent	b0311c16d2f650a8bd5af652549075b458bd713a

opencl: add flattened q6_K mv (#19054)

* opencl: flatten `q6_K` and add `kernel_mul_mv_q6_K_f32_flat`

* opencl: clean up

* opencl: refactor q6_K mv - put loop body in `block_q_6_K_dot_y_flat`

* opencl: tweak the workgroup size a bit

* opencl: output 4 values per subgroup for `kernel_mul_mv_q6_K_f32_flat`

* opencl: proper alignment for q6_K

* opencl: boundary handling for flattened q6_K mv

* opencl: rename q6_K mv kernel file

* opencl: put flattened q6_K mv in its own file

* opencl: use lower k in file name

* opencl: use K in variable names
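
For readers unfamiliar with the term, "flattening" here refers to storing the fields of each `q6_K` block in separate contiguous arrays instead of interleaved per block. The following is a minimal host-side sketch in C of that idea, not the code from this commit (the contents of `cvt.cl` are not shown on this page). The `block_q6_K` layout follows the ggml definition; `q6_K_flat` and `flatten_q6_K` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QK_K 256 /* values per q6_K super-block, as in ggml */

/* One q6_K block as laid out in ggml (210 bytes). */
typedef struct {
    uint8_t  ql[QK_K / 2];      /* lower 4 bits of each 6-bit quant */
    uint8_t  qh[QK_K / 4];      /* upper 2 bits of each 6-bit quant */
    int8_t   scales[QK_K / 16]; /* one 8-bit scale per 16 values */
    uint16_t d;                 /* super-block scale, raw fp16 bits */
} block_q6_K;

/* Hypothetical flattened destination: one contiguous array per field.
 * Each pointer must have room for n_blocks times the per-block field size. */
typedef struct {
    uint8_t  *ql;
    uint8_t  *qh;
    int8_t   *scales;
    uint16_t *d;
} q6_K_flat;

/* Assumed helper: copy each field of block i into slot i of its own array. */
static void flatten_q6_K(const block_q6_K *src, size_t n_blocks, q6_K_flat dst) {
    for (size_t i = 0; i < n_blocks; ++i) {
        memcpy(dst.ql     + i * (QK_K / 2),  src[i].ql,     QK_K / 2);
        memcpy(dst.qh     + i * (QK_K / 4),  src[i].qh,     QK_K / 4);
        memcpy(dst.scales + i * (QK_K / 16), src[i].scales, QK_K / 16);
        dst.d[i] = src[i].d;
    }
}
```

With this layout, block `i` finds its `ql` bytes at offset `i * 128`, its `qh` bytes at `i * 64`, its scales at `i * 16`, and its `d` at index `i`. Splitting the fields this way lets each array be aligned and read independently, which is the usual motivation for a flattened layout in a matrix-vector kernel.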

ggml/src/ggml-opencl/CMakeLists.txt
ggml/src/ggml-opencl/ggml-opencl.cpp
ggml/src/ggml-opencl/kernels/cvt.cl
ggml/src/ggml-opencl/kernels/mul_mv_q6_k.cl	[deleted file]
ggml/src/ggml-opencl/kernels/mul_mv_q6_k_f32.cl	[new file with mode: 0644]
ggml/src/ggml-opencl/kernels/mul_mv_q6_k_f32_flat.cl	[new file with mode: 0644]