opencl: add flattened q6_K mv (#19054)
* opencl: flatten `q6_K` and add `kernel_mul_mv_q6_K_f32_flat`
* opencl: clean up
* opencl: refactor q6_K mv - put loop body in `block_q_6_K_dot_y_flat`
* opencl: tweak the workgroup size a bit
* opencl: output 4 values per subgroup for `kernel_mul_mv_q6_K_f32_flat`
* opencl: proper alignment for q6_K
* opencl: boundary handling for flattened q6_K mv
* opencl: rename q6_K mv kernel file
* opencl: put flattened q6_K mv in its own file
* opencl: use lowercase k in file name
* opencl: use K in variable names
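
For readers unfamiliar with the term, "flattening" here presumably means splitting the interleaved `q6_K` blocks into one contiguous array per field, so that a kernel can read each field with regular, aligned accesses. The host-side sketch below illustrates that idea only. The `block_q6_K` layout follows ggml's definition (with `d` held as raw fp16 bits), while `q6_K_flat` and `flatten_q6_K` are hypothetical names, and the actual buffer layout used by `kernel_mul_mv_q6_K_f32_flat` may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QK_K 256 /* weights per super-block, as in ggml */

/* Interleaved (array-of-structs) q6_K block, mirroring ggml's block_q6_K.
 * The fp16 super-block scale is kept as raw 16-bit data in this sketch. */
typedef struct {
    uint8_t  ql[QK_K / 2];      /* lower 4 bits of each 6-bit quant */
    uint8_t  qh[QK_K / 4];      /* upper 2 bits of each 6-bit quant */
    int8_t   scales[QK_K / 16]; /* 8-bit sub-block scales */
    uint16_t d;                 /* super-block scale (fp16 bits) */
} block_q6_K;

/* Hypothetical flattened (struct-of-arrays) view: one array per field.
 * Each array must have room for n_blocks entries of that field. */
typedef struct {
    uint8_t  *ql;     /* n_blocks * QK_K/2 bytes  */
    uint8_t  *qh;     /* n_blocks * QK_K/4 bytes  */
    int8_t   *scales; /* n_blocks * QK_K/16 bytes */
    uint16_t *d;      /* n_blocks values          */
} q6_K_flat;

/* Copy each field of every block into its own contiguous array. */
static void flatten_q6_K(const block_q6_K *src, size_t n_blocks, q6_K_flat dst) {
    assert(src != NULL || n_blocks == 0);
    for (size_t i = 0; i < n_blocks; ++i) {
        memcpy(dst.ql     + i * (QK_K / 2),  src[i].ql,     QK_K / 2);
        memcpy(dst.qh     + i * (QK_K / 4),  src[i].qh,     QK_K / 4);
        memcpy(dst.scales + i * (QK_K / 16), src[i].scales, QK_K / 16);
        dst.d[i] = src[i].d;
    }
}
```

An interleaved block is 210 bytes, so consecutive blocks do not start on 4-byte or wider boundaries. Separate per-field arrays avoid that, which is one plausible reason for the "proper alignment for q6_K" item above.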