git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Porting the improved K-Quant CUDA kernels to OpenCL (#1966)
author LostRuins <redacted>
Thu, 29 Jun 2023 03:56:43 +0000 (11:56 +0800)
committer GitHub <redacted>
Thu, 29 Jun 2023 03:56:43 +0000 (05:56 +0200)
commit 96a712ca1b7f427e3bd7ffc0c70b2105cfc7fbf1
tree 448ac4c00677b54d68272bc4f5310bc5ebe85f02
parent d3494bb86bf7ad5b0b60aae0220ea576f273b5c0

* Added broken new q4k quant

* xx + ib0

* Fix q2_k fast kernel

* Use preprocessor for QK_K

* Add q6_k fast matmul kernel

* ported q3k speedup successfully

* ported q2k and q5k speedups

* remove old dot kernels and template

* fixed global const struct types

* fixing address spaces

* fixed string too long CI issue

---------

Co-authored-by: 0cc4m <redacted>
ggml-opencl.cpp