git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: refactor and optimize IQ MMVQ (llama/8215)
author Johannes Gäßler <redacted>
Mon, 1 Jul 2024 18:39:06 +0000 (20:39 +0200)
committer Georgi Gerganov <redacted>
Mon, 8 Jul 2024 10:03:28 +0000 (13:03 +0300)
commit d1bbf97fcf4d5e93e2d6933a6778ed8acac28fed
tree 206d68459959580478aa7da3f7a8cb3a66ce9a5f
parent a3a4e8eb54ae7830c04e4e545d0ecd85327c815b

* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix
src/ggml-common.h
src/ggml-cuda.cu
src/ggml-cuda/common.cuh
src/ggml-cuda/fattn-common.cuh
src/ggml-cuda/mmvq.cu
src/ggml-cuda/vecdotq.cuh
src/ggml-sycl/mmvq.cpp
src/ggml-sycl/vecdotq.hpp
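
The "__dp4a -> ggml_cuda_dp4a" item in the commit message replaces direct calls to CUDA's `__dp4a` intrinsic with a wrapper. The wrapper body is not shown on this page, so the following is only a host-side sketch of the arithmetic that `__dp4a(a, b, c)` performs on signed inputs: each 32-bit operand is treated as four packed signed 8-bit lanes, the lanes are multiplied pairwise, and the four products are added to the accumulator `c`. The names `dp4a_ref`, `pack4`, and `dp4a_ref_self_check` are hypothetical and introduced purely for illustration; the sketch assumes a 32-bit `int`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: packs four signed bytes into one 32-bit int.
// Both operands are packed the same way, so byte order does not affect the result.
static int pack4(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    const int8_t lanes[4] = {x0, x1, x2, x3};
    int packed;
    std::memcpy(&packed, lanes, sizeof(packed));
    return packed;
}

// Hypothetical host-side reference, not the code from this commit.
// Computes c + sum over the four signed 8-bit lanes of a[i] * b[i].
static int dp4a_ref(int a, int b, int c) {
    int8_t a8[4];
    int8_t b8[4];
    std::memcpy(a8, &a, sizeof(a8));
    std::memcpy(b8, &b, sizeof(b8));
    for (int i = 0; i < 4; ++i) {
        c += static_cast<int>(a8[i]) * static_cast<int>(b8[i]);
    }
    return c;
}

// Self-check against a hand-computed value: 1*5 + 2*6 + 3*7 + 4*8 + 10 = 80.
static void dp4a_ref_self_check() {
    assert(dp4a_ref(pack4(1, 2, 3, 4), pack4(5, 6, 7, 8), 10) == 80);
}
```

A plain loop like this is the kind of fallback a wrapper could use on hardware without the intrinsic; whether the commit's wrapper does so is not stated here.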