git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: refactor and optimize IQ MMVQ (#8215)
author Johannes Gäßler <redacted>
Mon, 1 Jul 2024 18:39:06 +0000 (20:39 +0200)
committer GitHub <redacted>
Mon, 1 Jul 2024 18:39:06 +0000 (20:39 +0200)
commit cb5fad4c6c2cbef92e9b8b63449e1cb7664e4846
tree 462520fd21f3ce9142b61a4b1e700fea577d58f1
parent dae57a1ebc1c9bd5693ab999e19d77c5506ae559
* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix
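
The `__dp4a -> ggml_cuda_dp4a` item above refers to the CUDA intrinsic `__dp4a`, which treats two 32-bit integers as four packed 8-bit lanes each, multiplies them lane by lane, and adds the four products to an accumulator. The following is a minimal host-side sketch of that arithmetic for the signed variant. It is not the code from this commit: `dp4a_reference`, `pack_int8x4`, and `check_dp4a_reference` are hypothetical names introduced here for illustration only, and the sketch says nothing about how `ggml_cuda_dp4a` is actually implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the commit): pack four int8 values
// into one 32-bit int, the layout that dp4a-style code operates on.
static int pack_int8x4(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    const int8_t v[4] = {x0, x1, x2, x3};
    int out;
    std::memcpy(&out, v, sizeof(out));
    return out;
}

// Hypothetical host-side reference (not from the commit) for the signed
// __dp4a(a, b, c) semantics: c + sum over the four int8 lanes of a[i] * b[i].
static int dp4a_reference(int a, int b, int c) {
    int8_t a8[4];
    int8_t b8[4];
    std::memcpy(a8, &a, sizeof(a8));
    std::memcpy(b8, &b, sizeof(b8));
    for (int i = 0; i < 4; ++i) {
        c += static_cast<int>(a8[i]) * static_cast<int>(b8[i]);
    }
    return c;
}

// Usage example: 10 + (1*5 + 2*6 + 3*7 + 4*8) = 80.
static void check_dp4a_reference() {
    const int a = pack_int8x4(1, 2, 3, 4);
    const int b = pack_int8x4(5, 6, 7, 8);
    assert(dp4a_reference(a, b, 10) == 80);
}
```

A portable fallback of this shape is one plausible reason to route calls through a wrapper instead of calling the intrinsic directly, since the wrapper can pick the hardware instruction where it exists and plain integer arithmetic elsewhere. That motivation is an assumption; the commit message does not state it.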
ggml/src/ggml-common.h
ggml/src/ggml-cuda.cu
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/mmvq.cu
ggml/src/ggml-cuda/vecdotq.cuh
ggml/src/ggml-sycl/mmvq.cpp
ggml/src/ggml-sycl/vecdotq.hpp