git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: refactor and optimize IQ MMVQ (llama/8215)
author Johannes Gäßler <redacted>
Mon, 1 Jul 2024 18:39:06 +0000 (20:39 +0200)
committer Georgi Gerganov <redacted>
Mon, 8 Jul 2024 11:53:55 +0000 (14:53 +0300)
commit e4bc83ab47f5bfd8871022c03ef732cdcbc36dcc
tree e89643c1913882c1f03a8f3bfcb50a4f15af3a28
parent db7e0dbe6e6f9e1586d7a618759bfbbb23c1b8d5

* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix
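
For readers unfamiliar with the "__dp4a -> ggml_cuda_dp4a" item above: CUDA's __dp4a(a, b, c) intrinsic treats a and b as four packed signed 8-bit lanes, multiplies them lane by lane, and adds the sum to c. The sketch below is a host-side C++ reference for that arithmetic only. It is not the code from this commit; the names dp4a_reference and pack_i8x4 are hypothetical, and the idea that a wrapper such as ggml_cuda_dp4a might fall back to this kind of per-byte loop on hardware without the instruction is an assumption, not something this page confirms.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of ggml): pack four signed 8-bit values
// into one 32-bit int, lane 0 at the lowest address.
static int pack_i8x4(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    const int8_t lanes[4] = {x0, x1, x2, x3};
    int packed;
    std::memcpy(&packed, lanes, sizeof(packed));
    return packed;
}

// Hypothetical host-side reference for the arithmetic of __dp4a(a, b, c):
// c + a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3], with signed 8-bit lanes.
static int dp4a_reference(int a, int b, int c) {
    int8_t a8[4];
    int8_t b8[4];
    std::memcpy(a8, &a, sizeof(a8)); // reinterpret the bytes without aliasing issues
    std::memcpy(b8, &b, sizeof(b8));
    for (int i = 0; i < 4; ++i) {
        c += static_cast<int>(a8[i]) * static_cast<int>(b8[i]);
    }
    return c;
}
```

Quantized dot-product kernels typically call such a primitive once per 32-bit word of packed int8 values, which is why routing every call site through a single wrapper keeps architecture checks in one place.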
ggml/src/ggml-common.h
ggml/src/ggml-cuda.cu
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/mmvq.cu
ggml/src/ggml-cuda/vecdotq.cuh