git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
k_quants tuning for Falcon-7b (#2816)
author     Kawrakow <redacted>
           Sun, 27 Aug 2023 12:19:59 +0000 (15:19 +0300)
committer  GitHub <redacted>
           Sun, 27 Aug 2023 12:19:59 +0000 (15:19 +0300)
commit     a6d1189fdd4c1ab4ba23f9d777f8950901dcffb2
tree       d9a96c3adbf57aad406bfb5bd304765615933060
parent     c48c5bb0b06385f6c708339188d2aaf2bc278477
k_quants tuning for Falcon-7b (#2816)

* Make ggml-cuda.cu build with QK_K = 64

Using LLAMA_CUDA_FORCE_DMMV = ON and -nommq, it runs and produces
a meaningful result. (A hedged sketch of the QK_K branching pattern involved appears after this list.)

* k_quants tuning for Falcon-7b
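
Making ggml-cuda.cu build with QK_K = 64 works because the k-quant code branches on the super-block size at compile time. The following is a minimal, self-contained sketch of that branching pattern only; the struct layout and field sizes are simplified stand-ins, not the actual block_q4_K definition or the kernels in ggml-cuda.cu.

```cpp
// Sketch only: how k-quant code typically branches on the super-block
// size QK_K (256 by default, 64 in small-block builds). The layout below
// is illustrative, not the real block_q4_K from ggml.
#include <cstdint>
#include <cstdio>

#ifndef QK_K
#define QK_K 256   // a build may override this to 64
#endif

struct block_q4_K_sketch {
    float d;                          // simplified stand-in for the fp16 block scale
    float dmin;                       // simplified stand-in for the fp16 block min
#if QK_K == 256
    std::uint8_t scales[12];          // 256-weight super-blocks need more packed sub-block scales
#else
    std::uint8_t scales[2];           // QK_K == 64 needs far fewer scale entries
#endif
    std::uint8_t qs[QK_K / 2];        // 4-bit quants, two per byte
};

int main() {
    std::printf("QK_K = %d, sizeof(block) = %zu bytes\n",
                QK_K, sizeof(block_q4_K_sketch));
    return 0;
}
```

Because the block layout changes with QK_K, every kernel that touches these blocks needs a matching compile-time branch, which is why the CUDA source had to be adjusted to build at QK_K = 64.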

---------

Co-authored-by: Iwan Kawrakow <redacted>
ggml-cuda.cu
llama.cpp
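
The tuning itself lands in llama.cpp (listed above): when quantizing Falcon-7b, a few quantization-sensitive tensors are given a different k-quant type than the default mix. The sketch below only illustrates the general shape of such per-tensor type selection; the names, fields, and the attn_v/early-layer heuristic are hypothetical, not the code from this commit.

```cpp
// Sketch only: per-tensor k-quant type selection in a llama.cpp-style
// quantizer. All names and thresholds here are hypothetical.
#include <string>

enum class QuantType { Q3_K, Q4_K, Q5_K, Q6_K };

struct ModelInfo {
    std::string arch;   // e.g. "falcon" or "llama" (hypothetical field)
    int n_layer;        // number of transformer layers
};

// Pick a quant type for one tensor. Model-family tuning usually means
// bumping a few sensitive tensors (attention V, FFN down projection,
// output) to a higher-precision type.
static QuantType choose_quant_type(const ModelInfo & model,
                                   const std::string & tensor_name,
                                   int layer_index,
                                   QuantType base_type) {
    // The output tensor is commonly kept at higher precision for all models.
    if (tensor_name == "output.weight") {
        return QuantType::Q6_K;
    }
    // Hypothetical Falcon-specific tweak: keep early-layer attention V
    // tensors at higher precision because they are more sensitive.
    if (model.arch == "falcon" && tensor_name.find("attn_v") != std::string::npos) {
        if (layer_index < model.n_layer / 8) {
            return QuantType::Q5_K;
        }
        return QuantType::Q4_K;
    }
    return base_type;
}

int main() {
    const ModelInfo falcon7b{"falcon", 32};
    const QuantType t = choose_quant_type(falcon7b, "blk.0.attn_v.weight", 0, QuantType::Q4_K);
    return t == QuantType::Q5_K ? 0 : 1;
}
```

Heuristics like this typically aim to trade a small increase in file size for better perplexity on the affected model family.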