git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: revise q8_1 data layout for mul_mat_q (#7824)
author Johannes Gäßler <redacted>
Sun, 9 Jun 2024 07:42:25 +0000 (09:42 +0200)
committer GitHub <redacted>
Sun, 9 Jun 2024 07:42:25 +0000 (09:42 +0200)
commit 42b53d192f4e3abf1b7c8e424628424504ea5dc5
tree 688fa591663d028dadffc04ff2ee0a22a14b0e59
parent 2decf57bc6e4a6b45176c3727d964a01161beecc
ggml-cuda.cu
ggml-cuda/mmq.cu
ggml-cuda/mmq.cuh
ggml-cuda/quantize.cu
ggml-cuda/quantize.cuh