CUDA: MMQ support for iq4_nl, iq4_xs (#8278)
author    Johannes Gäßler <redacted>
Fri, 5 Jul 2024 07:06:31 +0000 (09:06 +0200)
committer GitHub <redacted>
Fri, 5 Jul 2024 07:06:31 +0000 (09:06 +0200)
commit    8e558309dc149dc1f9fd159185b0b9071527ffb5
tree      a91d3dbc1b50e1ef4eff8bd5d568817ce6c08f94
parent    0a423800ffe4e5da3d83527ef3473da88cd78146
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/mmq.cu
ggml/src/ggml-cuda/mmq.cuh
ggml/src/ggml-cuda/template-instances/generate_cu_files.py
ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu [new file with mode: 0644]
ggml/src/ggml-cuda/vecdotq.cuh
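
Background: IQ4_NL and IQ4_XS store 4-bit indices into a fixed non-linear codebook rather than linearly scaled quants, which is what the new mmq-instance-iq4_nl.cu and mmq-instance-iq4_xs.cu template instances have to handle. The following is a minimal, hypothetical sketch of that codebook lookup, not the kernels added by this commit: the table values follow ggml's kvalues_iq4nl, but the kernel name, block layout, and host driver are illustrative assumptions.

#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Fixed non-linear codebook (values as in ggml's kvalues_iq4nl table);
// each 4-bit quant is an index into this table.
__constant__ int8_t kvalues_iq4nl[16] = {
    -127, -104, -83, -65, -49, -35, -22, -10,
       1,   13,  25,  38,  53,  69,  89, 113,
};

// Hypothetical demo kernel (not the commit's code): expand packed
// nibbles into int8 values via the codebook, the core lookup an
// MMQ-style kernel performs before its integer dot product.
__global__ void dequant_iq4nl_demo(const uint8_t *packed, int8_t *out, int n_bytes) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n_bytes) {
        uint8_t q = packed[i];
        out[2*i + 0] = kvalues_iq4nl[q & 0x0F]; // low nibble -> first value
        out[2*i + 1] = kvalues_iq4nl[q >> 4];   // high nibble -> second value
    }
}

int main() {
    const int n_bytes = 16;                     // 32 quantized values
    uint8_t h_packed[n_bytes];
    for (int i = 0; i < n_bytes; ++i)
        h_packed[i] = (uint8_t)(i | ((15 - i) << 4));

    uint8_t *d_packed; int8_t *d_out;
    cudaMalloc((void **)&d_packed, n_bytes);
    cudaMalloc((void **)&d_out, 2 * n_bytes);
    cudaMemcpy(d_packed, h_packed, n_bytes, cudaMemcpyHostToDevice);

    dequant_iq4nl_demo<<<1, n_bytes>>>(d_packed, d_out, n_bytes);

    int8_t h_out[2 * n_bytes];
    cudaMemcpy(h_out, d_out, 2 * n_bytes, cudaMemcpyDeviceToHost);
    for (int i = 0; i < 2 * n_bytes; ++i) printf("%d ", h_out[i]);
    printf("\n");

    cudaFree(d_packed);
    cudaFree(d_out);
    return 0;
}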