git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: MMQ code deduplication + iquant support (#8495)
author Johannes Gäßler <redacted>
Sat, 20 Jul 2024 20:25:26 +0000 (22:25 +0200)
committer GitHub <redacted>
Sat, 20 Jul 2024 20:25:26 +0000 (22:25 +0200)
commit 69c487f4ed57bb4d4514a1b7ff12608d5a8e7ef0
tree 36fae873b64c5b64aeb900ba0aa3fcaef280d076
parent 07283b1a90e1320aae4762c7e03c879043910252

* CUDA: MMQ code deduplication + iquant support

* 1 less parallel job for CI build
.github/workflows/build.yml
ggml/src/ggml-cuda/mmq.cu
ggml/src/ggml-cuda/mmq.cuh
ggml/src/ggml-cuda/template-instances/generate_cu_files.py
ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu [new file with mode: 0644]
ggml/src/ggml-cuda/vecdotq.cuh