git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: MMQ code deduplication + iquant support (llama/8495)
author Johannes Gäßler <redacted>
Sat, 20 Jul 2024 20:25:26 +0000 (22:25 +0200)
committer Georgi Gerganov <redacted>
Sat, 27 Jul 2024 15:26:12 +0000 (18:26 +0300)
commit fad9ac1e39b65a07ffe768b71cdb29c2d284becb
tree a0ca7e1fb59ad83e36f6ee12704fd505fe6a79fd
parent 20f5a13715470ea442a0522a6735a4e25d9450df

* CUDA: MMQ code deduplication + iquant support

* 1 less parallel job for CI build
src/ggml-cuda/mmq.cu
src/ggml-cuda/mmq.cuh
src/ggml-cuda/template-instances/generate_cu_files.py
src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu [new file with mode: 0644]
src/ggml-cuda/vecdotq.cuh