git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: MMQ code deduplication + iquant support (llama/8495)
author Johannes Gäßler <redacted>
Sat, 20 Jul 2024 20:25:26 +0000 (22:25 +0200)
committer Georgi Gerganov <redacted>
Thu, 8 Aug 2024 19:48:46 +0000 (22:48 +0300)
commit 8c4f30497a59cd5374a16deb24493e275c468df8
tree ef78b55d33c5d531930d977730238b5e2eacf6d6
parent b1ee3a844413c81b860b664434070b36c8ce39af

* CUDA: MMQ code deduplication + iquant support

* 1 less parallel job for CI build

ggml/src/ggml-cuda/mmq.cu
ggml/src/ggml-cuda/mmq.cuh
ggml/src/ggml-cuda/template-instances/generate_cu_files.py
ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu [new file with mode: 0644]
ggml/src/ggml-cuda/vecdotq.cuh