git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: fastdiv, launch bounds for mmvq + q8_1 quant (llama/15802)
author Johannes Gäßler <redacted>
Fri, 5 Sep 2025 14:07:02 +0000 (16:07 +0200)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:33:50 +0000 (13:33 +0300)
commit 72febac2c8059eb5f90b58c811c7b38d35b2c153
tree 88c6f0d31f1129bff11ee83742070e54fcb492ca
parent 220f851f182a933836bcc4ca5ae870498ef17a6e

* CUDA: fastdiv, launch bounds for mmvq + q8_1 quant
src/ggml-cuda/common.cuh
src/ggml-cuda/mmvq.cu
src/ggml-cuda/quantize.cu