git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: batched+noncont MMQ, refactor bs>1 MoE code (llama/13199)
author Johannes Gäßler <redacted>
Wed, 30 Apr 2025 21:12:59 +0000 (23:12 +0200)
committer Georgi Gerganov <redacted>
Thu, 1 May 2025 10:29:02 +0000 (13:29 +0300)
commit d052e64d42d95fc4d3bb08dc097d004471a52db9
tree 5d18d4711f0bfbc92514694a451fd45f17286eb7
parent 780750a10849e48f3a7a0ac027746d3e737b1756
ggml/src/ggml-cuda/getrows.cu
ggml/src/ggml-cuda/getrows.cuh
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/mmq.cu
ggml/src/ggml-cuda/mmq.cuh
ggml/src/ggml-cuda/mmvq.cu
ggml/src/ggml-cuda/quantize.cu
ggml/src/ggml-cuda/quantize.cuh