git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: add fp kernel for larger batch size MoE (llama/16512)
author Aman Gupta <redacted>
Tue, 14 Oct 2025 11:15:15 +0000 (19:15 +0800)
committer Georgi Gerganov <redacted>
Wed, 15 Oct 2025 06:29:17 +0000 (09:29 +0300)
commit b4c5c6f71fec4c04a1805df73588f52d316f31ce
tree 2fc8abce68a9b4d9762bf34ec05b8d622aa9b3a4
parent a12848e8e9ac10286d976253e5c0eb8e382ea36d

* CUDA: kernel for larger batch sizes for MoE

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* fixup

* tests

* Move mmq_ids_helper to mmid

* cleanup

* Remove redundant checks
ggml/src/ggml-cuda/mmf.cu
ggml/src/ggml-cuda/mmf.cuh
ggml/src/ggml-cuda/mmid.cu [new file with mode: 0644]
ggml/src/ggml-cuda/mmid.cuh [new file with mode: 0644]
ggml/src/ggml-cuda/mmq.cu