git.djapps.eu Git - pkg/ggml/sources/ggml/commit
cuda : enable CUDA graphs for MMID 1 <= BS <= 4 (llama/19645)
author Georgi Gerganov <redacted>
Tue, 17 Feb 2026 10:31:49 +0000 (12:31 +0200)
committer Georgi Gerganov <redacted>
Wed, 25 Feb 2026 10:32:13 +0000 (12:32 +0200)
commit 23c1635e020550549e877aa38ab8ada84f336a4b
tree 33aa963ae679b93396ffa4370a3742937e76e957
parent a3092f95b28ebc061d5f7bf3fb38e4203db1c17e

* cuda : enable CUDA graphs for MMID BS <= 4

* cont : add stream capture check

Co-authored-by: Oliver Simons <redacted>
* cont : add MMVQ_MMID_MAX_BATCH_SIZE

---------

Co-authored-by: Oliver Simons <redacted>
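
The batch-size condition named in the commit title (1 <= BS <= 4) can be sketched as a small predicate. This is only an illustration under stated assumptions: the value 4 is read off the commit title, and the helper name `mmid_allows_cuda_graph` is hypothetical and does not appear in ggml. The actual check lives in src/ggml-cuda/ggml-cuda.cu and may take further conditions into account (for example tensor type or stream capture state).

```cpp
#include <cstdint>

// Assumption: the value 4 comes from the commit title ("1 <= BS <= 4").
// The real constant is defined in src/ggml-cuda/mmvq.cuh and may differ.
constexpr int64_t MMVQ_MMID_MAX_BATCH_SIZE = 4;

// Hypothetical helper (not a ggml function): returns true when a MUL_MAT_ID
// node with the given batch size falls inside the range for which this
// commit enables CUDA graphs.
constexpr bool mmid_allows_cuda_graph(int64_t batch_size) {
    return batch_size >= 1 && batch_size <= MMVQ_MMID_MAX_BATCH_SIZE;
}

// Compile-time checks of both boundaries of the range.
static_assert(mmid_allows_cuda_graph(1), "BS = 1 is inside the range");
static_assert(mmid_allows_cuda_graph(4), "BS = 4 is inside the range");
static_assert(!mmid_allows_cuda_graph(5), "BS = 5 is outside the range");
```

Keeping the upper bound in a single named constant, as the MMVQ_MMID_MAX_BATCH_SIZE bullet above suggests, means the graph-compatibility check and the kernel selection can refer to the same limit instead of repeating a literal.
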
src/ggml-cuda/ggml-cuda.cu
src/ggml-cuda/mmvq.cuh