git.djapps.eu Git - pkg/ggml/sources/ggml/commit
ggml-cuda: enable cuda-graphs for `n-cpu-moe` (llama/18934)
author Aman Gupta <redacted>
Sat, 24 Jan 2026 06:25:20 +0000 (14:25 +0800)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 11:49:29 +0000 (13:49 +0200)
commit a1f19063f9b8e25a879d778f0baf6ed99a10e01a
tree 1941b954dc56a22337ac5c856ba166ed7a4295ed
parent 88b0e186b689fd7e95581fd29a81b969b2a23dae
ggml-cuda: enable cuda-graphs for `n-cpu-moe` (llama/18934)

* ggml-cuda: add split-wise cuda graph

* add n-cpu-moe compare_llama_bench.py

* fix hip/musa builds
src/ggml-cuda/common.cuh
src/ggml-cuda/ggml-cuda.cu
src/ggml-cuda/mean.cu