git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
ggml-cuda: enable cuda-graphs for `n-cpu-moe` (llama/18934)
author Aman Gupta <redacted>
Sat, 24 Jan 2026 06:25:20 +0000 (14:25 +0800)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 13:56:40 +0000 (15:56 +0200)
commit 13577a6ce4496aa3857dc6c878a4029c05ed7e69
tree 4ef2d7639a699cef5c1c72b5762d90a977833d93
parent 79f1bb3d355c198e390f622555cb22225683a2bf

* ggml-cuda: add split-wise cuda graph

* add n-cpu-moe compare_llama_bench.py

* fix hip/musa builds
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/mean.cu