git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.) (#19126)
author Aman Gupta <redacted>
Thu, 29 Jan 2026 02:31:28 +0000 (10:31 +0800)
committer GitHub <redacted>
Thu, 29 Jan 2026 02:31:28 +0000 (10:31 +0800)
commit 3bcc990997f201114ee6b6abdec5eb43683d7af2
tree 828ed2515b270798a819744ce1908295569307e6
parent d4964a7c66c4ff935d86c9ac92abeb12073723bf
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/topk-moe.cu
ggml/src/ggml-cuda/topk-moe.cuh