git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit


author	Aman Gupta <redacted>
	Fri, 31 Oct 2025 12:05:07 +0000 (20:05 +0800)
committer	GitHub <redacted>
	Fri, 31 Oct 2025 12:05:07 +0000 (20:05 +0800)
commit	4146d6a1a6228711a487a1e3e9ddd120f8d027d7
tree	bde41ac4448629122097264a9e741b7816877be7
parent	8da3c0e200a586f768ada6f38745acb01380174c

CUDA: add expert reduce kernel (#16857)

* CUDA: add expert reduce kernel

* contiguous checks, better formatting, use std::vector instead of array

* use vector empty instead of size

Co-authored-by: Johannes Gäßler <redacted>

ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/moe-expert-reduce.cu	[new file with mode: 0644]
ggml/src/ggml-cuda/moe-expert-reduce.cuh	[new file with mode: 0644]
tests/test-backend-ops.cpp

Packaging of ggml-org/llama.cpp