CUDA: add fp kernel for larger batch size MoE (#16512)
author    Aman Gupta <redacted>
          Tue, 14 Oct 2025 11:15:15 +0000 (19:15 +0800)
committer GitHub <redacted>
          Tue, 14 Oct 2025 11:15:15 +0000 (13:15 +0200)
commit 48e2fa9fb7c2de1e53808fdb65ec33f916020fc4
tree   916fad74561f5316896a7d57405e65adab7a83df
parent 5b6913c47b6bc71a6f927805a45387d5657d8b89
CUDA: add fp kernel for larger batch size MoE (#16512)

* CUDA: kernel for larger batch sizes for MoE

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* fixup

* tests

* Move mmq_ids_helper to mmid

* cleanup

* Remove redundant checks
ggml/src/ggml-cuda/mmf.cu
ggml/src/ggml-cuda/mmf.cuh
ggml/src/ggml-cuda/mmid.cu [new file with mode: 0644]
ggml/src/ggml-cuda/mmid.cuh [new file with mode: 0644]
ggml/src/ggml-cuda/mmq.cu
tests/test-backend-ops.cpp