git.djapps.eu Git - pkg/ggml/sources/ggml/commit


author	Aman Gupta <redacted>
	Tue, 3 Feb 2026 15:31:23 +0000 (23:31 +0800)
committer	Georgi Gerganov <redacted>
	Sat, 7 Feb 2026 08:37:38 +0000 (10:37 +0200)
commit	0622f36a396d290967f48407f5dda31111798d03
tree	50c177fb8993dded9aac2c05ab805544cc8259a8
parent	f6f23e63cbcf8a759b44dc1aa6ad72f786171a32

CUDA: use mmvq for mul-mat-id for small batch sizes (llama/18958)

* CUDA: use mmvq for mul-mat-id for small batch sizes

* add mmvq too

* Fix perf issue on Ampere. Use mmvf mm-id only for non-NVIDIA GPUs

* templatize multi_token_path
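
The last bullet refers to turning a runtime choice into a template parameter. The following is a minimal host-side C++ sketch of that general technique, not the code from this commit: the function names (`mat_vec_row`, `mat_vec_row_dispatch`), the signature, and the `ntokens > 1` condition are all hypothetical, and only the name `multi_token_path` is taken from the commit message. It assumes the flag selects between a single-token and a multi-token code path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical routine: y[t] = dot(row, x[t]) for each token t.
// multi_token_path is a template parameter, so the condition below is a
// compile-time constant and each instantiation keeps only one branch.
template <bool multi_token_path>
static void mat_vec_row(const float * row, const float * x, float * y,
                        std::size_t ncols, std::size_t ntokens) {
    if (multi_token_path) {
        // General path: loop over all tokens in the small batch.
        for (std::size_t t = 0; t < ntokens; ++t) {
            float sum = 0.0f;
            for (std::size_t c = 0; c < ncols; ++c) {
                sum += row[c] * x[t * ncols + c];
            }
            y[t] = sum;
        }
    } else {
        // Single-token path: no token loop at all.
        float sum = 0.0f;
        for (std::size_t c = 0; c < ncols; ++c) {
            sum += row[c] * x[c];
        }
        y[0] = sum;
    }
}

// Hypothetical dispatcher: the runtime value is checked once here and
// mapped to one of the two instantiations.
static void mat_vec_row_dispatch(const float * row, const float * x, float * y,
                                 std::size_t ncols, std::size_t ntokens) {
    if (ntokens > 1) {
        mat_vec_row<true>(row, x, y, ncols, ntokens);
    } else {
        mat_vec_row<false>(row, x, y, ncols, ntokens);
    }
}
```

The point of the pattern is that the branch is decided once, outside the hot loop, so the single-token instantiation carries no overhead from the multi-token case. Whether the commit applies it in exactly this shape is not stated in the message.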

src/ggml-cuda/ggml-cuda.cu
src/ggml-cuda/mmvf.cu
src/ggml-cuda/mmvf.cuh
src/ggml-cuda/mmvq.cu

Packaging of ggml-org/ggml
