git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Johannes Gäßler <redacted>
	Wed, 20 Dec 2023 14:41:22 +0000 (15:41 +0100)
committer	GitHub <redacted>
	Wed, 20 Dec 2023 14:41:22 +0000 (15:41 +0100)
commit	799fc2268989482054944c902874cca76337580f
tree	f535df08f2059a709f8f5b8014d532f1aa086a2d
parent	328b83de23b33240e28f4e74900d1d06726f5eb1

CUDA: Faster Mixtral prompt processing (#4538)

* CUDA: make MoE tensors contiguous for batch size>1

* Update ggml-cuda.cu

Co-authored-by: slaren <redacted>

ggml-cuda.cu
