git.djapps.eu Git - pkg/ggml/sources/ggml/commit

author	Johannes Gäßler <redacted>
	Mon, 17 Feb 2025 13:03:24 +0000 (14:03 +0100)
committer	Georgi Gerganov <redacted>
	Tue, 25 Feb 2025 11:33:09 +0000 (13:33 +0200)
commit	929463763595d8101f8d1132b17fb850849eec4d
tree	d1ae13bbd527d2ba8bd2f2dfc620d1fd5f87527c
parent	a80831e6e96ea9bbb281795e2cbcd677da9e0146

CUDA: use async data loading for FlashAttention (llama/11894)

* CUDA: use async data loading for FlashAttention

---------

Co-authored-by: Diego Devesa <redacted>

src/ggml-cuda/common.cuh
src/ggml-cuda/cp-async.cuh	[new file with mode: 0644]
src/ggml-cuda/fattn-common.cuh
src/ggml-cuda/fattn-mma-f16.cuh
src/ggml-cuda/mma.cuh
src/ggml-cuda/mmq.cuh

Packaging of ggml-org/ggml