git.djapps.eu Git - pkg/ggml/sources/ggml/commit


author	mahorozte <redacted>
	Tue, 3 Dec 2024 13:11:43 +0000 (21:11 +0800)
committer	GitHub <redacted>
	Tue, 3 Dec 2024 13:11:43 +0000 (14:11 +0100)
commit	b903ffe79daf18c0aaacbebe44a7b93a6b8d0982
tree	673cc9be84890467a7100a832796f51bf54dff24
parent	589fed13a77d7e54435c2182384955706b60b841

CUDA: remove unnecessary warp reduce in FA (#1032)

* kqmax_new_j is already the same in every thread of the warp after the operation at line 199, so this reduce can be omitted

* same problem in vec-f32
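
The reasoning behind both changes can be checked on the host with a small simulation. This is only a sketch: it assumes the operation referred to at line 199 leaves the same value in every lane (modelled here as a butterfly sum reduction), and the names `Warp`, `warp_reduce_sum_sim`, `warp_reduce_max_sim`, `lanes_uniform` and `second_reduce_is_noop` are hypothetical stand-ins, not ggml functions. Each inner loop step mimics what a `__shfl_xor_sync`-based reduction does on the GPU.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

constexpr int WARP_SIZE = 32;
using Warp = std::array<float, WARP_SIZE>;  // one value per lane (hypothetical model)

// Butterfly sum: in each step every lane adds the value held by lane (lane ^ offset).
// Afterwards every lane holds the sum over the whole warp.
Warp warp_reduce_sum_sim(Warp v) {
    for (int offset = WARP_SIZE / 2; offset > 0; offset >>= 1) {
        Warp next = v;
        for (int lane = 0; lane < WARP_SIZE; ++lane) {
            next[lane] = v[lane] + v[lane ^ offset];
        }
        v = next;
    }
    return v;
}

// Butterfly max with the same exchange pattern.
Warp warp_reduce_max_sim(Warp v) {
    for (int offset = WARP_SIZE / 2; offset > 0; offset >>= 1) {
        Warp next = v;
        for (int lane = 0; lane < WARP_SIZE; ++lane) {
            next[lane] = std::max(v[lane], v[lane ^ offset]);
        }
        v = next;
    }
    return v;
}

// True if all lanes hold exactly the same value.
bool lanes_uniform(const Warp& v) {
    return std::all_of(v.begin(), v.end(), [&](float x) { return x == v[0]; });
}

// Assumption: the earlier operation is a warp-wide reduction, so its result
// is uniform across lanes. A max reduce over a uniform warp changes nothing.
bool second_reduce_is_noop(const Warp& partial) {
    const Warp reduced = warp_reduce_sum_sim(partial);
    const Warp again   = warp_reduce_max_sim(reduced);
    return lanes_uniform(reduced) && again == reduced;
}
```

Because max(x, x) = x, every exchange step of the second reduction returns the value the lane already had, so dropping it saves the shuffle instructions without changing the result.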

---------

Co-authored-by: ZhaoXiaoYu <redacted>

src/ggml-cuda/fattn-vec-f16.cuh
src/ggml-cuda/fattn-vec-f32.cuh

Packaging of ggml-org/ggml