git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Aman Gupta <redacted>
	Sat, 9 Aug 2025 12:00:24 +0000 (20:00 +0800)
committer	GitHub <redacted>
	Sat, 9 Aug 2025 12:00:24 +0000 (20:00 +0800)
commit	34c9d765bf173c551398f1e7fa4595019bc53bab
tree	254d53d51309ff63ba921557c37e0a58920d7914
parent	e54d41befcc1575f4c898c5ff4ef43970cead75f

CUDA: add attention sinks for tile and wmma (#15178)

* CUDA: add attention sinks for tile and wmma

* Review: formatting changes + remove syncthreads from tile + remove warp_reduce_max from wmma

ggml/src/ggml-cuda/fattn-tile-f16.cu
ggml/src/ggml-cuda/fattn-tile-f32.cu
ggml/src/ggml-cuda/fattn-wmma-f16.cu
ggml/src/ggml-cuda/fattn.cu

Packaging of ggml-org/llama.cpp
