git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit

author	Aman Gupta <redacted>
	Sun, 25 Jan 2026 15:25:58 +0000 (23:25 +0800)
committer	Georgi Gerganov <redacted>
	Fri, 30 Jan 2026 13:56:40 +0000 (15:56 +0200)
commit	1642a4fb605179844ade8e0782bda04272bd2897
tree	a4eabb0dc7b4c86c721af16cb7ac9d591db52062
parent	d2b51404e482a0dc42de1d17e19d164e51f2dedf

ggml-cpu: Use tiled FA for prompt-processing (llama/19012)

* ggml-cpu: Use tiled FA for prompt-processing

Flash attention (FA) performance on CPU degrades on long contexts because it essentially uses a vector kernel. This PR adds a tiled FA kernel for prompt processing (PP). Tile sizes were tuned on a single-socket, 64-core AMD EPYC machine.

* fix out-of-bounds access for mask

* skip rows that are fully masked

* skip tile if mask is inf

* store mask in worksize

* check inf tile earlier
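
The idea behind the bullets above can be illustrated with a minimal single-threaded sketch of tiled attention with an online softmax. This is not the kernel from this commit: the function name `tiled_attention`, the tile sizes `TILE_Q` and `TILE_K`, the F32 row-major layout, and the mask convention (0 for visible, negative infinity for masked) are all assumptions made for illustration. It shows the two skips the commit message mentions: a tile whose mask is entirely infinite is rejected before any dot products are computed, and a row that is fully masked within a tile is skipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical tile sizes, chosen small for readability; the tuned values
// used by the commit are not shown on this page.
constexpr int TILE_Q = 2;
constexpr int TILE_K = 2;

// Q is n_q x d, K and V are n_k x d, mask is n_q x n_k (0 or -inf), row-major.
// Returns softmax(Q K^T / sqrt(d) + mask) V as an n_q x d matrix.
std::vector<float> tiled_attention(const std::vector<float>& Q, const std::vector<float>& K,
                                   const std::vector<float>& V, const std::vector<float>& mask,
                                   int n_q, int n_k, int d) {
    const float NEG_INF = -std::numeric_limits<float>::infinity();
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));
    std::vector<float> out(static_cast<size_t>(n_q) * d, 0.0f);

    for (int q0 = 0; q0 < n_q; q0 += TILE_Q) {
        const int q1 = std::min(q0 + TILE_Q, n_q);
        std::vector<float> row_max(q1 - q0, NEG_INF);  // running max per query row
        std::vector<float> row_sum(q1 - q0, 0.0f);     // running softmax denominator

        for (int k0 = 0; k0 < n_k; k0 += TILE_K) {
            const int k1 = std::min(k0 + TILE_K, n_k);

            // Check the mask first: skip the whole tile if every entry is -inf.
            bool all_masked = true;
            for (int i = q0; i < q1 && all_masked; ++i)
                for (int j = k0; j < k1; ++j)
                    if (mask[i * n_k + j] != NEG_INF) { all_masked = false; break; }
            if (all_masked) continue;

            for (int i = q0; i < q1; ++i) {
                float s[TILE_K];
                float tile_max = NEG_INF;
                for (int j = k0; j < k1; ++j) {
                    float dot = 0.0f;
                    for (int c = 0; c < d; ++c) dot += Q[i * d + c] * K[j * d + c];
                    s[j - k0] = dot * scale + mask[i * n_k + j];
                    tile_max = std::max(tile_max, s[j - k0]);
                }
                if (tile_max == NEG_INF) continue;  // this row is fully masked in this tile

                // Online softmax: rescale what was accumulated so far to the new max.
                const int r = i - q0;
                const float new_max = std::max(row_max[r], tile_max);
                const float rescale = std::exp(row_max[r] - new_max);  // 0 on the first tile
                row_sum[r] *= rescale;
                for (int c = 0; c < d; ++c) out[i * d + c] *= rescale;
                for (int j = k0; j < k1; ++j) {
                    const float p = std::exp(s[j - k0] - new_max);
                    row_sum[r] += p;
                    for (int c = 0; c < d; ++c) out[i * d + c] += p * V[j * d + c];
                }
                row_max[r] = new_max;
            }
        }

        // Normalize; rows that never saw an unmasked key stay at zero.
        for (int i = q0; i < q1; ++i) {
            const float sum = row_sum[i - q0];
            if (sum > 0.0f)
                for (int c = 0; c < d; ++c) out[i * d + c] /= sum;
        }
    }
    return out;
}
```

Processing a block of query rows against a block of keys lets the K and V tile be reused across several rows while it is still in cache, which is the usual motivation for tiling over a per-row vector kernel. With a causal mask, roughly half of the tiles for a long prompt are fully masked, so the early check avoids their dot products entirely.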

ggml/src/ggml-cpu/common.h
ggml/src/ggml-cpu/ggml-cpu.c
ggml/src/ggml-cpu/ops.cpp

Packaging of ggerganov/whisper.cpp