git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
hexagon: further optimizations and refactoring for flash attention (llama/19583)
author Max Krasnyansky <redacted>
Sat, 14 Feb 2026 00:27:30 +0000 (16:27 -0800)
committer Georgi Gerganov <redacted>
Sun, 15 Feb 2026 19:44:37 +0000 (21:44 +0200)
commit e6476d4c12f8e921bea9be6e0f65f4e07cbe08e3
tree 74c85376aa3132390e94781d0a4f6700c037e01c
parent ec57bf407cb1b02998bde2b395f27eb96b0e9bc8
hexagon: further optimizations and refactoring for flash attention (llama/19583)

* ggml-hexagon: fa improvements

ggml-hexagon: optimize flash attention calculations with improved variable handling

ggml-hexagon: streamline flash attention operations by removing redundant checks for FP32

ggml-hexagon: optimize hvx_dot_f16_f16_aa_rx2 by simplifying variable handling for unused elements

ggml-hexagon: optimize flash attention by changing slope vector type to F16

* hexfa: fixed test-backend-ops failures due to leftover element handling
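For context, the "leftover element" bug class above arises when a fixed-width SIMD dot product (such as hvx_dot_f16_f16_aa_rx2) processes a vector whose length is not a multiple of the lane count: lanes past the end of the data must not contribute to the sum. A minimal scalar sketch of the required semantics, with an illustrative `VEC_W` standing in for the 64 fp16 lanes of a 128-byte HVX vector (names and structure here are assumptions, not the actual kernel):

```c
#include <stddef.h>

/* VEC_W stands in for the HVX lane count; the inner loop stands in
 * for the vector multiply-accumulate intrinsics. */
#define VEC_W 64

static float dot_with_leftovers(const float *a, const float *b, int n) {
    float acc = 0.0f;
    int i = 0;

    /* full vectors: n / VEC_W complete iterations */
    for (; i + VEC_W <= n; i += VEC_W)
        for (int k = 0; k < VEC_W; k++)
            acc += a[i + k] * b[i + k];

    /* leftover tail: a vector kernel would load one more vector and
     * zero (or mask off) the lanes >= n so stale memory past the end
     * never enters the accumulator; a scalar loop shows the intended
     * result. */
    for (; i < n; i++)
        acc += a[i] * b[i];

    return acc;
}
```

The scalar tail loop is only a specification of the correct behavior; the real HVX code handles the tail with masked/zeroed lanes rather than a scalar fallback.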

* hexagon: refactor and optimize fa to use local context struct

* ggml-hexagon: optimize flash-attention using hvx_vec_expf

Use HVX for online softmax.

---------

Co-authored-by: chraac <redacted>
ggml/src/ggml-hexagon/htp/flash-attn-ops.c