git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
HIP: add fattn-mma-f16 for RDNA4 (llama/18481)
author yulo <redacted>
Tue, 13 Jan 2026 12:52:16 +0000 (20:52 +0800)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 13:56:40 +0000 (15:56 +0200)
commit c6a495ae5da4ccb1158d421ba9355893d005016c
tree 0f4a57333c720de8fb6f0f93c39339dfea34b6d1
parent 7aa8818647303b567c3a21fe4220b2681988e220

* finish VQ mma

* flash_attn_ext_f16_iter

* KQ_rowsum

* correct exp

* fix scale error

* fix softmax scale

* fix softmax scale

* enable fattn on cpu side

* fix random error

* disable fattn-mma-f16 on rdna3

* fix wrong col for rdna

* use identity mat to transpose

* resolve conflicts

* basic tuning for DeepSeek-R1-Distill-Qwen-1.5B

* fix volta compile error

* align rdna4 policy for fattn

* adjust fattn policy

* adjust kernel selection logic

* update per the review comments

* keep fattn-wmma logic

* adjust kernel selection logic
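
The `KQ_rowsum`, `correct exp` and softmax scale items above refer to the running softmax that a flash-attention loop maintains while it visits K/V tiles. The following is a minimal host-side sketch of that general technique, not the kernel code from this commit. The function names, the scalar `v` values and the `chunk` parameter are all hypothetical, and the sketch assumes a non-empty input with finite scores and `chunk > 0`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not from the commit): softmax(scale * kq)-weighted sum
// of v, computed chunk by chunk with a running maximum and a running row sum.
float online_softmax_weighted_sum(const std::vector<float> & kq,
                                  const std::vector<float> & v,
                                  float scale, std::size_t chunk) {
    float kq_max    = -std::numeric_limits<float>::infinity(); // running max of scaled scores
    float kq_rowsum = 0.0f; // running sum of exp(score - kq_max)
    float acc       = 0.0f; // running sum of exp(score - kq_max) * v

    for (std::size_t i0 = 0; i0 < kq.size(); i0 += chunk) {
        const std::size_t i1 = std::min(i0 + chunk, kq.size());

        // New maximum after seeing this chunk.
        float new_max = kq_max;
        for (std::size_t i = i0; i < i1; ++i) {
            new_max = std::max(new_max, scale * kq[i]);
        }

        // Rescale everything accumulated so far to the new maximum.
        const float correction = std::exp(kq_max - new_max);
        kq_rowsum *= correction;
        acc       *= correction;

        // Accumulate this chunk relative to the new maximum.
        for (std::size_t i = i0; i < i1; ++i) {
            const float p = std::exp(scale * kq[i] - new_max);
            kq_rowsum += p;
            acc       += p * v[i];
        }
        kq_max = new_max;
    }
    return acc / kq_rowsum; // normalize once at the end
}

// Straightforward two-pass reference for comparison.
float reference_softmax_weighted_sum(const std::vector<float> & kq,
                                     const std::vector<float> & v,
                                     float scale) {
    float m = -std::numeric_limits<float>::infinity();
    for (float x : kq) {
        m = std::max(m, scale * x);
    }
    float sum = 0.0f;
    float acc = 0.0f;
    for (std::size_t i = 0; i < kq.size(); ++i) {
        const float p = std::exp(scale * kq[i] - m);
        sum += p;
        acc += p * v[i];
    }
    return acc / sum;
}
```

Applying the scale before the maximum is taken and applying the same correction factor to both the row sum and the accumulator are the two places where such a loop typically goes wrong, which may be what the scale and exp fixes in the list addressed.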

---------
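
The `use identity mat to transpose` item relies on a simple identity: if a matrix-multiply primitive computes D = A * B^T, then passing the identity matrix as A returns B^T with no explicit element shuffling. The sketch below illustrates only that idea in plain C++. `Tile`, `N`, `mma_abt` and `transpose_via_identity` are hypothetical names, and the tile layouts and intrinsics actually used in `mma.cuh` are not reproduced here.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 4; // hypothetical tile size, chosen for illustration

using Tile = std::array<std::array<float, N>, N>;

// Stand-in for a matrix-multiply primitive that computes D = A * B^T.
Tile mma_abt(const Tile & a, const Tile & b) {
    Tile d{}; // value-initialized to zero
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            for (std::size_t k = 0; k < N; ++k) {
                d[i][j] += a[i][k] * b[j][k];
            }
        }
    }
    return d;
}

// Build the N x N identity matrix.
Tile identity() {
    Tile id{};
    for (std::size_t i = 0; i < N; ++i) {
        id[i][i] = 1.0f;
    }
    return id;
}

// Transpose by multiplication: I * B^T = B^T.
Tile transpose_via_identity(const Tile & b) {
    return mma_abt(identity(), b);
}
```

The result is exact for finite inputs because every output element is a single input element multiplied by 1 plus terms multiplied by 0.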

Co-authored-by: zhang hui <redacted>
Co-authored-by: Johannes Gäßler <redacted>
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-mma-f16.cuh
ggml/src/ggml-cuda/fattn.cu
ggml/src/ggml-cuda/mma.cuh
ggml/src/ggml-cuda/vendors/hip.h