git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
HIP: add fattn-mma-f16 for RDNA4 (#18481)
author yulo <redacted>
Tue, 13 Jan 2026 12:52:16 +0000 (20:52 +0800)
committer GitHub <redacted>
Tue, 13 Jan 2026 12:52:16 +0000 (13:52 +0100)
commit ea4a321f2a607ca315d998f0656fd255715884a6
tree 4dd5164448f0f3ad091643f7d55d1a0b9693fa0c
parent c1e79e610fd28f2c3923539fee9313734bbf8cfa

* finish VQ mma

* flash_attn_ext_f16_iter

* KQ_rowsum

* correct exp

* fix scale error

* fix softmax scale

* fix softmax scale

* enable fattn on cpu side

* fix random error

* disable fattn-mma-f16 on rdna3

* fix wrong col for rdna

* use identity mat to transpose

* resolve conflicts

* basic tuning for DeepSeek-R1-Distill-Qwen-1.5B

* fix volta compile error

* align rdna4 policy for fattn

* adjust fattn policy

* adjust kernel selection logic

* update per the review comments

* keep fattn-wmma logic

* adjust kernel selection logic

---------

Co-authored-by: zhang hui <redacted>
Co-authored-by: Johannes Gäßler <redacted>
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-mma-f16.cuh
ggml/src/ggml-cuda/fattn.cu
ggml/src/ggml-cuda/mma.cuh
ggml/src/ggml-cuda/vendors/hip.h