git.djapps.eu Git - pkg/ggml/sources/ggml/commit
musa: enable fp16 mma (all) and cublas on qy2 (llama/13842)
author R0CKSTAR <redacted>
Thu, 26 Jun 2025 04:11:59 +0000 (12:11 +0800)
committer Georgi Gerganov <redacted>
Tue, 1 Jul 2025 08:52:14 +0000 (11:52 +0300)
commit b6f944cbddf3352fc21673fb43b832f537ee2170
tree b0e4f27ed1f2f3c275c0b39207c317088dca4d49
parent b90615706037e5df8e183b51b5963a83fd9d15b7

* musa: enable fp16 mma (all) and cublas on qy2

Signed-off-by: Xiaodong Ye <redacted>

* Update src/ggml-cuda/ggml-cuda.cu

Co-authored-by: Johannes Gäßler <redacted>

* Address review comments

Signed-off-by: Xiaodong Ye <redacted>

* Address review comments

Signed-off-by: Xiaodong Ye <redacted>

* musa: disable MUL_MAT_ID (q2_k × f32) due to precision issues

Signed-off-by: Xiaodong Ye <redacted>

---------

Signed-off-by: Xiaodong Ye <redacted>
Co-authored-by: Johannes Gäßler <redacted>
src/ggml-cuda/common.cuh
src/ggml-cuda/fattn-wmma-f16.cu
src/ggml-cuda/ggml-cuda.cu
src/ggml-musa/mudnn.cuh