]> git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml-cpu: implement MXFP4 SIMD for s390x (#16193)
authorAaron Teo <redacted>
Fri, 26 Sep 2025 10:27:25 +0000 (18:27 +0800)
committerGitHub <redacted>
Fri, 26 Sep 2025 10:27:25 +0000 (13:27 +0300)
commit9b26511857ac09ae69ab485168fe2d3ee5fb1d6e
tree2fcd8cff3f12186f14ac0cbca5ac5160f81371aa
parent00217cd41388328c93e9e9644921c4319bb03bcc
ggml-cpu: implement MXFP4 SIMD for s390x (#16193)

* ggml-cpu: impl mxfp4 s390x

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: missing s = sumf

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: fix incorrect kval_mxfp4 type

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: rework mxfp4

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: missing delta calc

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: fix typo

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: fix typo for vec_splats

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: expand to 2 blocks per loop

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: add unroll to boost perf

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: back to 1 block per loop to test perf

Signed-off-by: Aaron Teo <redacted>
* Revert "ggml-cpu: back to 1 block per loop to test perf"

This reverts commit 1fe55724e2dc295701101bf838bdd4a512237492.

Signed-off-by: Aaron Teo <redacted>
* ggml-cpu: rm unroll from single block

Signed-off-by: Aaron Teo <redacted>
---------

Signed-off-by: Aaron Teo <redacted>
ggml/src/ggml-cpu/arch-fallback.h
ggml/src/ggml-cpu/arch/s390/quants.c