git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
ggml-cpu: arm64: q4_K repack gemm and gemv implementations (i8mm) (llama/16739)
author Alberto Cabrera Pérez <redacted>
Mon, 24 Nov 2025 11:08:11 +0000 (11:08 +0000)
committer Georgi Gerganov <redacted>
Fri, 12 Dec 2025 15:53:07 +0000 (17:53 +0200)
commit f4ede89d24ae06af1e3b458a565c6f37012144b7
tree d57d14730819335af7c0e87196750c84e91af2a7
parent faf37ffe763fa9e271dafce780bcae0b5d21d3f8

* Enabled q4_K_8x8_q8_K path on ARM

* wip: I8mm qs multiplication, pending bias

* cpu : arm : REPACK gemm q4_K8x8 implementation

Signed-off-by: Alberto Cabrera <redacted>
* Guard gemm with proper features, improved superblock scale and min calc

Signed-off-by: Alberto Cabrera <redacted>
* cpu: arm: Implemented REPACK gemv for Q4_K

Signed-off-by: Alberto Cabrera <redacted>
* Removed completed TODO

* Fixed missing guards when selecting optimal repack type for Q4_K

Signed-off-by: Alberto Cabrera <redacted>
* Fixed macro guard for gemv

* Fixed wrong comment in GEMV

* Fixed warning for unused variable

* vdotq_s32 -> ggml_vdotq_s32

Signed-off-by: Alberto Cabrera <redacted>
* Clang-format issues

* Apply suggestions from code review

Co-authored-by: Diego Devesa <redacted>
* Removed unnecessary GGML_UNUSED

* Fixed guards in q4_k gemm and gemv (repack)

---------

Signed-off-by: Alberto Cabrera <redacted>
Co-authored-by: Diego Devesa <redacted>
ggml/src/ggml-cpu/arch-fallback.h
ggml/src/ggml-cpu/arch/arm/repack.cpp
ggml/src/ggml-cpu/repack.cpp
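
Several items in the message above concern feature guards and the switch from `vdotq_s32` to the `ggml_vdotq_s32` wrapper. As a rough illustration of the general pattern (not the code from this commit), the sketch below uses the NEON dot-product intrinsic when the compiler advertises `__ARM_FEATURE_DOTPROD` and otherwise falls back to a scalar loop with the same semantics. The helper name `dot4x4_accumulate` is hypothetical and is not part of ggml.

```cpp
#include <cstdint>

#if defined(__ARM_NEON) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

// Hypothetical helper (not part of ggml): adds to each of the four 32-bit
// lanes of acc the dot product of four consecutive int8 pairs from a and b.
inline void dot4x4_accumulate(int32_t acc[4], const int8_t a[16], const int8_t b[16]) {
#if defined(__ARM_NEON) && defined(__ARM_FEATURE_DOTPROD)
    // Hardware path: one instruction performs all sixteen multiply-adds.
    int32x4_t r = vld1q_s32(acc);
    r = vdotq_s32(r, vld1q_s8(a), vld1q_s8(b));
    vst1q_s32(acc, r);
#else
    // Portable path with the same lane-wise semantics.
    for (int lane = 0; lane < 4; ++lane) {
        int32_t sum = 0;
        for (int k = 0; k < 4; ++k) {
            sum += int32_t(a[4 * lane + k]) * int32_t(b[4 * lane + k]);
        }
        acc[lane] += sum;
    }
#endif
}
```

Keeping the guard inside a single wrapper means callers do not repeat the preprocessor checks, which is presumably the motivation for routing calls through a wrapper rather than the raw intrinsic.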