ggml-cpu: arm64: q4_K repack gemm and gemv implementations (i8mm) (llama/16739)
author Alberto Cabrera Pérez <redacted>
Mon, 24 Nov 2025 11:08:11 +0000 (11:08 +0000)
committer Georgi Gerganov <redacted>
Thu, 11 Dec 2025 13:32:44 +0000 (15:32 +0200)
commit dd0edcb1139b31ec049d8d0e4cc9eb7eb18d65e8
tree 7c5c8f934289ed51ae2174d2dd505cf6107e52d4
parent e1c20c1f9b177bdba31c8564ba0651fff3c8cbde

* Enabled q4_K_8x8_q8_K path on ARM

* wip: I8mm qs multiplication, pending bias

* cpu : arm : REPACK gemm q4_K8x8 implementation

Signed-off-by: Alberto Cabrera <redacted>
* Guard gemm with proper features, improved superblock scale and min calc

Signed-off-by: Alberto Cabrera <redacted>
* cpu: arm: Implemented REPACK gemv for Q4_K

Signed-off-by: Alberto Cabrera <redacted>
* Removed completed TODO

* Fixed missing guards when selecting optimal repack type for Q4_K

Signed-off-by: Alberto Cabrera <redacted>
* Fixed macro guard for gemv

* Fixed wrong comment in GEMV

* Fixed warning for unused variable

* vdotq_s32 -> ggml_vdotq_s32

Signed-off-by: Alberto Cabrera <redacted>
* Clang-format issues

* Apply suggestions from code review

Co-authored-by: Diego Devesa <redacted>
* Removed unnecessary GGML_UNUSED

* Fixed guards in q4_k gemm and gemv (repack)
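
For readers unfamiliar with i8mm, the items above that mention "I8mm qs multiplication" and feature guards can be illustrated with a minimal sketch. It is not taken from the commit: `mmla_2x8_s8` and `mmla_demo` are hypothetical names, and the block only shows what the Arm `vmmlaq_s32` intrinsic (SMMLA) computes, with a scalar path for builds where `__ARM_FEATURE_MATMUL_INT8` is not defined. The real kernels in src/ggml-cpu/arch/arm/repack.cpp additionally handle the Q4_K scales and mins, which are omitted here.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__ARM_FEATURE_MATMUL_INT8)
#include <arm_neon.h>
#endif

// Hypothetical helper (an assumption for illustration, not ggml API).
// acc is a 2x2 int32 tile in row-major order; a and b each hold a 2x8
// int8 matrix in row-major order. Computes acc += a * transpose(b),
// which is the operation SMMLA performs on one pair of registers.
static void mmla_2x8_s8(int32_t acc[4], const int8_t a[16], const int8_t b[16]) {
#if defined(__ARM_FEATURE_MATMUL_INT8)
    // Guarded fast path: only compiled when the target advertises i8mm.
    int32x4_t r = vld1q_s32(acc);
    r = vmmlaq_s32(r, vld1q_s8(a), vld1q_s8(b));
    vst1q_s32(acc, r);
#else
    // Scalar reference path with the same result.
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            int32_t sum = 0;
            for (int k = 0; k < 8; ++k) {
                sum += int32_t(a[8 * i + k]) * int32_t(b[8 * j + k]);
            }
            acc[2 * i + j] += sum;
        }
    }
#endif
}

// Small usage example with hand-checked values.
static void mmla_demo() {
    const int8_t a[16] = {1, 2, 3, 4, 5, 6, 7, 8,   // row 0
                          1, 1, 1, 1, 1, 1, 1, 1};  // row 1
    const int8_t b[16] = {2, 2, 2, 2, 2, 2, 2, 2,   // row 0
                          1, 0, 0, 0, 0, 0, 0, -1}; // row 1
    int32_t acc[4] = {10, 0, 0, 0};
    mmla_2x8_s8(acc, a, b);
    assert(acc[0] == 82);  // 10 + 2 * (1 + ... + 8)
    assert(acc[1] == -7);  // 1 * 1 + 8 * (-1)
    assert(acc[2] == 16);  // 8 * 2
    assert(acc[3] == 0);   // 1 - 1
}
```

Keeping the intrinsic behind the feature macro is the point of the guard-related fixes listed above: a build without i8mm must never see the i8mm intrinsic, and should take a fallback path instead.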

---------

Signed-off-by: Alberto Cabrera <redacted>
Co-authored-by: Diego Devesa <redacted>
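
As background for the "vdotq_s32 -> ggml_vdotq_s32" item in the message above: ggml routes the NEON dot product through its own wrapper so the code still builds on targets without the dot-product extension. The sketch below only illustrates that idea under stated assumptions. `dot4_s8` is a hypothetical name, not the ggml helper, and the scalar path is a plain reference rather than the fallback ggml actually uses.

```cpp
#include <cstdint>

#if defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

// Hypothetical helper (an assumption for illustration, not ggml API).
// For each of the four int32 lanes i:
//   acc[i] += a[4i] * b[4i] + a[4i+1] * b[4i+1] + a[4i+2] * b[4i+2] + a[4i+3] * b[4i+3]
// which is what the SDOT instruction (vdotq_s32) computes.
static void dot4_s8(int32_t acc[4], const int8_t a[16], const int8_t b[16]) {
#if defined(__ARM_FEATURE_DOTPROD)
    // Fast path: a single SDOT when the target supports it.
    int32x4_t r = vld1q_s32(acc);
    r = vdotq_s32(r, vld1q_s8(a), vld1q_s8(b));
    vst1q_s32(acc, r);
#else
    // Portable scalar path with the same per-lane result.
    for (int i = 0; i < 4; ++i) {
        int32_t sum = 0;
        for (int k = 0; k < 4; ++k) {
            sum += int32_t(a[4 * i + k]) * int32_t(b[4 * i + k]);
        }
        acc[i] += sum;
    }
#endif
}
```

Callers use a single name and the preprocessor picks the implementation, so kernel code does not need its own `#if` around every dot product.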
src/ggml-cpu/arch-fallback.h
src/ggml-cpu/arch/arm/repack.cpp
src/ggml-cpu/repack.cpp