git.djapps.eu Git - pkg/ggml/sources/ggml/commit
ggml : refactor online repacking (llama/10446)
author Djip007 <redacted>
Sat, 7 Dec 2024 12:37:50 +0000 (13:37 +0100)
committer Georgi Gerganov <redacted>
Tue, 10 Dec 2024 16:33:03 +0000 (18:33 +0200)
commit b3a469a1f10b4501cd765949ddba025cff6fffac
tree 9643dabd9574789185dae542a6155c15df484b25
parent 72f04e91375ca8a4350ba5afe739308b3b20ba18

* rename ggml-cpu-aarch64.c to .cpp

* reformat extra cpu backend.

- clean Q4_0_N_M and IQ4_0_N_M
  - remove from "file" tensor type
  - allow only with dynamic repack

- extract cpu extra bufts and convert to C++
  - hbm
  - "aarch64"

- more generic use of extra buffer
  - generalise extra_supports_op
  - new API for "cpu-accel":
     - amx
     - aarch64

* clang-format

* Clean Q4_0_N_M ref

Enable restrict on C++

* add op GGML_OP_MUL_MAT_ID for Q4_0_N_M with runtime repack

* added/corrected control on tensor size for Q4 repacking.

* Update ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp

Co-authored-by: Georgi Gerganov <redacted>

* Update ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp

Co-authored-by: Georgi Gerganov <redacted>

* add debug logs on repacks.

---------

Co-authored-by: Georgi Gerganov <redacted>
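
The "more generic use of extra buffer" item above describes letting each CPU accelerator answer a generalised supports-op query. A minimal sketch of that dispatch pattern follows. It is not the code from this commit: every type and function name here (ExtraAccel, RepackAccel, TileAccel, pick_backend) is hypothetical, and the conditions inside each supports_op are illustrative assumptions only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names come from ggml.
enum class Op { MulMat, MulMatId, Add };

struct Tensor {
    Op   op;
    bool repacked;  // assumed flag: weights were repacked at load time
};

// One interface per "extra" accelerator: each decides for itself
// whether it can handle a given operation.
struct ExtraAccel {
    virtual ~ExtraAccel() = default;
    virtual const char * name() const = 0;
    virtual bool supports_op(const Tensor & t) const = 0;
};

// Assumed behaviour: handles matrix multiplies on repacked weights only.
struct RepackAccel : ExtraAccel {
    const char * name() const override { return "repack"; }
    bool supports_op(const Tensor & t) const override {
        return t.repacked && (t.op == Op::MulMat || t.op == Op::MulMatId);
    }
};

// Assumed behaviour: handles plain matrix multiplies on unrepacked weights.
struct TileAccel : ExtraAccel {
    const char * name() const override { return "tile"; }
    bool supports_op(const Tensor & t) const override {
        return !t.repacked && t.op == Op::MulMat;
    }
};

// Ask each registered accelerator in order; fall back to the generic
// CPU path when none of them claims the operation.
std::string pick_backend(const std::vector<const ExtraAccel *> & accels,
                         const Tensor & t) {
    for (const ExtraAccel * a : accels) {
        assert(a != nullptr);
        if (a->supports_op(t)) {
            return a->name();
        }
    }
    return "generic";
}
```

With both accelerators registered, a repacked MulMatId tensor would be routed to "repack", an unrepacked MulMat to "tile", and an Add to "generic". The point of the pattern is that the dispatcher needs no accelerator-specific branches.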
23 files changed:
include/ggml-cpu.h
include/ggml.h
src/CMakeLists.txt
src/ggml-cann/ggml-cann.cpp
src/ggml-common.h
src/ggml-cpu/CMakeLists.txt
src/ggml-cpu/amx/amx.cpp
src/ggml-cpu/amx/amx.h
src/ggml-cpu/amx/common.h
src/ggml-cpu/amx/mmq.cpp
src/ggml-cpu/amx/mmq.h
src/ggml-cpu/ggml-cpu-aarch64.cpp [new file with mode: 0644]
src/ggml-cpu/ggml-cpu-aarch64.h
src/ggml-cpu/ggml-cpu-hbm.cpp [new file with mode: 0644]
src/ggml-cpu/ggml-cpu-hbm.h [new file with mode: 0644]
src/ggml-cpu/ggml-cpu-traits.cpp [new file with mode: 0644]
src/ggml-cpu/ggml-cpu-traits.h [new file with mode: 0644]
src/ggml-cpu/ggml-cpu.c
src/ggml-cpu/ggml-cpu.cpp
src/ggml-cuda/ggml-cuda.cu
src/ggml-quants.c
src/ggml-sycl/ggml-sycl.cpp
src/ggml.c