git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
ggml : refactor online repacking (llama/10446)
author Djip007 <redacted>
Sat, 7 Dec 2024 12:37:50 +0000 (13:37 +0100)
committer Georgi Gerganov <redacted>
Wed, 18 Dec 2024 10:52:16 +0000 (12:52 +0200)
commit e990d1b791e7bda546866e82e094a8969ea86c6d
tree 2b259ffa1d8f91c7d025dd321abb4eba0f37fff4
parent 4a6d52efe6ee4ba7702434dfe5f8e7c67a4ebf96

* rename ggml-cpu-aarch64.c to .cpp

* reformat extra cpu backend.

- clean Q4_0_N_M and IQ4_0_N_M
  - remove from "file" tensor type
  - allow only with dynamic repack

- extract cpu extra bufts and convert to C++
  - hbm
  - "aarch64"

- more generic use of extra buffer
  - generalise extra_supports_op
  - new API for "cpu-accel":
     - amx
     - aarch64

* clang-format

* Clean Q4_0_N_M ref

Enable restrict on C++

* add op GGML_OP_MUL_MAT_ID for Q4_0_N_M with runtime repack

* added/corrected control on tensor size for Q4 repacking.

* Update ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp

Co-authored-by: Georgi Gerganov <redacted>

* Update ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp

Co-authored-by: Georgi Gerganov <redacted>

* add debug logs on repacks.

---------

Co-authored-by: Georgi Gerganov <redacted>
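
For readers unfamiliar with the idea behind "runtime repack" and the "control on tensor size" items above, here is a minimal, self-contained sketch. It is not the code from this commit. The names `block_q4`, `block_q4x4`, `can_repack`, `repack_rows_x4` and the constants `QK` and `CHUNK` are hypothetical stand-ins. The block layout is simplified: the scale is a plain float and no bit-flipping of the quants is applied. The sketch only illustrates the general technique: blocks stored row by row are interleaved across 4 consecutive rows when the tensor is loaded, and the shape is checked first so that unsupported sizes are left untouched.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified 4-bit block: one scale plus 32 quants in 16 bytes.
constexpr int QK    = 32;  // quants per block (assumption for this sketch)
constexpr int CHUNK = 4;   // bytes copied from one row before switching rows

struct block_q4 {
    float   d;            // per-block scale (a float here, for simplicity)
    uint8_t qs[QK / 2];   // packed 4-bit quants
};

// Hypothetical interleaved block: the same block column from 4 consecutive rows.
struct block_q4x4 {
    float   d[4];
    uint8_t qs[4 * QK / 2];
};

// Size control: rows must group evenly by 4 and each row must be whole blocks.
bool can_repack(int64_t n_rows, int64_t n_cols) {
    return n_rows > 0 && n_cols > 0 && n_rows % 4 == 0 && n_cols % QK == 0;
}

// Repack row-major blocks into the interleaved layout.
// Returns an empty vector when the shape is not supported.
std::vector<block_q4x4> repack_rows_x4(const std::vector<block_q4> & src,
                                       int64_t n_rows, int64_t n_cols) {
    if (!can_repack(n_rows, n_cols)) {
        return {};
    }
    const int64_t blocks_per_row = n_cols / QK;
    if ((int64_t) src.size() != n_rows * blocks_per_row) {
        return {};
    }

    std::vector<block_q4x4> dst;
    dst.reserve((n_rows / 4) * blocks_per_row);

    for (int64_t r = 0; r < n_rows; r += 4) {
        for (int64_t b = 0; b < blocks_per_row; ++b) {
            block_q4x4 out{};
            // Gather the 4 scales of this block column.
            for (int k = 0; k < 4; ++k) {
                out.d[k] = src[(r + k) * blocks_per_row + b].d;
            }
            // Interleave the quants: CHUNK bytes from row 0, then row 1, ...
            const int n_chunks = 4 * (QK / 2) / CHUNK;
            for (int i = 0; i < n_chunks; ++i) {
                const int src_row = i % 4;
                const int src_off = (i / 4) * CHUNK;
                std::memcpy(&out.qs[i * CHUNK],
                            &src[(r + src_row) * blocks_per_row + b].qs[src_off],
                            CHUNK);
            }
            dst.push_back(out);
        }
    }
    return dst;
}
```

A caller would try `repack_rows_x4` once when the weights are loaded and keep the original layout whenever it returns an empty vector. That fallback is the role a tensor size check plays in this sketch.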
23 files changed:
ggml/include/ggml-cpu.h
ggml/include/ggml.h
ggml/src/CMakeLists.txt
ggml/src/ggml-cann/ggml-cann.cpp
ggml/src/ggml-common.h
ggml/src/ggml-cpu/CMakeLists.txt
ggml/src/ggml-cpu/amx/amx.cpp
ggml/src/ggml-cpu/amx/amx.h
ggml/src/ggml-cpu/amx/common.h
ggml/src/ggml-cpu/amx/mmq.cpp
ggml/src/ggml-cpu/amx/mmq.h
ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu-aarch64.h
ggml/src/ggml-cpu/ggml-cpu-hbm.cpp [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu-hbm.h [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu-traits.cpp [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu-traits.h [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu.c
ggml/src/ggml-cpu/ggml-cpu.cpp
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-quants.c
ggml/src/ggml-sycl/ggml-sycl.cpp
ggml/src/ggml.c
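
The "more generic use of extra buffer" item in the message describes accelerated CPU paths (amx, aarch64) plugging in through a common supports-op query. The sketch below shows one way such a dispatch can be structured. It is an illustration under stated assumptions, not the interface added by this commit: `op_type`, `tensor`, `accel_buffer_type`, `repack_accel` and `dispatch` are all hypothetical names made up for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical operation kinds and tensor description for this sketch.
enum class op_type { mul_mat, mul_mat_id, add };

struct tensor {
    op_type op;
    bool    weights_repacked;  // true if the weights live in a repacked buffer
};

// Hypothetical interface that each accelerated CPU path would implement.
struct accel_buffer_type {
    virtual ~accel_buffer_type() = default;
    virtual const char * name() const = 0;
    virtual bool supports_op(const tensor & t) const = 0;
};

// Example accelerator: handles matrix multiplies on repacked weights only.
struct repack_accel : accel_buffer_type {
    const char * name() const override { return "repack"; }
    bool supports_op(const tensor & t) const override {
        return t.weights_repacked &&
               (t.op == op_type::mul_mat || t.op == op_type::mul_mat_id);
    }
};

// Ask each registered accelerator in order; fall back to the generic path.
std::string dispatch(const std::vector<const accel_buffer_type *> & extras,
                     const tensor & t) {
    for (const accel_buffer_type * e : extras) {
        if (e != nullptr && e->supports_op(t)) {
            return e->name();
        }
    }
    return "generic";
}
```

With a design like this, adding another accelerated path means registering one more object. The generic compute code does not need a per-accelerator branch.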