git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml : refactor online repacking (#10446)
author Djip007 <redacted>
Sat, 7 Dec 2024 12:37:50 +0000 (13:37 +0100)
committer GitHub <redacted>
Sat, 7 Dec 2024 12:37:50 +0000 (14:37 +0200)
commit 19d8762ab61df8286367588a80b9c7db4cb568db
tree 89f09ddb7fb28f87e421be41cbd8d3f3c49d207c
parent c2a16c0bdbe2e51adf318918bad82f0c3e3d6f3b

* rename ggml-cpu-aarch64.c to .cpp

* reformat extra cpu backend.

- clean Q4_0_N_M and IQ4_0_N_M
  - remove from "file" tensor type
  - allow only with dynamic repack

- extract cpu extra bufts and convert to C++
  - hbm
  - "aarch64"

- more generic use of extra buffer
  - generalise extra_supports_op
  - new API for "cpu-accel":
     - amx
     - aarch64

* clang-format

* Clean Q4_0_N_M ref

Enable restrict on C++

* add op GGML_OP_MUL_MAT_ID for Q4_0_N_M with runtime repack

* added/corrected control on tensor size for Q4 repacking.

* Update ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp

Co-authored-by: Georgi Gerganov <redacted>

* Update ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp

Co-authored-by: Georgi Gerganov <redacted>

* add debug logs on repacks.

---------

Co-authored-by: Georgi Gerganov <redacted>
34 files changed:
Makefile
Package.swift
docs/build.md
examples/quantize/README.md
examples/quantize/quantize.cpp
ggml/include/ggml-cpu.h
ggml/include/ggml.h
ggml/src/CMakeLists.txt
ggml/src/ggml-aarch64.c [deleted file]
ggml/src/ggml-aarch64.h [deleted file]
ggml/src/ggml-cann/ggml-cann.cpp
ggml/src/ggml-common.h
ggml/src/ggml-cpu/CMakeLists.txt
ggml/src/ggml-cpu/amx/amx.cpp
ggml/src/ggml-cpu/amx/amx.h
ggml/src/ggml-cpu/amx/common.h
ggml/src/ggml-cpu/amx/mmq.cpp
ggml/src/ggml-cpu/amx/mmq.h
ggml/src/ggml-cpu/ggml-cpu-aarch64.c [deleted file]
ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu-aarch64.h
ggml/src/ggml-cpu/ggml-cpu-hbm.cpp [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu-hbm.h [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu-traits.cpp [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu-traits.h [new file with mode: 0644]
ggml/src/ggml-cpu/ggml-cpu.c
ggml/src/ggml-cpu/ggml-cpu.cpp
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-quants.c
ggml/src/ggml-sycl/ggml-sycl.cpp
ggml/src/ggml.c
gguf-py/gguf/constants.py
include/llama.h
src/llama.cpp