git.djapps.eu Git - pkg/ggml/sources/ggml/commit
ggml : new Q4 and Q5 quantization formats + backward ops
author Georgi Gerganov <redacted>
Sun, 14 May 2023 08:23:02 +0000 (11:23 +0300)
committer Georgi Gerganov <redacted>
Sun, 14 May 2023 12:18:34 +0000 (15:18 +0300)
commit fe48e22fd65ec4e0b3eb15a0809d6f85d1d6dee8
tree 198bc0835b5fa35190146fbe0077640ec0bc4418
parent effcfa62da543e71affe6c39b78d0064f0c5d71d

sync llama.cpp

- bump GGML_QNT_VERSION -> 1
- increase ggml object overhead size from 256 to 512 in examples
- drop Q4_2 support
- add CUDA tensor backend support
14 files changed:
examples/common-ggml.cpp
examples/dolly-v2/main.cpp
examples/gpt-2/main.cpp
examples/gpt-j/main.cpp
examples/gpt-neox/main.cpp
examples/mnist/main.cpp
examples/starcoder/main.cpp
examples/whisper/quantize.cpp
examples/whisper/whisper.cpp
include/ggml/ggml.h
src/ggml-cuda.cu
src/ggml-cuda.h
src/ggml-opencl.c
src/ggml.c