model-conversion : add qat-q4 quantization targets (#15588)
author    Daniel Bevenius <redacted>
          Tue, 26 Aug 2025 14:12:29 +0000 (16:12 +0200)
committer GitHub <redacted>
          Tue, 26 Aug 2025 14:12:29 +0000 (16:12 +0200)
commit    62cef26ac5b6b7acb635d3dc963813b43952dc2b
tree      98c35310ea73becd75eb98a7aa437eebcc9b1f8d
parent    8f5afa94c4f929da71f560db7c9f38ef6a783d95
model-conversion : add qat-q4 quantization targets (#15588)

This commit adds two targets to the Makefile for quantizing Quantization
Aware Trained (QAT) models to Q4_0 format.

The motivation for this is that these targets set the token embedding and
output tensor data types to Q8_0 instead of the default Q6_K. This is
something we wish to enforce for QAT Q4_0 models that are uploaded to
ggml-org on Hugging Face, to guarantee the best quality.
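For context, a minimal sketch of the kind of llama-quantize invocation such
a target would wrap (the model file names below are placeholders, not taken
from this commit); the --token-embedding-type and --output-tensor-type flags
are what override the default Q6_K for those tensors:

    # Quantize a QAT model to Q4_0 while forcing the token embedding and
    # output tensors to Q8_0 (input/output file names are placeholders).
    ./build/bin/llama-quantize \
        --token-embedding-type q8_0 \
        --output-tensor-type q8_0 \
        model-f16.gguf model-qat-q4_0.gguf Q4_0

Files changed: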
examples/model-conversion/Makefile
examples/model-conversion/README.md
examples/model-conversion/scripts/utils/quantize.sh