git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	unbounded <redacted>
	Tue, 25 Apr 2023 17:20:46 +0000 (19:20 +0200)
committer	GitHub <redacted>
	Tue, 25 Apr 2023 17:20:46 +0000 (20:20 +0300)
commit	dd0eabc049fb1efc631cab8eb0a646808d704e18
tree	23a35354481ec346c4501937b95612a19fff9d21
parent	54bb60e26858be251a0eb3cb70f80322aff804a0

ggml : use full range for Q4_0 and Q4_2 quantization (#729)

* Use full range for q4_0 quantization

By keeping the sign of the highest-magnitude value, we can make sure
that value maps to -8, which is currently unused.
This is a bit of a freebie since it is fully backwards compatible with
the current format.

* Update quantize_row_q4_0 for AVX/AVX2

* Update quantize_row_q4_0 for WASM

Untested

* Update quantize_row_q4_0 for Arm NEON

* Update quantize_row_q4_0 for PowerPC

Untested

* Use full range for q4_2 quantization
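
The full-range idea from the first bullet can be sketched in plain C as below. This is a minimal illustration, not the code from this commit. All names (`SKETCH_QK`, `sketch_round`, `sketch_quantize_block`, `sketch_dequantize`) are hypothetical. The block size of 32 and the storage of one 4-bit code per byte are simplifying assumptions; the real format packs two codes per byte. The scale is the signed highest-magnitude value divided by -8, so that value lands exactly on the -8 level.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_QK 32 /* values per block; illustrative name and size */

/* Round half away from zero, written out so the sketch needs no libm. */
static int sketch_round(float v) {
    return (int)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/*
 * Quantize one block of SKETCH_QK floats to 4-bit codes in [0, 15].
 * A code c represents the signed level (c - 8), so levels span [-8, 7].
 * Returns the scale d such that x[i] is approximated by d * (q[i] - 8).
 */
static float sketch_quantize_block(const float *x, uint8_t *q) {
    assert(x && q);

    float amax = 0.0f; /* largest magnitude seen so far */
    float max  = 0.0f; /* the same value with its sign kept */
    for (int i = 0; i < SKETCH_QK; i++) {
        const float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) {
            amax = a;
            max  = x[i];
        }
    }

    /* Dividing by -8 makes the highest-magnitude value map to level -8. */
    const float d  = max / -8.0f;
    const float id = (d != 0.0f) ? 1.0f / d : 0.0f;

    for (int i = 0; i < SKETCH_QK; i++) {
        int v = sketch_round(x[i] * id) + 8;
        if (v > 15) v = 15; /* level +8 does not exist; clamp to +7 */
        if (v < 0)  v = 0;  /* defensive; not expected to trigger */
        q[i] = (uint8_t)v;
    }
    return d;
}

/* Reconstruct one value from its code and the block scale. */
static float sketch_dequantize(float d, uint8_t code) {
    return d * (float)((int)code - 8);
}
```

In this sketch the highest-magnitude value is reconstructed exactly, while a value of equal magnitude and opposite sign is clamped to level 7. Decoding is still scale times (code - 8), which is consistent with the backwards compatibility noted in the commit message.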