iq2_xxs: tune quantization (llama/5320)
author      Kawrakow <redacted>
            Mon, 5 Feb 2024 08:46:06 +0000 (10:46 +0200)
committer   Georgi Gerganov <redacted>
            Sat, 10 Feb 2024 07:30:58 +0000 (09:30 +0200)
commit      844a39b978ee67630f3499ec2823b487f33e7b3e
tree        3a5dfb286bc911e3bb3d8dd6f492aeb3a7ed0ba1
parent      c589f096a486b5f91a9e0085f573a306a3abe60f
iq2_xxs: tune quantization (llama/5320)

We get slightly better PPL, and we cut quantization time nearly in half.

The trick is to first quantize without forcing points onto the E8 lattice.
We can then use a narrower search range around the block scale obtained
that way.
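
Below is a minimal C sketch of this two-pass idea, not the actual code in
src/ggml-quants.c: the function names (initial_scale, block_error_on_grid,
tuned_scale), the step size, and the plain rounding that stands in for the
E8-lattice projection are all assumptions made purely for illustration.

/* Minimal two-pass scale search sketch (hypothetical names, see note above). */
#include <math.h>
#include <float.h>
#include <stdio.h>

/* Quantize the block with scale d onto a constrained grid and return the
 * squared reconstruction error. Plain rounding with clamping stands in for
 * the E8-lattice projection used by the real iq2_xxs quantizer. */
static float block_error_on_grid(const float * x, int n, float d, int qmax) {
    if (d <= 0.0f) return FLT_MAX;
    float err = 0.0f;
    for (int i = 0; i < n; ++i) {
        float q = roundf(x[i]/d);
        if (q >  qmax) q =  qmax;
        if (q < -qmax) q = -qmax;
        float diff = x[i] - d*q;
        err += diff*diff;
    }
    return err;
}

/* Pass 1: cheap scale estimate with no lattice constraint -- map the largest
 * magnitude in the block to the top quant level. */
static float initial_scale(const float * x, int n, int qmax) {
    float amax = 0.0f;
    for (int i = 0; i < n; ++i) {
        float ax = fabsf(x[i]);
        if (ax > amax) amax = ax;
    }
    return amax/qmax;
}

/* Pass 2: search only a narrow window of candidate scales around the pass-1
 * value, scoring each candidate with the constrained quantizer. */
static float tuned_scale(const float * x, int n, int qmax) {
    const float d0 = initial_scale(x, n, qmax);
    const int nstep = 4;                 /* +/- 4 steps of 2% each (assumed) */
    float best_d   = d0;
    float best_err = block_error_on_grid(x, n, d0, qmax);
    for (int k = -nstep; k <= nstep; ++k) {
        if (k == 0) continue;
        float d   = d0*(1.0f + 0.02f*k);
        float err = block_error_on_grid(x, n, d, qmax);
        if (err < best_err) { best_err = err; best_d = d; }
    }
    return best_d;
}

int main(void) {
    float x[8] = { 0.9f, -1.7f, 2.4f, -0.3f, 1.1f, -2.2f, 0.5f, 1.9f };
    printf("tuned block scale: %g\n", tuned_scale(x, 8, 3));
    return 0;
}

In this sketch the narrow window means the constrained quantizer is evaluated
only a handful of times per block instead of over a wide brute-force sweep,
which is the kind of saving the commit message attributes to the narrower
search range.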

Co-authored-by: Iwan Kawrakow <redacted>
src/ggml-quants.c