git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit

author	Kawrakow <redacted>
	Mon, 5 Feb 2024 08:46:06 +0000 (10:46 +0200)
committer	Georgi Gerganov <redacted>
	Sat, 10 Feb 2024 07:55:46 +0000 (09:55 +0200)
commit	0ed762d691cb6a211b7af6496b3ebaa70e1b848a
tree	5558d0f1cc14e83ad79aed59ddc4657b0b2c627d
parent	1b5bb7792e9fea541dec1e3430a559f8de28f3c8

iq2_xxs: tune quantization (llama/5320)

We get slightly better perplexity (PPL), and we cut quantization time
nearly in half.

The trick is to first quantize without forcing points onto the E8 lattice.
We can then use a narrower search range around the block scale obtained
that way when doing the lattice-constrained quantization.
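
The two-stage idea can be sketched in a few lines of C. This is only a toy
illustration and not the code in ggml-quants.c: the function names, the block
size, the 5% scale step, the search widths (9 and 2), and the "sum of levels
must be even" rule, which stands in for the real E8-lattice grid lookup, are
all assumptions made for the example. Signs are ignored and only magnitudes
are quantized.

```c
#include <assert.h>
#include <float.h>

#define BLOCK 8      /* toy block size (assumption) */
#define MAX_LEVEL 3  /* level l stands for magnitude 2*l + 1, i.e. 1, 3, 5, 7 */

/* Round a magnitude to the nearest allowed level for a given inverse scale. */
static int nearest_level(float ax, float inv_scale) {
    float v = 0.5f * (ax * inv_scale - 1.0f);
    if (v < 0.0f) return 0;
    int l = (int)(v + 0.5f);
    return l > MAX_LEVEL ? MAX_LEVEL : l;
}

/* Toy stand-in for the lattice constraint: the sum of the levels must be even.
 * If it is odd, move one level by +-1, choosing the cheapest such move. */
static void project_even(const float *ax, float scale, int *lv) {
    int sum = 0;
    for (int i = 0; i < BLOCK; ++i) sum += lv[i];
    if ((sum & 1) == 0) return;
    int best_i = 0, best_d = 0;
    float best_cost = FLT_MAX;
    for (int i = 0; i < BLOCK; ++i) {
        for (int d = -1; d <= 1; d += 2) {
            int l = lv[i] + d;
            if (l < 0 || l > MAX_LEVEL) continue;
            float e_new = ax[i] - scale * (float)(2 * l + 1);
            float e_old = ax[i] - scale * (float)(2 * lv[i] + 1);
            float cost = e_new * e_new - e_old * e_old;
            if (cost < best_cost) { best_cost = cost; best_i = i; best_d = d; }
        }
    }
    lv[best_i] += best_d;
}

/* Quantize with one scale and return the squared reconstruction error. */
static float quant_error(const float *ax, float scale, int constrained, int *lv) {
    float inv = 1.0f / scale, err = 0.0f;
    for (int i = 0; i < BLOCK; ++i) lv[i] = nearest_level(ax[i], inv);
    if (constrained) project_even(ax, scale, lv);
    for (int i = 0; i < BLOCK; ++i) {
        float d = ax[i] - scale * (float)(2 * lv[i] + 1);
        err += d * d;
    }
    return err;
}

/* Try scales center * (1 + 0.05 * k) for k in [-steps, steps]; keep the best. */
static float search_scale(const float *ax, float center, int steps,
                          int constrained, int *lv, int *evals) {
    float best_scale = center, best_err = FLT_MAX;
    int tmp[BLOCK];
    for (int k = -steps; k <= steps; ++k) {
        float s = center * (1.0f + 0.05f * (float)k);
        float e = quant_error(ax, s, constrained, tmp);
        ++*evals;
        if (e < best_err) {
            best_err = e;
            best_scale = s;
            for (int i = 0; i < BLOCK; ++i) lv[i] = tmp[i];
        }
    }
    return best_scale;
}

/* Stage 1: wide, cheap search without the constraint.
 * Stage 2: narrow search with the constraint, centered on the stage-1 scale. */
float two_stage_quantize(const float *x, int *lv, int *constrained_evals) {
    float ax[BLOCK], amax = 0.0f;
    for (int i = 0; i < BLOCK; ++i) {
        ax[i] = x[i] < 0.0f ? -x[i] : x[i];
        if (ax[i] > amax) amax = ax[i];
    }
    assert(amax > 0.0f);  /* an all-zero block needs no scale search */
    int free_evals = 0;
    float start = amax / (float)(2 * MAX_LEVEL + 1);
    float s_free = search_scale(ax, start, 9, 0, lv, &free_evals);
    *constrained_evals = 0;
    return search_scale(ax, s_free, 2, 1, lv, constrained_evals);
}
```

In this sketch the expensive constrained step runs for only 5 candidate scales
instead of 19, which is the kind of saving the commit message describes; the
actual ranges and grid search used in ggml-quants.c may differ.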

Co-authored-by: Iwan Kawrakow <redacted>

ggml-quants.c

Packaging of ggerganov/whisper.cpp