author    Kerfuffle <redacted>
          Sat, 10 Jun 2023 07:59:17 +0000 (01:59 -0600)
committer GitHub <redacted>
          Sat, 10 Jun 2023 07:59:17 +0000 (10:59 +0300)
commit    4f0154b0bad775ac4651bf73b5c216eb43c45cdc
tree      33a6036c589fd494af7de0cd786e395d4fd3f699
parent    ef3171d16241c18581d4d08374f0b9e396ade6b7
llama : support requantizing models instead of only allowing quantization from 16/32bit (#1691)

* Add support for quantizing already quantized models

* Threaded dequantizing and f16 to f32 conversion

* Clean up thread blocks with spares calculation a bit

* Use std::runtime_error exceptions.
examples/quantize/quantize.cpp
llama.cpp
llama.h