author    Kerfuffle <redacted>
          Sat, 10 Jun 2023 07:59:17 +0000 (01:59 -0600)
committer GitHub <redacted>
          Sat, 10 Jun 2023 07:59:17 +0000 (10:59 +0300)
commit    4f0154b0bad775ac4651bf73b5c216eb43c45cdc
tree      33a6036c589fd494af7de0cd786e395d4fd3f699
parent    ef3171d16241c18581d4d08374f0b9e396ade6b7
llama : support requantizing models instead of only allowing quantization from 16/32bit (#1691)

* Add support for quantizing already quantized models

* Threaded dequantizing and f16 to f32 conversion

* Clean up thread blocks with spares calculation a bit

* Use std::runtime_error exceptions.
examples/quantize/quantize.cpp
llama.cpp
llama.h