llama : validate special token ids are in range when loading GGUF model (#3635)
author    Kerfuffle <redacted>
          Sun, 22 Oct 2023 18:14:56 +0000 (12:14 -0600)
committer GitHub <redacted>
          Sun, 22 Oct 2023 18:14:56 +0000 (21:14 +0300)
commit a5e7dbd6141128bfa3c40a19c2945a181df625d3
tree   14cb15291418d4f591d7a58d8239eb02b966b595
parent d3956aea53369455008159cc405ed4c496976692

* Add validation for special token ids to llama.cpp

Small optimization for llama_byte_to_token SPM mode
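
The validation this bullet describes can be sketched as a simple range check (the function name and signature below are illustrative assumptions, not the actual llama.cpp internals): a special token id read from GGUF metadata is only usable if it indexes into the loaded vocabulary.

```python
# Hedged sketch of the range check; names are hypothetical, not llama.cpp's.
def is_valid_special_token_id(token_id, n_vocab):
    """Return True when token_id is a usable index into a vocab of size n_vocab."""
    return token_id is not None and 0 <= token_id < n_vocab

print(is_valid_special_token_id(1, 32000))      # an in-range id is accepted
print(is_valid_special_token_id(40000, 32000))  # an out-of-range id is rejected
```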

* Fix BPE newline check, only I could break something so simple

* Killll meeeeee

* Account for GGUF_KEY_KEY only setting when the key exists

* Minor code cleanups.

* Fix convert.py error msg when added tokens are out of range
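
A hypothetical illustration of the kind of check behind that error message (the range convention and wording are assumptions, not convert.py's exact code): added tokens are expected to occupy the id range immediately after the base vocabulary.

```python
# Illustrative sketch only; added token ids are assumed to follow the
# base vocabulary contiguously.
def check_added_token_range(added_ids, base_vocab_size):
    expected_end = base_vocab_size + len(added_ids)
    for token_id in added_ids:
        if not (base_vocab_size <= token_id < expected_end):
            raise ValueError(
                f"Expected added token ids in range "
                f"{base_vocab_size} - {expected_end - 1}; got {token_id}")
```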

* Make gguf SpecialVocab vocab size-aware

Update conversion scripts accordingly
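
One way to read "vocab size-aware" (a minimal sketch assuming a hypothetical class, not the real gguf.SpecialVocab API): the container knows the vocabulary size and refuses special token ids that fall outside it, instead of writing an invalid id into the GGUF file.

```python
# Illustrative sketch only; class, attribute, and method names are assumptions.
class SizeAwareSpecialVocab:
    def __init__(self, n_vocab=None):
        self.n_vocab = n_vocab          # None means size unknown: accept all ids
        self.special_token_ids = {}

    def set_special_token(self, name, token_id):
        # Drop ids that cannot index the vocabulary rather than emit them.
        if self.n_vocab is not None and not (0 <= token_id < self.n_vocab):
            return False
        self.special_token_ids[name] = token_id
        return True
```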

* Avoid a string copy

Co-authored-by: Georgi Gerganov <redacted>
---------

Co-authored-by: Georgi Gerganov <redacted>
convert-baichuan-hf-to-gguf.py
convert-bloom-hf-to-gguf.py
convert-falcon-hf-to-gguf.py
convert-gptneox-hf-to-gguf.py
convert-llama-ggml-to-gguf.py
convert-mpt-hf-to-gguf.py
convert-refact-hf-to-gguf.py
convert-starcoder-hf-to-gguf.py
convert.py
gguf-py/gguf/gguf.py
llama.cpp