git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
convert : fix encoding of WPM vocab for BERT models (#18500)
author o7si <redacted>
Thu, 1 Jan 2026 17:27:07 +0000 (01:27 +0800)
committer GitHub <redacted>
Thu, 1 Jan 2026 17:27:07 +0000 (18:27 +0100)
commit 2b2afade9fb38b8d699ed561d20a259561c00fc3
tree 94936a7af6081899d5ba264544300e0eddd86142
parent f4f501925418ae38bb1e2d8c5e054c436686a782

* convert: avoid token collision when stripping ## prefix

* convert: use token types for BERT special tokens check

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <redacted>
---------

Co-authored-by: Sigbjørn Skjæret <redacted>
convert_hf_to_gguf.py
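
The diff itself is not reproduced on this page, so the following is only a minimal sketch of the idea the commit message describes, not the code from convert_hf_to_gguf.py. It assumes the common WordPiece convention in which continuation pieces carry a `##` prefix and word-initial pieces are marked with `▁` (U+2581) on conversion. The names `TokenType`, `encode_by_brackets`, and `encode_by_type` are hypothetical, and the sample vocabulary is invented to show one way a collision could arise: if special tokens are recognized by their `[...]` shape, an ordinary token such as `[1]` is left unprefixed and becomes identical to `##[1]` after its prefix is stripped. Checking the token type instead keeps the two distinct.

```python
from enum import IntEnum


class TokenType(IntEnum):
    # Hypothetical stand-in for a converter's token-type enum.
    NORMAL = 1
    CONTROL = 3


def encode_by_brackets(tok: str) -> str:
    # Shape-based check: anything that looks like "[...]" is treated as special.
    if tok.startswith("[") and tok.endswith("]"):
        return tok
    if tok.startswith("##"):
        return tok[2:]          # continuation piece: drop the "##" prefix
    return "\u2581" + tok       # word-initial piece: add the "▁" marker


def encode_by_type(tok: str, toktype: TokenType) -> str:
    # Type-based check: only tokens tagged CONTROL are left untouched.
    if toktype == TokenType.CONTROL:
        return tok
    if tok.startswith("##"):
        return tok[2:]
    return "\u2581" + tok


# Invented sample vocabulary (assumption, not taken from any real model).
vocab = [
    ("[CLS]", TokenType.CONTROL),
    ("[1]", TokenType.NORMAL),
    ("##[1]", TokenType.NORMAL),
    ("play", TokenType.NORMAL),
    ("##ing", TokenType.NORMAL),
]

by_brackets = [encode_by_brackets(tok) for tok, _ in vocab]
by_type = [encode_by_type(tok, toktype) for tok, toktype in vocab]

# The shape-based variant maps "[1]" and "##[1]" to the same string;
# the type-based variant keeps every encoded token unique.
print(len(set(by_brackets)) == len(by_brackets))
print(len(set(by_type)) == len(by_type))
```

In this sketch the only behavioral difference between the two functions is how a token qualifies as special, which is the distinction the second bullet of the commit message points at.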