git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
llama : lookup word in vocab before doing BPE merges (#7193)
author Haoxiang Fei <redacted>
Sat, 11 May 2024 08:12:06 +0000 (16:12 +0800)
committer GitHub <redacted>
Sat, 11 May 2024 08:12:06 +0000 (11:12 +0300)
commit f99e1e456eaf69cc38c1982a2693ce41c0f897ef
tree f6bb7dd98afdc852fa428c77c53bf8e72fb69b5e
parent 5ae3426b0b64672991563d4c28b2018b9f961467

* fix: llama-3 ignore_merges

* test: add test for llama-3 bpe ignore_merges

* fix: set ignore_merges only for llama-3

* fix: test-tokenizer-1-bpe --ignore-merges detection

* fix: copy to fix fallthrough

* fix: change ignore_merges to bool

* fix: add ignore merges tests to cmake

* llama : alternative merge ignore logic

---------

Co-authored-by: Haoxiang Fei <redacted>
Co-authored-by: Georgi Gerganov <redacted>
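
The idea named in the commit title can be illustrated with a minimal, self-contained sketch. It is not the code from llama.cpp: `ToyVocab`, `bpe_merge`, and `tokenize_word` are hypothetical names, the vocabulary and merge table are toy data, and the only thing taken from the commit is the behavior it describes, namely that when an `ignore_merges` flag is set, a word already present in the vocabulary is emitted as a single token before any BPE merges are attempted.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy vocabulary, for illustration only.
struct ToyVocab {
    std::map<std::string, int> token_to_id;                        // token text -> token id
    std::map<std::pair<std::string, std::string>, int> merge_rank; // lower rank merges first
    bool ignore_merges = false;                                    // look up whole word first
};

// Plain BPE: start from single characters and repeatedly merge the
// adjacent pair with the lowest rank until no known pair remains.
std::vector<std::string> bpe_merge(const ToyVocab & vocab, const std::string & word) {
    std::vector<std::string> symbols;
    for (char c : word) {
        symbols.push_back(std::string(1, c));
    }
    while (symbols.size() > 1) {
        int         best_rank = INT_MAX;
        std::size_t best_pos  = 0;
        for (std::size_t i = 0; i + 1 < symbols.size(); ++i) {
            auto it = vocab.merge_rank.find(std::make_pair(symbols[i], symbols[i + 1]));
            if (it != vocab.merge_rank.end() && it->second < best_rank) {
                best_rank = it->second;
                best_pos  = i;
            }
        }
        if (best_rank == INT_MAX) {
            break; // no applicable merge rule is left
        }
        assert(best_pos + 1 < symbols.size());
        symbols[best_pos] += symbols[best_pos + 1];
        symbols.erase(symbols.begin() + static_cast<std::ptrdiff_t>(best_pos) + 1);
    }
    return symbols;
}

// With ignore_merges set, a word that is itself a vocabulary entry is
// returned as one token; otherwise the normal merge loop runs.
std::vector<std::string> tokenize_word(const ToyVocab & vocab, const std::string & word) {
    if (vocab.ignore_merges && vocab.token_to_id.count(word) > 0) {
        return { word };
    }
    return bpe_merge(vocab, word);
}
```

In this toy setup, a vocabulary that contains `abc` but whose only merge rule is (`a`, `b`) splits `abc` into `ab` and `c` when the flag is off, and returns the single token `abc` when it is on. The sketch assumes the lookup matters precisely because some vocabulary entries cannot be reached through the merge table alone.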
llama.cpp
models/ggml-vocab-llama-bpe.gguf.inp
models/ggml-vocab-llama-bpe.gguf.out
tests/CMakeLists.txt
tests/test-tokenizer-1-bpe.cpp