git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
tokenizer : special token handling (#3538)
author staviq <redacted>
Tue, 17 Oct 2023 15:11:01 +0000 (17:11 +0200)
committer GitHub <redacted>
Tue, 17 Oct 2023 15:11:01 +0000 (18:11 +0300)
commit 1a159553f921a9209fed8c714494e57b3649f232
tree b880614b6be6541d1890db725a7292fccef93855
parent 281ef73c258cc1eebec8a64264240432d5878c4b

* Rewrite special token handling from #1931

* shorten param name, add st verification by type

* use offsets instead of copy by substr

* formatting, remove copying iterator on delete

* llama : normalize code-style

* swift fix

* print pfx/sfx if verb, main: split pfx input sfx

* don't add space when using special tokens

* minor : comment + spacing

---------

Co-authored-by: Georgi Gerganov <redacted>
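
The "use offsets instead of copy by substr" item above can be illustrated with a minimal sketch. This is not the code from this commit: `Fragment` and `partition_special` are hypothetical names, and the approach shown (splitting the input around each special token string and recording raw-text spans as offset/length pairs into the original buffer) is only an assumption about the general technique the message describes.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical fragment type (not taken from llama.cpp): either a span of raw
// text or one special token. Both are described by an offset and a length into
// the original string, so no substring copies are made.
struct Fragment {
    bool        is_special;
    int         token_id;  // meaningful only when is_special is true
    std::size_t offset;    // start of the span in the source text
    std::size_t length;    // length of the span in bytes
};

// Split `text` around every occurrence of each special token string.
// `specials` maps a token's text to an illustrative token id.
std::vector<Fragment> partition_special(
        const std::string & text,
        const std::vector<std::pair<std::string, int>> & specials) {
    std::vector<Fragment> frags;
    if (!text.empty()) {
        frags.push_back({ false, -1, 0, text.size() });
    }

    for (const auto & sp : specials) {
        const std::string & tok = sp.first;
        if (tok.empty()) {
            continue;
        }

        std::vector<Fragment> next;
        for (const Fragment & f : frags) {
            if (f.is_special) {
                next.push_back(f); // already resolved, keep as-is
                continue;
            }

            std::size_t       pos = f.offset;
            const std::size_t end = f.offset + f.length;

            while (pos < end) {
                const std::size_t hit = text.find(tok, pos);
                // Stop when there is no match fully inside this fragment.
                if (hit == std::string::npos || hit + tok.size() > end) {
                    break;
                }
                if (hit > pos) {
                    next.push_back({ false, -1, pos, hit - pos }); // text before the match
                }
                next.push_back({ true, sp.second, hit, tok.size() });
                pos = hit + tok.size();
            }
            if (pos < end) {
                next.push_back({ false, -1, pos, end - pos }); // trailing raw text
            }
        }
        frags.swap(next);
    }
    return frags;
}
```

In this sketch, a caller would pass each raw-text span to the ordinary tokenizer and emit the recorded id directly for each special fragment. Keeping offsets rather than copies means the raw spans can be read from the original string only when they are actually tokenized.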
common/common.cpp
common/common.h
common/train.cpp
examples/batched.swift/Sources/main.swift
examples/main/main.cpp
llama.cpp
llama.h