git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Douglas Hanley <redacted>
	Wed, 28 Feb 2024 08:51:11 +0000 (02:51 -0600)
committer	GitHub <redacted>
	Wed, 28 Feb 2024 08:51:11 +0000 (10:51 +0200)
commit	177628bfd85565070916ad66a5ac4071ee0527d8
tree	1532ad96e287a0d8bff4aef92bf2e04eabecec9e
parent	6c4416868df2e5455da7d20547f62bcf9735ba8e

llama : improve BERT tokenization (#5740)

* implement nfd for stripping accents in wpm tokenizer

* sort nfd map; reuse iterator

* use builtin tolower

* add locale include

* Simplify to_lower cases

Co-authored-by: Jared Van Bortel <redacted>
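
The bullets above describe three ideas: decompose accented code points (NFD) so the accent can be dropped, keep the lookup table sorted so it can be binary-searched with the resulting iterator reused for the hit test, and lowercase with the built-in `tolower`. The following is a minimal sketch of that approach, not the code from this commit. The table `k_nfd_base` and the functions `strip_accent`, `to_lower_ascii`, and `normalize` are hypothetical names, the table holds only a handful of entries for illustration, and it maps each composed code point straight to its base letter rather than to a full NFD sequence.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, heavily truncated table: (composed code point, base code point).
// It must stay sorted by the first element so it can be binary-searched.
static const std::vector<std::pair<uint32_t, uint32_t>> k_nfd_base = {
    {0x00C0, 0x0041}, // À -> A
    {0x00C9, 0x0045}, // É -> E
    {0x00E0, 0x0061}, // à -> a
    {0x00E9, 0x0065}, // é -> e
    {0x00F1, 0x006E}, // ñ -> n
    {0x00FC, 0x0075}, // ü -> u
};

// Return the base code point for an accented one, or cp unchanged if unknown.
static uint32_t strip_accent(uint32_t cp) {
    // Binary search over the sorted table.
    auto it = std::lower_bound(
        k_nfd_base.begin(), k_nfd_base.end(), cp,
        [](const std::pair<uint32_t, uint32_t> & entry, uint32_t value) {
            return entry.first < value;
        });
    // Reuse the same iterator for both the hit test and the result.
    if (it != k_nfd_base.end() && it->first == cp) {
        return it->second;
    }
    return cp;
}

// Lowercase ASCII code points with the built-in tolower; leave the rest alone.
static uint32_t to_lower_ascii(uint32_t cp) {
    if (cp < 0x80) {
        return static_cast<uint32_t>(std::tolower(static_cast<unsigned char>(cp)));
    }
    return cp;
}

// Strip accents, then lowercase, one code point at a time.
static std::vector<uint32_t> normalize(const std::vector<uint32_t> & cps) {
    std::vector<uint32_t> out;
    out.reserve(cps.size());
    for (uint32_t cp : cps) {
        out.push_back(to_lower_ascii(strip_accent(cp)));
    }
    return out;
}
```

Calling `normalize` on the code points of "Éco" (`0x00C9, 0x0063, 0x006F`) yields the code points of "eco". A full implementation would also need to handle code points whose decomposition contains several combining marks, which this sketch ignores.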

llama.cpp
unicode.h

Packaging of ggml-org/llama.cpp