git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Unicode codepoint flags for custom regexs (#7245)
author jaime-m-p <redacted>
Fri, 17 May 2024 23:09:13 +0000 (01:09 +0200)
committer GitHub <redacted>
Fri, 17 May 2024 23:09:13 +0000 (01:09 +0200)
commit b43272afa29a64dcb8bcf26a96a05bac40792b92
tree 1d5e893fd96c3f56b62f6e1ca2ba1274e69deca9
parent 0fc1e820a9900a3dd08ddd3c6abe6604c53b689b

* Replace CODEPOINT_TYPE_* with codepoint_flags
* Update and bugfix brute force random test
* Deterministic brute force random test
* Unicode normalization NFD
* Get rid of BOM
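
The ideas named in the list above (per-codepoint flags, NFD normalization, and a deterministic random test) can be illustrated with a minimal Python sketch. It is not the code from this commit: the `CodepointFlag` names, the `codepoint_flags` helper, and the `random_codepoints` helper are hypothetical, and the category data comes from Python's standard `unicodedata` module rather than from the generated `unicode-data.cpp` tables.

```python
import random
import unicodedata
from enum import IntFlag


class CodepointFlag(IntFlag):
    # Hypothetical flag names: one bit per property, so a codepoint
    # can carry several properties at once (unlike a single type value).
    UNDEFINED = 1
    NUMBER = 2
    LETTER = 4
    SEPARATOR = 8
    ACCENT_MARK = 16
    PUNCTUATION = 32
    SYMBOL = 64
    CONTROL = 128
    WHITESPACE = 256


# Map the first letter of the Unicode general category to a flag.
_BY_MAJOR_CATEGORY = {
    "N": CodepointFlag.NUMBER,
    "L": CodepointFlag.LETTER,
    "Z": CodepointFlag.SEPARATOR,
    "M": CodepointFlag.ACCENT_MARK,
    "P": CodepointFlag.PUNCTUATION,
    "S": CodepointFlag.SYMBOL,
    "C": CodepointFlag.CONTROL,
}


def codepoint_flags(cpt: int) -> CodepointFlag:
    """Return the combined flags for one codepoint (illustrative only)."""
    ch = chr(cpt)
    category = unicodedata.category(ch)
    if category == "Cn":  # unassigned codepoint or noncharacter
        return CodepointFlag.UNDEFINED
    flags = _BY_MAJOR_CATEGORY[category[0]]
    if ch.isspace():
        flags |= CodepointFlag.WHITESPACE
    return flags


def random_codepoints(seed: int, count: int) -> list[int]:
    """Draw codepoints from a seeded generator so a test run is repeatable."""
    rng = random.Random(seed)
    return [rng.randrange(0x110000) for _ in range(count)]


if __name__ == "__main__":
    # A newline is both a control character and whitespace.
    print(codepoint_flags(ord("\n")))
    # NFD splits a precomposed letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", "\u00e9")
    print([hex(ord(c)) for c in decomposed])
    # The same seed always yields the same sequence.
    print(random_codepoints(42, 4) == random_codepoints(42, 4))
```

Keeping the properties as independent bits lets a regex-style matcher test for a class with a single bitwise AND, for example `codepoint_flags(cpt) & CodepointFlag.WHITESPACE`, while seeding the generator makes any failing input reproducible.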
llama.cpp
scripts/gen-unicode-data.py
tests/test-tokenizer-random.py
unicode-data.cpp
unicode-data.h
unicode.cpp
unicode.h