git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Add support for BERT embedding models (#5423)
author Douglas Hanley <redacted>
Sun, 11 Feb 2024 16:21:38 +0000 (10:21 -0600)
committer GitHub <redacted>
Sun, 11 Feb 2024 16:21:38 +0000 (11:21 -0500)
commit 2891c8aa9af17f4ff636ff3868bc34ff72b56e25
tree 1a037e8ad635aa54ddf8ab8cb39c04bb4f8cf141
parent 97a336507ed9b971d72262bec7e2b8b7016a054a

* BERT model graph construction (build_bert)
* WordPiece tokenizer (llm_tokenize_wpm)
* Add flag for non-causal attention models
* Allow for models that only output embeddings
* Support conversion of BERT models to GGUF
* Based on prior work by @xyzhang626 and @skeskinen
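
For readers unfamiliar with WordPiece, the sketch below shows the greedy longest-match-first splitting used by BERT-style tokenizers. It is an illustration only, not the code added in this commit: the function name `wordpiece_tokenize`, the toy vocabulary, and the `##` continuation prefix and `[UNK]` token are assumptions taken from the usual BERT convention, and `llm_tokenize_wpm` may represent word boundaries differently.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]", prefix="##"):
    """Greedy longest-match-first split of a single word into vocab pieces.

    Hypothetical helper for illustration; not part of llama.cpp.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, shrinking from the right
        # until a piece is found in the vocabulary.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                # Pieces that do not begin the word carry the continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary (assumed for the example, not taken from any real model).
toy_vocab = {"un", "##aff", "##able", "embed", "##ding", "##s"}

print(wordpiece_tokenize("unaffable", toy_vocab))   # ['un', '##aff', '##able']
print(wordpiece_tokenize("embeddings", toy_vocab))  # ['embed', '##ding', '##s']
print(wordpiece_tokenize("xyz", toy_vocab))         # ['[UNK]']
```

A full tokenizer would also lowercase and split the input on whitespace and punctuation before applying this per-word step; that preprocessing is omitted here.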

---------

Co-authored-by: Jared Van Bortel <redacted>
Co-authored-by: Georgi Gerganov <redacted>
.flake8
convert-hf-to-gguf.py
examples/embedding/embedding.cpp
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
gguf-py/gguf/tensor_mapping.py
llama.cpp
llama.h