git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Douglas Hanley <redacted>
	Sun, 11 Feb 2024 16:21:38 +0000 (10:21 -0600)
committer	GitHub <redacted>
	Sun, 11 Feb 2024 16:21:38 +0000 (11:21 -0500)
commit	2891c8aa9af17f4ff636ff3868bc34ff72b56e25
tree	1a037e8ad635aa54ddf8ab8cb39c04bb4f8cf141
parent	97a336507ed9b971d72262bec7e2b8b7016a054a

Add support for BERT embedding models (#5423)

* BERT model graph construction (build_bert)
* WordPiece tokenizer (llm_tokenize_wpm)
* Add flag for non-causal attention models
* Allow for models that only output embeddings
* Support conversion of BERT models to GGUF
* Based on prior work by @xyzhang626 and @skeskinen
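
For readers unfamiliar with WordPiece, the following is a minimal Python sketch of the greedy longest-match-first scheme used by BERT-style tokenizers. It is an illustration only and is not taken from this commit: the function name `wordpiece_tokenize`, the `##` continuation prefix, and the `[UNK]` token follow the original BERT convention and are assumptions here; the actual `llm_tokenize_wpm` in llama.cpp may represent word boundaries differently and also handles normalization and pre-splitting, which this sketch omits.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]", prefix="##"):
    """Split one pre-normalized word into WordPiece tokens.

    Hypothetical helper for illustration; `vocab` is any set-like
    container of token strings.
    """
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            cand = word[start:end]
            if start > 0:
                # Non-initial pieces carry the continuation prefix.
                cand = prefix + cand
            if cand in vocab:
                piece = cand
                break
            end -= 1
        if piece is None:
            # No prefix of the remainder is in the vocabulary:
            # the whole word maps to the unknown token.
            return [unk]
        tokens.append(piece)
        start = end
    return tokens


if __name__ == "__main__":
    toy_vocab = {"un", "##aff", "##able", "[UNK]"}
    print(wordpiece_tokenize("unaffable", toy_vocab))
    print(wordpiece_tokenize("xyz", toy_vocab))
```

With the toy vocabulary above, "unaffable" splits into `un`, `##aff`, `##able`, while a word with no matching pieces collapses to the single unknown token.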

---------

Co-authored-by: Jared Van Bortel <redacted>
Co-authored-by: Georgi Gerganov <redacted>

.flake8
convert-hf-to-gguf.py
examples/embedding/embedding.cpp
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
gguf-py/gguf/tensor_mapping.py
llama.cpp
llama.h

Packaging of ggml-org/llama.cpp
