llama : support batched embeddings (#5466)
author     Douglas Hanley <redacted>
           Tue, 13 Feb 2024 12:06:58 +0000 (06:06 -0600)
committer  GitHub <redacted>
           Tue, 13 Feb 2024 12:06:58 +0000 (14:06 +0200)
commit     03bf161eb6dea6400ee49c6dc6b69bdcfa9fd3fc
tree       49320ac8aca35d2ba8162c2a280924bacbd7e06b
parent     ad014bba97ef6ef6c3e2f78b2fc463e91ae94579
llama : support batched embeddings (#5466)

* batched embedding: pool outputs by sequence id; update the embedding example accordingly (see the usage sketch after this message)

* bring back non-causal attention

* embd : minor improvements

* llama : minor

---------

Co-authored-by: Georgi Gerganov <redacted>
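
A usage sketch, not part of the commit itself: the core idea is that several prompts can be packed into a single llama_batch with distinct seq_id tags, decoded once, and read back as one pooled embedding per sequence. The C++ below follows the shape of the updated examples/embedding/embedding.cpp, assuming an embedding model with a pooling layer and the llama.h API roughly as it stood around this commit (llama_backend_init(bool), cparams.embedding, llama_get_embeddings_ith); some of these names changed in later versions, and the model path and token ids are placeholders.

#include "llama.h"

#include <cstdio>
#include <vector>

int main() {
    llama_backend_init(false); // numa = false; signature as of this commit

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file("model.gguf", mparams); // placeholder path

    llama_context_params cparams = llama_context_default_params();
    cparams.embedding = true; // request embeddings instead of logits
    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // two toy sequences of already-tokenized prompts (token ids are placeholders)
    std::vector<std::vector<llama_token>> prompts = { { 1, 2, 3 }, { 1, 4, 5, 6 } };

    // pack all sequences into a single batch, tagging each token with its seq_id
    llama_batch batch = llama_batch_init(512, 0, 1);
    for (int32_t s = 0; s < (int32_t) prompts.size(); ++s) {
        for (size_t i = 0; i < prompts[s].size(); ++i) {
            const int32_t j = batch.n_tokens;
            batch.token   [j]    = prompts[s][i];
            batch.pos     [j]    = (llama_pos) i;
            batch.n_seq_id[j]    = 1;
            batch.seq_id  [j][0] = s;
            batch.logits  [j]    = true;
            batch.n_tokens++;
        }
    }

    // one decode call handles all sequences
    if (llama_decode(ctx, batch) < 0) {
        fprintf(stderr, "decode failed\n");
        return 1;
    }

    // outputs are pooled by sequence id: one n_embd-sized vector per sequence
    const int n_embd = llama_n_embd(model);
    for (int32_t s = 0; s < (int32_t) prompts.size(); ++s) {
        const float * emb = llama_get_embeddings_ith(ctx, s);
        printf("seq %d: n_embd = %d, emb[0] = %f\n", s, n_embd, emb[0]);
    }

    llama_batch_free(batch);
    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
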
convert-hf-to-gguf.py
examples/embedding/embedding.cpp
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
llama.cpp
llama.h