git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
llama : rework embeddings logic (#14208)
author Georgi Gerganov <redacted>
Mon, 16 Jun 2025 11:14:00 +0000 (14:14 +0300)
committer GitHub <redacted>
Mon, 16 Jun 2025 11:14:00 +0000 (14:14 +0300)
commit d3e64b9f490cee41b7b9aa275dae2f6568ae3054
tree 8325e6e1057ae709f5e4aa43b89aca1dc279ed63
parent 3ba0d843c6bd3faea5cf5e53dc7f3c82be20bffb
llama : rework embeddings logic (#14208)

* llama : rework embeddings logic

ggml-ci

* cont : fix rerank

ggml-ci

* cont : engrish [no ci]

* cont : fix rerank

ggml-ci

* server : support both embeddings and completions with single model

ggml-ci

* cont : avoid embeddings_org

ggml-ci
16 files changed:
common/arg.cpp
common/common.cpp
common/common.h
examples/gritlm/gritlm.cpp
include/llama.h
src/llama-batch.cpp
src/llama-batch.h
src/llama-context.cpp
src/llama-kv-cache-recurrent.cpp
src/llama-kv-cache-recurrent.h
src/llama-kv-cache-unified-iswa.cpp
src/llama-kv-cache-unified-iswa.h
src/llama-kv-cache-unified.cpp
src/llama-kv-cache-unified.h
src/llama-memory.h
tools/server/server.cpp