git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
llama : fix FA when KV cache is not used (i.e. embeddings) (#12825)
author    Georgi Gerganov <redacted>
Tue, 8 Apr 2025 16:54:51 +0000 (19:54 +0300)
committer GitHub <redacted>
Tue, 8 Apr 2025 16:54:51 +0000 (19:54 +0300)
commit a19b5cef16d885c44c635da4a5c97113c1577de8
tree   d03d8f85266c43059d9018ea53e3822998676a66
parent 78a1ba0a4f2bfed5b8b8e312592143d22e531698
llama : fix FA when KV cache is not used (i.e. embeddings) (#12825)

* ggml : FA supports F32 V

* graph : cast KV to F16 when the KV cache is not used

ggml-ci

* server : add test that exercises embeddings with FA enabled

ggml-ci
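
The graph-side change referenced above ("cast KV to F16 when the KV cache is not used") can be sketched as follows. This is a minimal illustration using the public ggml API (ggml_cast, ggml_flash_attn_ext); the helper name and its exact placement in src/llama-graph.cpp are hypothetical and not the verbatim diff.

#include "ggml.h"

// Sketch: when no KV cache is used (e.g. embedding-only models), K and V come
// straight from the graph in F32 instead of from a pre-quantized cache. The
// flash-attention path expects F16 K, so cast before ggml_flash_attn_ext.
// Hypothetical helper, not the exact code in src/llama-graph.cpp.
static struct ggml_tensor * build_fa_no_cache(
        struct ggml_context * ctx,
        struct ggml_tensor  * q,     // F32 queries
        struct ggml_tensor  * k,     // F32 keys (no KV cache)
        struct ggml_tensor  * v,     // F32 values (no KV cache)
        struct ggml_tensor  * mask,  // attention mask (may be NULL)
        float                 scale) {
    // Cast K (and optionally V, which FA now also accepts in F32) to F16
    // so the fast attention kernels can be used (assumption: casting both
    // keeps the fast path uniform).
    if (k->type == GGML_TYPE_F32) {
        k = ggml_cast(ctx, k, GGML_TYPE_F16);
    }
    if (v->type == GGML_TYPE_F32) {
        v = ggml_cast(ctx, v, GGML_TYPE_F16);
    }

    return ggml_flash_attn_ext(ctx, q, k, v, mask, scale, /*max_bias=*/0.0f, /*logit_softcap=*/0.0f);
}
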
examples/server/tests/unit/test_embedding.py
examples/server/tests/utils.py
examples/server_embd.py
ggml/src/ggml-cpu/ops.cpp
ggml/src/ggml-metal/ggml-metal.m
src/llama-graph.cpp