git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
llama : pad KV cache size (#4280)
author Georgi Gerganov <redacted>
Sun, 3 Dec 2023 08:58:16 +0000 (10:58 +0200)
committer GitHub <redacted>
Sun, 3 Dec 2023 08:58:16 +0000 (10:58 +0200)
commit d7b800b8bc490a221acbd83c575206a907f2f6e2
tree c41c5d8ead5fb3cb23ea0b5bca51f92a58e0d7cf
parent 5a7d3125e7c24f223659b7f0b7aa7736986e92c0
llama : pad KV cache size (#4280)

* llama : pad KV cache size to 32

* metal : try to improve batched decoding
ggml-metal.m
llama.cpp
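
The diff itself is not shown on this page, so the following is only a minimal sketch of what "pad KV cache size to 32" could look like, not the code from this commit. The helper names `pad_up` and `padded_kv_size` and the parameters `cells_in_use` and `n_ctx` are hypothetical, chosen for illustration. The idea is to round the number of KV cells in use up to the next multiple of 32, never go below 32, and never exceed the context size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Round x up to the next multiple of n. n must be a power of two,
// which lets us use a mask instead of a division.
static int32_t pad_up(int32_t x, int32_t n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    return (x + n - 1) & ~(n - 1);
}

// Hypothetical helper: the KV size to attend over, given how many
// cells are currently in use and the context size n_ctx.
// The result is a multiple of `pad`, at least `pad`, clamped to n_ctx.
static int32_t padded_kv_size(int32_t cells_in_use, int32_t n_ctx, int32_t pad = 32) {
    return std::min(n_ctx, std::max(pad, pad_up(cells_in_use, pad)));
}
```

With these definitions, `padded_kv_size(33, 512)` gives 64 and `padded_kv_size(500, 512)` gives 512. Note that the clamp to `n_ctx` means the result is only guaranteed to be a multiple of 32 when `n_ctx` itself is one.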