memory : remove KV cache size padding (#16812)
author    Georgi Gerganov <redacted>
          Tue, 28 Oct 2025 18:19:44 +0000 (20:19 +0200)
committer GitHub <redacted>
          Tue, 28 Oct 2025 18:19:44 +0000 (20:19 +0200)
commit    85a7d8677bf2200981e52f744a21d5267964ffcf
tree      8178fb226ade66e4c296567e9210c31c8452f3c3
parent    a8ca18b4b815a2abdbecb958ee5f4c542d69aac7

* memory : remove KV cache size padding

* cont : restore padding for n_kv tensor shape

* server : use slot context size instead of training context size

* server : simplify context limit logic
src/llama-kv-cache.cpp
src/llama-kv-cache.h
src/llama-model.cpp
src/llama-model.h
tools/server/server.cpp
tools/server/tests/unit/test_ctx_shift.py