git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
server : context checkpointing for hybrid and recurrent models (#16382)
author    ddh0 <redacted>
Fri, 3 Oct 2025 18:34:51 +0000 (13:34 -0500)
committer GitHub <redacted>
Fri, 3 Oct 2025 18:34:51 +0000 (21:34 +0300)
commit    f6dcda390004b627ef30af378d0c01ad2519289e
tree      cedaf6aa7736de34e5b1e96828a0497f52005e9e
parent    606a73f53175077429484b23dcf799f69a31d0bd
server : context checkpointing for hybrid and recurrent models (#16382)

* initial commit for branch 3

* generalize `swa_checkpoint` to `ctx_checkpoint`

this extends `llama-server`'s SWA checkpointing logic to cover
hybrid and recurrent models such as Jamba and Granite

* oops

* disable debug prints

* keep backwards compat with `--swa-checkpoints`

Co-authored-by: Georgi Gerganov <redacted>
* update prompt re-processing message

* fix off-by-one error per GG

* keep `seq_rm` log per GG

Co-authored-by: Georgi Gerganov <redacted>
* server : fix checkpoint logic to support recurrent caches

* server : cleanup and fixes

---------

Co-authored-by: Georgi Gerganov <redacted>
common/arg.cpp
common/common.h
include/llama.h
src/llama-kv-cache-iswa.cpp
src/llama-memory-hybrid.cpp
src/llama-memory-recurrent.cpp
src/llama-model.cpp
tools/server/server.cpp