git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
spec : add self‑speculative decoding (no draft model required) + refactor (#18471)
author Sascha Rogmann <redacted>
Wed, 28 Jan 2026 17:42:42 +0000 (18:42 +0100)
committer GitHub <redacted>
Wed, 28 Jan 2026 17:42:42 +0000 (19:42 +0200)
commit 72d3b1898a9c81152710cc37dd1dfd26764055d9
tree 303c36caaf867b3854ec070da2c1aae0b162ca73
parent ebf57258702b23098d7bdcbd46008a95b2401075
* server: introduce self-speculative decoding

* server: moved self-call into speculative.cpp

* can_speculate() includes self-speculation

Co-authored-by: Georgi Gerganov <redacted>

* server: can_speculate() tests self-spec

* server: replace can_speculate() with slot.can_speculate()

Co-authored-by: Sigbjørn Skjæret <redacted>

* common: use %zu format specifier for size_t in logging

Co-authored-by: Sigbjørn Skjæret <redacted>

* server: can_speculate() requires a task instance

* common: ngram map, config self-speculative decoding

* common: add enum common_speculative_type

* common: add vector of speculative states

* common: add option --spec-draftless

* server: cleanup (remove slot.batch_spec, rename)

* common: moved self-spec impl to ngram-map

* common: cleanup (use common_speculative_state_draft)

* spec : refactor

* cont : naming

* spec: remove --spec-config

* doc: (draftless) speculative decoding

* common: print performance in spec decoding

* minor : cleanup

* common : better names

* minor : cleanup + fix build

* minor: comments

* CODEOWNERS: add common/ngram-map.* (#18471)

* common : rename speculative.draftless_type -> speculative.type

* ngram-map : fix uninitialized values

* ngram-map : take into account the input can become shorter

* ngram-map : revert len check for now

* arg : change `--spec-draftless` -> `--spec-type`

* spec : add common_speculative_state::accept()

* spec : refactor + add common_speculative_begin()

* spec : fix begin() call with mtmd

* spec : additional refactor + remove common_speculative_params

---------

Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: Sigbjørn Skjæret <redacted>
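
The commit message names the feature (self-speculative decoding through an n-gram map) but does not describe the algorithm. As a rough illustration only, the sketch below assumes the general idea of draftless speculation: look for an earlier occurrence of the most recent n tokens in the token history, propose the tokens that followed it as a draft, and keep the prefix the target model agrees with. The names `token`, `ngram_draft`, and `count_accepted` are hypothetical and are not the API added in common/ngram-map.h or common/speculative.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using token = int32_t; // hypothetical stand-in for the real token type

// Hypothetical helper: propose up to n_draft tokens by locating the most
// recent earlier occurrence of the last n tokens of `history` and copying
// the tokens that followed that occurrence.
std::vector<token> ngram_draft(const std::vector<token> & history, size_t n, size_t n_draft) {
    std::vector<token> draft;
    if (n == 0 || history.size() <= n) {
        return draft; // not enough context to form a key and find a match
    }
    const size_t key_pos = history.size() - n; // start of the trailing n-gram

    // scan earlier start positions, most recent first
    for (size_t i = key_pos; i-- > 0; ) {
        bool match = true;
        for (size_t j = 0; j < n; ++j) {
            if (history[i + j] != history[key_pos + j]) {
                match = false;
                break;
            }
        }
        if (!match) {
            continue;
        }
        // copy what followed the earlier occurrence, capped at n_draft tokens
        for (size_t k = i + n; k < history.size() && draft.size() < n_draft; ++k) {
            draft.push_back(history[k]);
        }
        break;
    }
    return draft;
}

// Hypothetical helper: number of leading draft tokens that equal the tokens
// the target model actually produced when verifying the draft.
size_t count_accepted(const std::vector<token> & draft, const std::vector<token> & target) {
    size_t n_accepted = 0;
    while (n_accepted < draft.size() && n_accepted < target.size() &&
           draft[n_accepted] == target[n_accepted]) {
        ++n_accepted;
    }
    return n_accepted;
}
```

With a history of `{1, 2, 3, 4, 5, 1, 2, 3}` and `n = 3`, the trailing n-gram `{1, 2, 3}` matches at the start of the history, so the draft is `{4, 5}` for `n_draft = 2`. No second model is involved; the draft costs only a scan of the existing tokens, which is why this kind of speculation tends to help most on repetitive text.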
19 files changed:
CODEOWNERS
common/CMakeLists.txt
common/arg.cpp
common/common.cpp
common/common.h
common/ngram-cache.cpp
common/ngram-cache.h
common/ngram-map.cpp [new file with mode: 0644]
common/ngram-map.h [new file with mode: 0644]
common/speculative.cpp
common/speculative.h
docs/speculative.md [new file with mode: 0644]
examples/lookup/lookup-create.cpp
examples/lookup/lookup-stats.cpp
examples/lookup/lookup.cpp
examples/speculative-simple/speculative-simple.cpp
examples/speculative/speculative.cpp
tools/server/server-context.cpp
tools/server/server-task.cpp