git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

llama : add adaptive-p sampler (#17927)

* initial commit for branch

* simplify constants

* add params to `struct common_params_sampling`, add reference to PR

* explicitly clamp `min_target` and `max_target` to `[0.0, 1.0]`

* add args, rename `queue_size` -> `window_size`

* improved comments

* minor

* remove old unused code from algorithm

* minor

* add power law case to `common_sampler_init`, add sampler name mappings

* clarify behaviour when `window_size = 0`

* add missing enums

* remove `target_range` param, make `target == 1` no-op, cleanup code

* oops, straggler

* add missing parameters in `server-task.cpp`

* copy from author

ref:
https://gist.github.com/MrJackSpade/9be99c7efbba7b95a41377e123b7b069

* remove old debug log, style nit

* fix compiler warning, add commented-out logging per token

* re-write + change parameters + simplify

* oops forgot args.cpp

* fix leftover `window_size`

* add missing values to `common_params_sampling::print()`

* with logging

* does this fix it?

* no, but does this?

* update default decay

* optimize

* fix bad merge

my git skills are lacking

* silence `missing initializer for member`

* update default decay to 0.9

* fix logging

* format (double)

* add power law to the new `samplers` vector

* log sampler init values

* improve logging messages in llama_sampler_power_law

* remove extraneous logging

* simplify target computation

last commit with debug logging!

* remove debug logging, explicitly clamp params at init

* add `use_power_law` flag + logic, minor cleanup

* update `power-law` -> `adaptive-p`

* fix cold start EMA

- `ctx->weighted_sum` is now initialized and reset to `target / (1.0f -
clamped_decay)`
- `ctx->total_weight` is now initialized and reset to `1.0f / (1.0f -
clamped_decay)`

this fixes a "cold start" problem with the moving average

* update `SHARPNESS` constant to `10.0f`

* minor style fixes

no functional changes

* minor style fixes cont.

* update `llama_sampler_adaptive_p_i` for backend sampling (ref: #17004)

* separate into `apply` + `accept` functions

* `pending_token_idx`: switch from `llama_token` to `int32`

functionally identical (`llama.h` has `typedef int32_t llama_token;`),
but it's more correct now

* don't transform logits <= -1e9f

* fix masking in backend top-p, min-p

* address review comments

* typo in comments `RND` -> `RNG`

* add docs

* add recommended values in completion docs

* address PR feedback

* remove trailing whitespace (for CI `editorconfig`)

* add adaptive-p to `common_sampler_types_from_chars`
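
The "fix cold start EMA" entry above can be illustrated with a small standalone sketch. This is not the code from `src/llama-sampling.cpp`. The struct, the helper names, the running-average update rule, and the `0.99f` upper clamp are all assumptions made for illustration. Only the two seed formulas come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the sampler context; only the two fields named
// in the commit message are modelled here.
struct ema_state {
    float weighted_sum;
    float total_weight;
};

// Assumption: decay is kept strictly below 1 so that 1 / (1 - decay) is finite.
// The 0.99f bound is illustrative, not taken from the commit.
static float clamp_decay(float decay) {
    return std::min(std::max(decay, 0.0f), 0.99f);
}

// Seed the state as the commit message describes:
//   weighted_sum = target / (1 - clamped_decay)
//   total_weight = 1      / (1 - clamped_decay)
static ema_state ema_init(float target, float decay) {
    const float clamped_decay = clamp_decay(decay);
    ema_state s;
    s.weighted_sum = target / (1.0f - clamped_decay);
    s.total_weight = 1.0f   / (1.0f - clamped_decay);
    return s;
}

// Assumed update rule: an exponentially weighted running sum and weight.
static void ema_update(ema_state & s, float value, float decay) {
    const float clamped_decay = clamp_decay(decay);
    s.weighted_sum = value + clamped_decay * s.weighted_sum;
    s.total_weight = 1.0f  + clamped_decay * s.total_weight;
}

// The moving average is the ratio of the two accumulators.
static float ema_mean(const ema_state & s) {
    return s.weighted_sum / s.total_weight;
}
```

Under these assumptions, `1 / (1 - decay)` is the limit that `total_weight` converges to, so the seeded state behaves as if the average had already been sitting at `target` for a long time. With `target = 0.5` and `decay = 0.9`, the seeded mean is 0.5 and a first observation of 1.0 moves it only to about 0.55. A state seeded with zeros would jump straight to 1.0 on that first observation.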

author	ddh0 <redacted>
	Thu, 15 Jan 2026 17:16:29 +0000 (11:16 -0600)
committer	GitHub <redacted>
	Thu, 15 Jan 2026 17:16:29 +0000 (19:16 +0200)
commit	13f1e4a9caa100960cfa42addd0f08d6ef7fb91d
tree	66c8ca73439a6f84d512c7cdc415e490b7429424
parent	a04c2b06a324cc9c7e09de4106597a86eb2421c5

common/arg.cpp
common/common.h
common/sampling.cpp
include/llama.h
src/llama-sampling.cpp
tools/cli/README.md
tools/completion/README.md
tools/server/README.md
tools/server/server-task.cpp