git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
graph : utilize `ggml_build_forward_select()` to avoid reallocations (#18898)
author Georgi Gerganov <redacted>
Fri, 23 Jan 2026 16:22:34 +0000 (18:22 +0200)
committer GitHub <redacted>
Fri, 23 Jan 2026 16:22:34 +0000 (18:22 +0200)
commit 557515be1e93ed8939dd8a7c7d08765fdbe8be31
tree 85ea47a16ae0097d197fc472bed2f2e653601896
parent cb6caca191b9a3a9a4eaa13dd9e465225d127034

* graph : avoid branches between embedding and token inputs

* models : make deepstack graphs (e.g. Qwen3 VL) have constant topology

* ci : enable -DGGML_SCHED_NO_REALLOC=ON for server CI

* cont : pad token embeddings to n_embd_inp
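
The idea behind the bullets above can be illustrated with a toy model. The sketch below does not use the ggml API; `ToyGraph`, `build_branching`, `build_select`, and `same_topology` are hypothetical names made up for this illustration, and the node names are placeholders. It assumes that a scheduler can keep its existing buffers only while the graph's node list is unchanged between builds. Under that assumption, a build that branches on the input kind produces a different node list per input kind, whereas a select-style build always emits both input branches and records only which one is active.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a compute graph: an ordered list of node names plus the
// index of the active input branch. Hypothetical type, not part of ggml.
struct ToyGraph {
    std::vector<std::string> nodes;
    int selected = -1; // -1 means "no select node in this graph"
};

// Branching build: the node list depends on whether the batch carries
// token ids or precomputed embeddings.
ToyGraph build_branching(bool use_tokens) {
    ToyGraph g;
    if (use_tokens) {
        g.nodes = {"inp_tokens", "get_rows", "layers"};
    } else {
        g.nodes = {"inp_embd", "layers"};
    }
    return g;
}

// Select-style build: both input branches are always part of the graph;
// only the selection index changes with the input kind.
ToyGraph build_select(bool use_tokens) {
    ToyGraph g;
    g.nodes    = {"inp_tokens", "get_rows", "inp_embd", "select", "layers"};
    g.selected = use_tokens ? 0 : 1;
    return g;
}

// Assumed reuse criterion for this sketch: identical node lists.
bool same_topology(const ToyGraph & a, const ToyGraph & b) {
    return a.nodes == b.nodes;
}
```

With these definitions, `same_topology(build_branching(true), build_branching(false))` is false, while `same_topology(build_select(true), build_select(false))` is true. In the toy model, only the select-style build keeps the topology constant when the input switches between tokens and embeddings.
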
.github/workflows/server.yml
src/llama-context.cpp
src/llama-graph.cpp
src/llama-graph.h
src/models/gemma3n-iswa.cpp
src/models/qwen3vl-moe.cpp
src/models/qwen3vl.cpp