git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Georgi Gerganov <redacted>
	Thu, 17 Jul 2025 16:08:33 +0000 (19:08 +0300)
committer	GitHub <redacted>
	Thu, 17 Jul 2025 16:08:33 +0000 (19:08 +0300)
commit	01612b74090df592663cfa01f661c9628f403b59
tree	bd4aa95477680043f013e9dbed339e698a79dbc3
parent	086cf81e88fb75287b71ff19c08a206b7bc2e02f

llama : reuse compute graphs (#14482)

* llama : reuse compute graphs

ggml-ci

* llama-bench : add graph reuse parameter

ggml-ci

* cont : remove the parameter and the sched resets

ggml-ci

* graph : rename update() to can_reuse()

ggml-ci

* params : remove is_same()

ggml-ci

* graph : set res->params in llm_graph_context constructor

ggml-ci

* graph : avoid set_max_nodes in llm_graph_result

ggml-ci

* kv-cache : reuse llama_context's graph result instance

ggml-ci

* context : reset the previous graph result upon memory updates

ggml-ci

* batch : llama_ubatch now carries its data instead of pointing to balloc

ggml-ci

* merge : fix build

ggml-ci

* graph : fix can_reuse() checks when flash-attention is disabled

* graph : move llm_graph_result impl in source file + debug env

ggml-ci
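
The reuse idea in the bullets above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: only the name `can_reuse()` and the idea of resetting the previous graph result on memory updates come from the commit message, while `graph_params`, `graph_result`, `context`, their fields, and `process()` / `memory_update()` are hypothetical stand-ins for the real llama.cpp types.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical, simplified stand-in for the parameters that determine
// the shape (topology) of a compute graph. Field names are assumptions.
struct graph_params {
    uint32_t n_tokens;   // tokens in the current micro-batch
    uint32_t n_outputs;  // number of outputs requested
    bool     flash_attn; // whether flash-attention is enabled
};

// Hypothetical stand-in for a previously built graph and its parameters.
struct graph_result {
    graph_params params;

    // The old graph can be reused only if a graph built for `other`
    // would have exactly the same topology.
    bool can_reuse(const graph_params & other) const {
        return params.n_tokens   == other.n_tokens  &&
               params.n_outputs  == other.n_outputs &&
               params.flash_attn == other.flash_attn;
    }
};

// Hypothetical context that owns the previous graph result.
struct context {
    std::unique_ptr<graph_result> prev;
    int n_built  = 0; // graphs built from scratch
    int n_reused = 0; // graphs reused as-is

    void process(const graph_params & p) {
        if (prev && prev->can_reuse(p)) {
            n_reused++; // skip the rebuild, only the inputs would be updated
            return;
        }
        prev = std::make_unique<graph_result>(graph_result{p});
        n_built++;
    }

    // A memory update (for example a KV cache change) invalidates the
    // previous graph, so the next call to process() rebuilds it.
    void memory_update() {
        prev.reset();
    }
};
```

In this sketch, repeated single-token decode steps with identical parameters hit the reuse path, while a change in batch shape or a call to `memory_update()` forces a rebuild.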

include/llama.h
src/llama-batch.cpp
src/llama-batch.h
src/llama-context.cpp
src/llama-context.h
src/llama-graph.cpp
src/llama-graph.h
src/llama-kv-cache-unified.cpp
src/llama-kv-cache-unified.h
src/llama-memory-recurrent.cpp
src/llama-model.cpp
src/llama-model.h