git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Georgi Gerganov <redacted>
	Thu, 23 Nov 2023 17:07:56 +0000 (19:07 +0200)
committer	GitHub <redacted>
	Thu, 23 Nov 2023 17:07:56 +0000 (19:07 +0200)
commit	6b0a7420d03b9d13cb0e9439a01ce8476d8bf093
tree	f184d281cb47e357e4ead4a93a0d1fe504c74bbe
parent	d103d935c0e75769a6a597f7a64cab72c6cc3e79

llama : KV cache view API + better KV cache management (#4170)

* llama : keep track of used KV cells + better KV cache management

* llama : zero KV cache used upon clear

ggml-ci

* llama : allow exporting a view of the KV cache (#4180)

* Allow exporting a view of the KV cache

* Allow dumping the sequences per cell in common

* Track max contiguous cells value and position as well

* Fix max contiguous empty cells index calculation

Make dump functions deal with lengths or sequence counts > 10 better

* Fix off by one error in dump_kv_cache_view

* Add doc comments for KV cache view functions

Eliminate cell sequence struct; use llama_seq_id directly

Minor cleanups

* common : add -dkvc arg for enabling kv cache dumps

---------

Co-authored-by: Kerfuffle <redacted>
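
The bookkeeping described in the message above (counting used KV cells, and tracking the length and start position of the longest run of empty cells) can be illustrated with a small standalone sketch. This is not the code from this commit: `toy_cell`, `toy_view` and `toy_view_update` are hypothetical names, the field names are only loosely modelled on the wording of the message, and the "negative position means empty" convention is an assumption made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a KV cache cell (assumption: pos < 0 means "empty").
struct toy_cell {
    int32_t pos = -1;
};

// Hypothetical summary of the cache state; not the struct from llama.h.
struct toy_view {
    int32_t used_cells         = 0;  // number of non-empty cells
    int32_t max_contiguous     = 0;  // length of the longest run of empty cells
    int32_t max_contiguous_idx = -1; // index where that run starts, -1 if there is none
};

// Single pass over the cells: count used cells and track the longest empty run.
toy_view toy_view_update(const std::vector<toy_cell> & cells) {
    toy_view view;

    int32_t run_len   = 0; // length of the empty run currently being scanned
    int32_t run_start = 0; // index where the current empty run began

    const int32_t n_cells = (int32_t) cells.size();
    for (int32_t i = 0; i < n_cells; ++i) {
        if (cells[i].pos >= 0) {
            // a used cell ends the current empty run
            view.used_cells++;
            run_len = 0;
            continue;
        }
        if (run_len == 0) {
            run_start = i;
        }
        run_len++;
        if (run_len > view.max_contiguous) {
            view.max_contiguous     = run_len;
            view.max_contiguous_idx = run_start;
        }
    }

    return view;
}
```

Recording the start index at the moment a run begins, rather than deriving it from the end of the run afterwards, is one way to avoid the kind of index miscalculation that two of the fixes listed above address.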

common/common.cpp
common/common.h
examples/parallel/parallel.cpp
llama.cpp
llama.h