git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Georgi Gerganov <redacted>
	Sun, 25 Feb 2024 20:12:24 +0000 (22:12 +0200)
committer	GitHub <redacted>
	Sun, 25 Feb 2024 20:12:24 +0000 (22:12 +0200)
commit	bf08e00643fd529f748f0a858fd79f3061e3fa18
tree	0043ee582e83a19c8f1ca6d75d1519038f866e1c
parent	f7625019c51ca437a5840576d92362cfa710e4a2

llama : refactor k-shift implementation + KV defragmentation (#5691)

* llama : refactor k-shift implementation

ggml-ci

* llama : rename llama_kv_cache_seq_shift to llama_kv_cache_seq_add

* llama : cont k-shift refactoring + normalize type names

ggml-ci

* minor : fix MPI builds

* llama : reuse n_rot from the build context

ggml-ci

* llama : revert enum name changes from this PR

ggml-ci

* llama : update llama_rope_type

* llama : add comment about rope values

* llama : fix build

* passkey : apply kv cache updates explicitly

ggml-ci

* llama : change name to llama_kv_cache_update()

* llama : add llama_kv_cache_seq_pos_max()

* passkey : fix llama_kv_cache_seq_pos_max() usage

* llama : some llama_kv_cell simplifications

* llama : add llama_kv_cache_compress (EXPERIMENTAL)

* llama : add alternative KV cache merging (EXPERIMENTAL)

* llama : add llama_kv_cache_defrag

* llama : comments

* llama : remove llama_kv_cache_compress

will add in a separate PR

ggml-ci

* llama : defragment via non-overlapping moves

* llama : ggml_graph based defrag implementation

ggml-ci

* llama : switch the loop order in build_defrag

* llama : add comments
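
For illustration only: a minimal sketch of the ideas named in the bullets above (shifting the positions of a sequence, querying its maximum position, and defragmenting via non-overlapping moves). It is not the code from this commit. ToyCell, toy_seq_add, toy_seq_pos_max and toy_defrag are hypothetical names, and the real implementation in llama.cpp operates on the actual KV tensors through a ggml graph rather than on a plain vector of cells.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy stand-in for a KV cache cell (assumption: NOT the llama_kv_cell struct).
struct ToyCell {
    int32_t pos = -1; // token position; -1 marks an empty cell
    int32_t seq = -1; // owning sequence id; -1 means none
    bool empty() const { return pos < 0; }
};

// Add `delta` to every position of sequence `seq` that lies in [p0, p1).
void toy_seq_add(std::vector<ToyCell> & cells, int32_t seq,
                 int32_t p0, int32_t p1, int32_t delta) {
    for (auto & c : cells) {
        if (!c.empty() && c.seq == seq && c.pos >= p0 && c.pos < p1) {
            c.pos += delta;
        }
    }
}

// Largest position stored for sequence `seq`, or -1 if it has no cells.
int32_t toy_seq_pos_max(const std::vector<ToyCell> & cells, int32_t seq) {
    int32_t result = -1;
    for (const auto & c : cells) {
        if (!c.empty() && c.seq == seq && c.pos > result) {
            result = c.pos;
        }
    }
    return result;
}

// Fill holes at the front with used cells taken from the back.
// Every move reads a distinct used source and writes a distinct empty
// destination, so no move overlaps another. Returns the (src, dst) pairs.
std::vector<std::pair<size_t, size_t>> toy_defrag(std::vector<ToyCell> & cells) {
    std::vector<std::pair<size_t, size_t>> moves;
    if (cells.empty()) {
        return moves;
    }
    size_t lo = 0;
    size_t hi = cells.size() - 1;
    while (true) {
        while (lo < hi && !cells[lo].empty()) { ++lo; } // first hole
        while (hi > lo &&  cells[hi].empty()) { --hi; } // last used cell
        if (lo >= hi) {
            break;
        }
        assert(hi > lo); // a move always pulls from a higher index into a lower hole
        cells[lo] = cells[hi];
        cells[hi] = ToyCell{};
        moves.emplace_back(hi, lo);
    }
    return moves;
}
```

In this toy model, defragmenting a cache with holes leaves all used cells packed at the front, and the returned move list is what a real implementation would translate into tensor copies. The defrag step does not preserve cell order, which is harmless here because each cell carries its own position.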

examples/infill/infill.cpp
examples/main/main.cpp
examples/passkey/passkey.cpp
examples/server/server.cpp
llama.cpp
llama.h