git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	compilade <redacted>
	Sun, 3 Mar 2024 08:41:55 +0000 (03:41 -0500)
committer	GitHub <redacted>
	Sun, 3 Mar 2024 08:41:55 +0000 (10:41 +0200)
commit	de9692a7d2db66e29e5cb373c6551acc49145ccd
tree	3680f1b63254f37a704ac96d131669e580bf5865
parent	e6029348e86c3810d4435faee54ba822cb43e2ef

llama : fix llama_copy_state_data with fragmented KV cache (#5840)

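The row size of the saved states was based on kv_self.head, but
it should be based on llama_kv_cache_cell_max. With a fragmented
cache, cells past kv_self.head can still be in use, so a row size
derived from the head can leave them out of the saved state.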
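
To illustrate why the head is the wrong bound, here is a minimal sketch
using a simplified stand-in for the cache. The names `Cell`,
`cell_max_sketch` and `used_cells_covered` are hypothetical and do not
come from llama.cpp; the sketch only assumes that a cell with `pos < 0`
is unused and that the head is the next write position.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified cache cell: pos < 0 means the cell is unused.
struct Cell {
    int32_t pos = -1;
};

// Hypothetical helper: one past the index of the last used cell,
// or 0 when no cell is used.
static uint32_t cell_max_sketch(const std::vector<Cell> & cells) {
    for (uint32_t i = (uint32_t) cells.size(); i > 0; --i) {
        if (cells[i - 1].pos >= 0) {
            return i;
        }
    }
    return 0;
}

// Hypothetical helper: how many used cells lie within the first n_rows cells.
// A saved state that keeps only n_rows rows can preserve at most this many.
static uint32_t used_cells_covered(const std::vector<Cell> & cells, uint32_t n_rows) {
    uint32_t n = 0;
    for (uint32_t i = 0; i < n_rows && i < cells.size(); ++i) {
        if (cells[i].pos >= 0) {
            n++;
        }
    }
    return n;
}
```

For example, with 8 cells where cells 0, 1, 5 and 6 are used and the head
sits at 2, a row count taken from the head covers only 2 of the 4 used
cells, while a row count taken from `cell_max_sketch` covers all of them.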

Existing session files should still work.

* llama : fix llama_kv_cache_cell_max inability to return 1

I've also changed its return type to uint32_t,
because this function is always used to set the value of uint32_t variables,
and because the index already has this type.
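
As a sketch of how a "cannot return 1" bug can arise: a reverse scan
whose loop condition is `i > 0` while it indexes `cells[i]` never
inspects cell 0. The code below is a hypothetical reconstruction of that
kind of off-by-one next to a corrected scan, not the actual llama.cpp
code; `KvCell`, `cell_max_skipping_first` and `cell_max_fixed` are
made-up names.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified cache cell: pos < 0 means the cell is unused.
struct KvCell {
    int32_t pos = -1;
};

// Off-by-one variant: the loop stops before i == 0, so cell 0 is never
// checked and the function can return 0 or any value >= 2, but never 1.
static int32_t cell_max_skipping_first(const std::vector<KvCell> & cells) {
    if (cells.empty()) {
        return 0; // avoid unsigned wrap-around of size() - 1
    }
    for (uint32_t i = (uint32_t) cells.size() - 1; i > 0; --i) {
        if (cells[i].pos >= 0) {
            return (int32_t) i + 1;
        }
    }
    return 0;
}

// Corrected variant: i counts cells rather than indexing them, so the
// last iteration (i == 1) inspects cell 0. The return type is unsigned,
// matching the type of the index.
static uint32_t cell_max_fixed(const std::vector<KvCell> & cells) {
    for (uint32_t i = (uint32_t) cells.size(); i > 0; --i) {
        if (cells[i - 1].pos >= 0) {
            return i;
        }
    }
    return 0;
}
```

When only cell 0 is used, the first variant reports 0 and the second
reports 1; for any cache whose last used cell has index 1 or higher, the
two agree.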

* llama : fix state size calculation

Some bytes in the state were unaccounted for in llama_get_state_size.
Because the logits reserve so much space, the shortfall did not cause problems.