git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Georgi Gerganov <redacted>
	Wed, 1 Apr 2026 13:58:01 +0000 (16:58 +0300)
committer	GitHub <redacted>
	Wed, 1 Apr 2026 13:58:01 +0000 (16:58 +0300)
commit	744c0c7310aad90e99a29c5739e4ee317fb6a748
tree	f81ae640cfee838c6ed4d0454993d7d3ee04e689
parent	0356e33aafcc1cb409910244482fac1ec8bafe9f

llama : rotate activations for better quantization (#21038)

* llama : rotate activations for better quantization

* cont : rotate V more + refactor

* cont : rotate caches separately + support non-power-of-2 head sizes

* cont : simplify

* cont : add reference for V rotation

* cont : refactor

* cont : support context shift

* cont : consolidate

* cont : dedup + allow different types for the rotation matrix

* cont : add env variable to disable rotation

* cont : simplify attn rot kv cache logic + rename env

* cont : pre-compute the Hadamard matrices
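
The commit message names the technique but does not show it, so here is a minimal sketch of the general idea behind rotating activations with a Hadamard matrix. It is an illustration under stated assumptions, not the code from this commit. The helper names (`hadamard`, `rotate`, `dot`) are hypothetical, and the Sylvester construction used here only covers power-of-2 sizes, whereas the commit also mentions non-power-of-2 head sizes. A normalized Hadamard matrix is orthogonal, so rotating both Q and K leaves their dot products unchanged while spreading a single large outlier across all components, which narrows the range a quantizer has to cover.

```python
import math


def hadamard(n):
    """Return the n x n Sylvester Hadamard matrix scaled by 1/sqrt(n).

    Assumption for this sketch: n is a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # Sylvester step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    s = 1.0 / math.sqrt(n)
    return [[x * s for x in row] for row in h]


def rotate(h, v):
    """Apply the rotation h (list of rows) to the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in h]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    H = hadamard(8)
    q = [0.1, -0.2, 0.15, 8.0, -0.05, 0.12, -0.18, 0.07]  # one large outlier
    k = [0.3, 0.1, -0.4, 0.2, 0.5, -0.1, 0.05, -0.25]
    q_rot, k_rot = rotate(H, q), rotate(H, k)
    # The attention score is preserved by the rotation (up to rounding).
    print("q.k original:", dot(q, k))
    print("q.k rotated: ", dot(q_rot, k_rot))
    # The largest magnitude shrinks, so the quantization range is narrower.
    print("max |q| original:", max(abs(x) for x in q))
    print("max |q| rotated: ", max(abs(x) for x in q_rot))
```

Because the rotation is orthogonal, it can be undone exactly by applying the transpose, which is why a rotated cache can in principle be used without changing the attention result beyond quantization error.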

src/llama-graph.cpp
src/llama-graph.h
src/llama-kv-cache.cpp
src/llama-kv-cache.h