commit     1c5eba6f8e628fb0a98afb27d8aaeb3b0e136451
tree       c681e5cd5c59b58a435684263a26549394c99c7e
parent     72272b83a3878e91251218c981b4c6ec16c33912
author     Andrei <redacted>    Sun, 30 Jun 2024 03:44:08 +0000 (20:44 -0700)
committer  GitHub <redacted>    Sun, 30 Jun 2024 03:44:08 +0000 (23:44 -0400)
llama: Add attention and final logit soft-capping, update scaling factor for Gemma2 (#8197)

* Add attention and final logit soft-capping (see the NumPy sketch after this change list).

* fix

* Add custom add_ functions

* Disable flash attention for Gemma2

* Update src/llama.cpp

Co-authored-by: slaren <redacted>
* Add default values for the attention and final logit softcaps

* Add custom kq scaling from Gemma2Attention

* Remove custom pre-attention scaling and use the computed value instead (reflected in the kq scale in the sketch below).

---------

Co-authored-by: slaren <redacted>
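
Soft-capping squashes a tensor of logits into a bounded range with a tanh rather than a hard clamp. A minimal NumPy sketch of what this commit adds, not the actual ggml graph code; the default caps (50.0 for attention, 30.0 for final logits) and the query_pre_attn_scalar-based kq scale are assumptions taken from the upstream Gemma 2 configuration:

```python
import numpy as np

def soft_cap(x: np.ndarray, cap: float) -> np.ndarray:
    # cap * tanh(x / cap): near-identity for |x| << cap, saturates at +/- cap,
    # keeping logits bounded without a hard clamp.
    return cap * np.tanh(x / cap)

def attn_weights(q: np.ndarray, k: np.ndarray,
                 query_pre_attn_scalar: float,
                 attn_cap: float = 50.0) -> np.ndarray:
    # Gemma 2 scales Q @ K^T by query_pre_attn_scalar ** -0.5 (the "computed
    # value" mentioned above, replacing the earlier custom pre-attention
    # scaling), then soft-caps the scores before the softmax.
    kq = (q @ k.T) * query_pre_attn_scalar ** -0.5
    kq = soft_cap(kq, attn_cap)
    kq -= kq.max(axis=-1, keepdims=True)  # numerically stable softmax
    e = np.exp(kq)
    return e / e.sum(axis=-1, keepdims=True)

def cap_final_logits(logits: np.ndarray, final_cap: float = 30.0) -> np.ndarray:
    # Applied once to the output logits before sampling.
    return soft_cap(logits, final_cap)
```

Note that the tanh sits between the Q @ K^T product and the softmax, a step the fused flash-attention kernel does not expose, which is presumably why this change also disables flash attention for Gemma2.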
Files changed:
  convert-hf-to-gguf.py
  gguf-py/gguf/constants.py
  gguf-py/gguf/gguf_writer.py
  src/llama.cpp
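
On the conversion side, the softcap hyperparameters travel from the HF config into GGUF metadata so that src/llama.cpp can read them at load time. A hedged sketch of that plumbing, assuming writer methods named after the keys this commit adds to gguf-py/gguf/constants.py and gguf-py/gguf/gguf_writer.py:

```python
# Hypothetical excerpt from the Gemma2 model class in convert-hf-to-gguf.py.
# The add_*_softcapping writer methods mirror the GGUF keys this commit
# introduces; the exact method and hparam names here are assumptions.
def set_gguf_parameters(self):
    super().set_gguf_parameters()
    self.gguf_writer.add_attn_logit_softcapping(self.hparams["attn_logit_softcapping"])
    self.gguf_writer.add_final_logit_softcapping(self.hparams["final_logit_softcapping"])
```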