git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Jeffrey Morgan <redacted>
	Sat, 27 Jul 2024 12:03:45 +0000 (05:03 -0700)
committer	GitHub <redacted>
	Sat, 27 Jul 2024 12:03:45 +0000 (15:03 +0300)
commit	b5e95468b1676e1e5c9d80d1eeeb26f542a38f42
tree	c604fe1f816db070a61f71975f2984f9a373dd47
parent	92090eca212650727e38b335c1d4accfbcc9b79c

llama : add support for llama 3.1 rope scaling factors (#8676)

* Add llama 3.1 rope scaling factors to llama conversion and inference

This commit generates the rope factors on conversion and adds them to the resulting model as a tensor. At inference time, these factors are passed to the `ggml_rope_ext` rope operation, improving results for context windows above 8192 tokens.

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <redacted>

* address comments

* address comments

* Update src/llama.cpp

Co-authored-by: compilade <redacted>

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <redacted>
---------

Co-authored-by: compilade <redacted>
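
For readers who want a feel for what "generating the rope factors on conversion" involves, here is a minimal sketch of the publicly described Llama 3.1 frequency-scaling scheme. It is not the code from this commit: the function name `llama3_rope_factors` is hypothetical, and the default values (head dimension 128, rope theta 500000, factor 8, low/high frequency factors 1 and 4, original context 8192) are assumptions based on typical Llama 3.1 configurations. The idea is that short-wavelength (high-frequency) components keep a factor of 1, long-wavelength components get the full factor, and the band in between is interpolated smoothly.

```python
import math


def llama3_rope_factors(
    head_dim=128,            # assumed per-head dimension
    rope_theta=500000.0,     # assumed rope base frequency
    factor=8.0,              # assumed overall scaling factor
    low_freq_factor=1.0,     # assumed
    high_freq_factor=4.0,    # assumed
    original_context=8192,   # assumed pre-scaling context length
):
    """Return one scaling factor per rotary frequency pair (head_dim // 2 values)."""
    # Wavelength thresholds that separate the three bands.
    low_freq_wavelen = original_context / low_freq_factor
    high_freq_wavelen = original_context / high_freq_factor

    factors = []
    for i in range(0, head_dim, 2):
        freq = 1.0 / (rope_theta ** (i / head_dim))
        wavelen = 2.0 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # High-frequency band: leave the frequency untouched.
            factors.append(1.0)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: apply the full scaling factor.
            factors.append(factor)
        else:
            # Transition band: interpolate between the two extremes.
            smooth = (original_context / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            factors.append(1.0 / ((1.0 - smooth) / factor + smooth))
    return factors


if __name__ == "__main__":
    factors = llama3_rope_factors()
    print(len(factors), factors[0], factors[-1])
```

Under these assumptions the result is a list of `head_dim // 2` values that a converter could store as a tensor in the model file. How the inference code consumes that tensor is determined by `ggml_rope_ext` and is not shown here.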

convert_hf_to_gguf.py
src/llama.cpp