git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	liuwei-git <redacted>
	Tue, 21 May 2024 20:28:32 +0000 (04:28 +0800)
committer	GitHub <redacted>
	Tue, 21 May 2024 20:28:32 +0000 (23:28 +0300)
commit	201cc11afa0a1950e1f632390b2ac6c937a0d8f0
tree	440fb7ecd80b48772a955a80855db29677d172a2
parent	6369bf04336ab60e5c892dd77a3246df91015147

llama : add phi3 128K model support (#7225)

* add phi3 128k support in convert-hf-to-gguf

* add phi3 128k support in cuda

* address build warnings on llama.cpp

* adjust index value in cuda long rope freq factors

* add long rope support in ggml cpu backend

* make freq factors only depend on ctx size

* remove unused rope scaling type 'su' from gguf converter

* fix lint warnings on convert-hf-to-gguf.py

* set to the short freq factor when context size is smaller than trained context size

* add one line of comments

* metal : support rope freq_factors

* ggml : update ggml_rope_ext API to support freq. factors

* backends : add dev messages to support rope freq. factors

* minor : style

* tests : update to use new rope API

* backends : fix pragma semicolons

* minor : cleanup

* llama : move rope factors from KV header to tensors

* llama : remove tmp assert

* cuda : fix compile warning

* convert : read/write n_head_kv

* llama : fix uninitialized tensors

---------

Co-authored-by: Georgi Gerganov <redacted>
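
The frequency-factor bullets in the message above can be read roughly as follows: each rotary dimension pair gets its own scaling factor, and the short or long set of factors is picked from the context size. The sketch below is illustrative Python, not code from this commit. All function and parameter names are hypothetical, it assumes the factor divides the per-pair base angle, and it assumes a context exactly equal to the trained size still uses the short factors.

```python
import math


def select_freq_factors(n_ctx, n_ctx_train, short_factors, long_factors):
    """Return the short factors unless the requested context exceeds the
    trained context size (the equal case is an assumption, see lead-in)."""
    return long_factors if n_ctx > n_ctx_train else short_factors


def rope_angles(pos, n_dims, freq_base=10000.0, freq_factors=None):
    """Rotation angle for each of the n_dims // 2 dimension pairs at `pos`.

    Without factors this is plain RoPE: pos * freq_base^(-2i / n_dims).
    With factors, each angle is divided by its per-pair factor (assumed).
    """
    angles = []
    for i in range(n_dims // 2):
        theta = pos * freq_base ** (-2.0 * i / n_dims)
        factor = 1.0 if freq_factors is None else freq_factors[i]
        angles.append(theta / factor)
    return angles


def rotate_pair(x0, x1, angle):
    """Apply a 2D rotation by `angle` to one pair of values."""
    c, s = math.cos(angle), math.sin(angle)
    return x0 * c - x1 * s, x0 * s + x1 * c


if __name__ == "__main__":
    # Made-up factor values, for illustration only.
    short = [1.0, 1.0]
    long_ = [2.0, 4.0]
    factors = select_freq_factors(8192, 4096, short, long_)
    print(rope_angles(pos=1, n_dims=4, freq_factors=factors))
```

Keeping the factors as a per-pair array matches the bullet about moving them from the KV header into tensors: one value per rotated pair, so the array length is half the number of rotated dimensions.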

convert-hf-to-gguf.py
examples/finetune/finetune.cpp
examples/train-text-from-scratch/train-text-from-scratch.cpp
ggml-cuda/rope.cu
ggml-kompute.cpp
ggml-metal.m
ggml-metal.metal
ggml-sycl.cpp
ggml-vulkan.cpp
ggml.c
ggml.h
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
llama.cpp
tests/test-backend-ops.cpp