llama : add phi-2 + fix NeoX rope + ggml_mul_mat_set_prec (#4490)
* phi2 implementation
* fix breaking change
* phi-2 : various fixes
* phi-2 : use layer norm eps
* py : whitespaces
* llama : fix meta KV override bug
* convert : do not add BOS token for phi
* convert : revert "added_tokens_decoder" change
* phi-2 : scale Q instead of KQ for better precision (see the attention sketch after this list)
* ggml : fix NeoX rope to rotate just the first n_dims (see the RoPE sketch after this list)
* cuda : less diff in the rope_neox kernel
* ggml : add ggml_mul_mat_set_prec (usage shown in the attention sketch after this list)
ggml-ci
* Update ggml-cuda.cu
Co-authored-by: slaren <redacted>
* Update ggml-cuda.cu
Co-authored-by: slaren <redacted>
* cuda : ggml_cuda_op_mul_mat_cublas support F32 precision
* cuda : remove obsolete comment
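
Illustration only, not part of the commit diff: a minimal sketch of how the Q-scaling and the new precision override fit together when building the attention graph with the public ggml API. Scaling Q before the K*Q matmul keeps the product in FP16 range instead of scaling an already-overflowed KQ result, and ggml_mul_mat_set_prec (added in this commit) requests F32 accumulation for that matmul. The helper name, tensor names, and the float-scalar ggml_scale signature are assumptions for the sketch.

    #include <math.h>
    #include "ggml.h"

    // hypothetical helper: build the scaled K*Q node for one attention head;
    // ctx, q, k and n_embd_head are assumed to come from the caller
    static struct ggml_tensor * build_kq(
            struct ggml_context * ctx,
            struct ggml_tensor  * q,
            struct ggml_tensor  * k,
            int                   n_embd_head) {
        // scale Q before the matmul so the F16 K*Q product does not overflow,
        // instead of scaling the KQ result afterwards
        struct ggml_tensor * q_scaled = ggml_scale(ctx, q, 1.0f/sqrtf((float) n_embd_head));

        // KQ = K * Q_scaled
        struct ggml_tensor * kq = ggml_mul_mat(ctx, k, q_scaled);

        // request F32 accumulation for this matmul (ggml_mul_mat_set_prec, added here)
        ggml_mul_mat_set_prec(kq, GGML_PREC_F32);

        return kq;
    }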
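
Illustration only, not the ggml/CUDA kernel itself: a self-contained reference of NeoX-style RoPE after the fix, where only the first n_dims elements of each head are rotated and the remaining head_dim - n_dims elements pass through unchanged. Parameter names are assumptions for the sketch.

    #include <math.h>
    #include <string.h>

    // reference NeoX RoPE for a single head at position `pos`:
    // rotate only src[0 .. n_dims), copy src[n_dims .. head_dim) unchanged
    static void rope_neox_ref(float * dst, const float * src,
                              int head_dim, int n_dims, int pos, float freq_base) {
        const int half = n_dims/2;

        for (int i = 0; i < half; ++i) {
            const float theta = pos * powf(freq_base, -2.0f*i/n_dims);
            const float c = cosf(theta);
            const float s = sinf(theta);

            // NeoX pairs element i with element i + n_dims/2
            const float x0 = src[i];
            const float x1 = src[i + half];

            dst[i]        = x0*c - x1*s;
            dst[i + half] = x0*s + x1*c;
        }

        // the tail beyond n_dims is not rotated (this is the fix)
        memcpy(dst + n_dims, src + n_dims, (head_dim - n_dims)*sizeof(float));
    }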
---------
Co-authored-by: Ebey Abraham <redacted>
Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: slaren <redacted>