git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Phillip Kravtsov <redacted>
	Sat, 7 Oct 2023 07:12:43 +0000 (00:12 -0700)
committer	GitHub <redacted>
	Sat, 7 Oct 2023 07:12:43 +0000 (10:12 +0300)
commit	0e797c2fc571b866090f7d60ac7d39d8533593f2
tree	b8ff2f66c016b6d714ca68d56c60eeab8b3101ee
parent	3a716b4dae545c3db307594fbc509a95d3e21b6e

llm : support Adept Persimmon 8B (#3410)

* Produces garbage output

* wip: correct tensors up to RoPE

* correct tensors thru RoPE

* Correct outputs through masked & softmax'd KQ

* fp32 works

* Rename adept->persimmon

* Produces correct outputs

* clean up convert scripts

* remove printing logic from ggml.c

* remove prints from llama.cpp & fix merge

* trivial cleanups

* Add offload funcs

* update conversion script to directly take adept artifacts rather than .safetensors file

* Fix norm eps bug

* Support sqr and concat on metal, persimmon-8b-q4 runs correctly

* Small changes from review

* Formatting changes

* Minor changes to conversion script

* Remove old script

* Fix editorconfig formatting

* Fix build

* add overlooked offload code ggml-ci

convert-persimmon-to-gguf.py	[new file with mode: 0644]
ggml-metal.m
ggml-metal.metal
gguf-py/gguf/gguf.py
llama.cpp