git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
llm : support Adept Persimmon 8B (#3410)
author Phillip Kravtsov <redacted>
Sat, 7 Oct 2023 07:12:43 +0000 (00:12 -0700)
committer GitHub <redacted>
Sat, 7 Oct 2023 07:12:43 +0000 (10:12 +0300)
commit 0e797c2fc571b866090f7d60ac7d39d8533593f2
tree b8ff2f66c016b6d714ca68d56c60eeab8b3101ee
parent 3a716b4dae545c3db307594fbc509a95d3e21b6e
llm : support Adept Persimmon 8B (#3410)

* Produces garbage output

* wip: correct tensors up to RoPE

* correct tensors thru RoPE

* Correct outputs through masked & softmax'd KQ

* fp32 works

* Rename adept->persimmon

* Produces correct outputs

* clean up convert scripts

* remove printing logic from ggml.c

* remove prints from llama.cpp & fix merge

* trivial cleanups

* Add offload funcs

* update conversion script to directly take adept artifacts rather than .safetensors file

* Fix norm eps bug

* Support sqr and concat on metal, persimmon-8b-q4 runs correctly

* Small changes from review

* Formatting changes

* Minor changes to conversion script

* Remove old script

* Fix editorconfig formatting

* Fix build

* add overlooked offload code ggml-ci
convert-persimmon-to-gguf.py [new file with mode: 0644]
ggml-metal.m
ggml-metal.metal
gguf-py/gguf/gguf.py
llama.cpp