git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Giuseppe Scrivano <redacted>
	Tue, 28 May 2024 18:49:49 +0000 (20:49 +0200)
committer	GitHub <redacted>
	Tue, 28 May 2024 18:49:49 +0000 (21:49 +0300)
commit	5442939fcc5e6ae41abf40612a95fd71377e487e
tree	1402af1bf61b8a110252b748b9d453e09946d5cf
parent	56411a950f255b523a9edd684fd1632752474399

llama : support small Granite models (#7481)

* Add optional MLP bias for Granite models

Add optional MLP bias for ARCH_LLAMA to support Granite models.
Partially addresses ggerganov/llama.cpp/issues/7116.
More changes are still needed to properly support Granite.
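
A minimal sketch of what an optional MLP bias means for a LLaMA-style gated
feed-forward block. This is illustrative Python, not the llama.cpp code: the
function and parameter names are hypothetical, and the assumption that each of
the three projections may carry its own bias is not stated in this message.

```python
import math

def linear(x, weight, bias=None):
    """Compute y = W x, adding b only when a bias vector is supplied."""
    y = [sum(w * v for w, v in zip(row, x)) for row in weight]
    if bias is not None:
        y = [y_i + b_i for y_i, b_i in zip(y, bias)]
    return y

def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def gated_mlp(x, w_gate, w_up, w_down, b_gate=None, b_up=None, b_down=None):
    """Gated feed-forward block: down(silu(gate(x)) * up(x)).

    Every bias defaults to None, so a model without bias tensors takes
    exactly the same path as before; a model that ships them gets them added.
    """
    gate = [silu(v) for v in linear(x, w_gate, b_gate)]
    up = linear(x, w_up, b_up)
    hidden = [g * u for g, u in zip(gate, up)]
    return linear(hidden, w_down, b_down)
```

Keeping the biases optional lets one code path serve both kinds of model,
which is presumably why the change is made to ARCH_LLAMA rather than to a
separate architecture.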

* llama: honor add_space_prefix from the model configuration

Propagate the add_space_prefix setting from the HF model
configuration to the GGUF file and honor it in the gpt2 tokenizer.
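
A rough sketch of the propagation step on the conversion side. It assumes the
setting lives in tokenizer_config.json under the key "add_prefix_space" and is
stored in the GGUF metadata as "tokenizer.ggml.add_space_prefix"; both names,
like the helper functions and the plain dict standing in for the GGUF writer,
are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Optional

def read_add_space_prefix(model_dir: Path) -> Optional[bool]:
    """Return the tokenizer's prefix-space setting, or None if it is absent."""
    config_path = model_dir / "tokenizer_config.json"  # assumed file name
    if not config_path.is_file():
        return None
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)
    value = config.get("add_prefix_space")  # assumed key name
    return None if value is None else bool(value)

def apply_to_metadata(metadata: dict, model_dir: Path) -> dict:
    """Copy the setting into the metadata only when the model defines it."""
    value = read_add_space_prefix(model_dir)
    if value is not None:
        metadata["tokenizer.ggml.add_space_prefix"] = value  # assumed key
    return metadata
```

Writing the key only when the model defines it leaves the tokenizer's
existing default in place for every other model.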

Signed-off-by: Giuseppe Scrivano <redacted>

* llama: add support for small Granite models

This works only for the small models (3b and 8b).

The convert-hf-to-gguf.py script uses the vocabulary size of the
Granite models to detect Granite and set the correct configuration.
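
The detection described above could look roughly like the sketch below. The
vocabulary size constant and the override it triggers are assumptions chosen
for illustration; this message states neither the size nor which settings
change, and the function names are hypothetical.

```python
# Assumed vocabulary size of the small Granite models (not stated above).
GRANITE_SMALL_VOCAB_SIZE = 49152

def looks_like_small_granite(hparams: dict) -> bool:
    """Heuristic: treat a model as small Granite if its vocab size matches."""
    return hparams.get("vocab_size") == GRANITE_SMALL_VOCAB_SIZE

def tokenizer_overrides(hparams: dict) -> dict:
    """Return model-specific tokenizer settings; empty for other models."""
    if looks_like_small_granite(hparams):
        # Hypothetical override, shown only to illustrate the mechanism.
        return {"add_bos_token": False}
    return {}
```

Matching on vocabulary size is a heuristic: it needs no new architecture name,
but any unrelated model with the same vocabulary size would be treated as
Granite too.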

Signed-off-by: Giuseppe Scrivano <redacted>
---------

Signed-off-by: Giuseppe Scrivano <redacted>
Co-authored-by: Steffen Roecker <redacted>

convert-hf-to-gguf.py
llama.cpp