llama : fix Gemma-2 Query scaling factors (#8473)
author Georgi Gerganov <redacted>
Sun, 14 Jul 2024 11:05:09 +0000 (14:05 +0300)
committer GitHub <redacted>
Sun, 14 Jul 2024 11:05:09 +0000 (14:05 +0300)
commit 73cf442e7bc9baca7b3e213b261551812f1676c9
tree 1cc35c5276ffb176563d93d9b1ece9c46f31f67f
parent e236528e7628a0e59751eee9addf21fc3c33d376
llama : fix Gemma-2 Query scaling factors (#8473)

* 9B - query_pre_attn_scalar = 256 not 224

See https://github.com/google/gemma_pytorch/commit/03e657582d17cb5a8617ebf333c1c16f3694670e

Gemma-2 9B should use 256, not 224 (224 is what self.config.hidden_size // self.config.num_attention_heads yields, but the model's head dimension is set explicitly to 256)
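
The distinction above can be sketched as follows. This is a minimal illustration, not llama.cpp code; the Gemma-2 9B config values (hidden_size=3584, 16 attention heads, head_dim=256) are assumptions based on the linked gemma_pytorch fix:

```python
import math

# Assumed Gemma-2 9B config values (see the linked gemma_pytorch commit)
hidden_size = 3584
num_attention_heads = 16
head_dim = 256  # set explicitly in the config, not derived from hidden_size

# Old (incorrect) scalar: derived as hidden_size // num_attention_heads
wrong_scalar = hidden_size // num_attention_heads   # 224

# Corrected scalar: the actual per-head dimension
query_pre_attn_scalar = head_dim                    # 256

# Queries are scaled by 1/sqrt(query_pre_attn_scalar) before Q·K^T
scale = 1.0 / math.sqrt(query_pre_attn_scalar)
print(wrong_scalar, query_pre_attn_scalar, scale)
```

Because 3584 // 16 = 224 while the configured head dimension is 256, deriving the scalar from hidden_size silently produces the wrong attention scaling for this model.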

* llama : fix Gemma-2 Query scaling factor

ggml-ci

---------

Co-authored-by: Daniel Han <redacted>
convert_hf_to_gguf.py
src/llama.cpp