git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
gemma : more consistent attention scaling for v2 and v3 (#13951)
author Georgi Gerganov <redacted>
Mon, 2 Jun 2025 17:54:26 +0000 (20:54 +0300)
committer GitHub <redacted>
Mon, 2 Jun 2025 17:54:26 +0000 (20:54 +0300)
commit 5582c49c3961269eca96822abfb87528e942dd07
tree b50ca66cf54e21732d88da27f9a91db8375ae62f
parent c9bbc77931d223ed7e7cbcf1cb057bc02fd0db19

* gemma : fix attn scale for 27B

* cont : apply scale before attn

* cont : consistent attention scaling
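
One reading of "apply scale before attn" is that the attention scale is multiplied into the query vector before the Q·K product, rather than into the logits afterwards. The sketch below only illustrates that the two orderings yield the same logit. It is not taken from this commit: the helper names (`dot`, `logit_scale_after`, `logit_scale_before`, `default_attn_scale`) are hypothetical, and the 1/sqrt(head_dim) factor is the usual textbook default, not necessarily the value any particular Gemma variant uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: dot product of one query row and one key row.
static float dot(const std::vector<float> & a, const std::vector<float> & b) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// Variant A: compute Q.K first, then scale the resulting logit.
static float logit_scale_after(const std::vector<float> & q,
                               const std::vector<float> & k,
                               float scale) {
    return scale * dot(q, k);
}

// Variant B: scale Q up front (q is taken by value so the caller's
// vector is untouched), then compute Q.K with no further scaling.
static float logit_scale_before(std::vector<float> q,
                                const std::vector<float> & k,
                                float scale) {
    for (float & x : q) {
        x *= scale;
    }
    return dot(q, k);
}

// Assumption for illustration only: the common 1/sqrt(head_dim) scale.
static float default_attn_scale(int head_dim) {
    return 1.0f / std::sqrt(static_cast<float>(head_dim));
}
```

Because the scale is a scalar, both variants agree up to floating-point rounding. Scaling Q once up front means a single per-model scale value can be chosen in one place and the attention code itself needs no model-specific branch.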
src/llama-model.cpp