llama : support attention bias on LLaMA architecture (#4283)
author    CausalLM <redacted>
          Fri, 1 Dec 2023 18:17:06 +0000 (02:17 +0800)
committer GitHub <redacted>
          Fri, 1 Dec 2023 18:17:06 +0000 (20:17 +0200)
commit    03562f3a86d6706eea9f4fc09b532946c191b34e
tree      709378616d9e23c4fb098dc61c7659b32e8740a4
parent    37c746d687d877bc11803e96b4dc5f378b83c0a0
llama : support attention bias on LLaMA architecture (#4283)

* Support attention_bias on LLaMA architecture

Adds QKVO bias; this should fix InternLM (https://github.com/ggerganov/llama.cpp/issues/3133) and also works for LLaMAfied Qwen models (https://github.com/ggerganov/llama.cpp/pull/3743#issuecomment-1825923608).
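The idea can be sketched as follows: each of the Q/K/V/O projections gains an optional bias term that is added only when the model actually ships the corresponding tensor. This is an illustrative stand-alone sketch, not the actual llama.cpp/ggml code; the `Tensor` struct and function name are assumptions for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a model tensor that may or may not be present.
struct Tensor {
    std::vector<float> data;
    bool present = false;
};

// y = W*x (stubbed here as identity), plus the bias when the tensor exists.
// Models like InternLM and Qwen carry bq/bk/bv/bo; vanilla LLaMA does not,
// so the add must be conditional.
std::vector<float> project_with_optional_bias(const std::vector<float> &x,
                                              const Tensor &bias) {
    std::vector<float> y = x; // placeholder for the Wq/Wk/Wv/Wo matmul
    if (bias.present) {
        for (size_t i = 0; i < y.size(); ++i) {
            y[i] += bias.data[i];
        }
    }
    return y;
}
```

In the real graph this corresponds to wrapping the projection result in an extra add node only when the bias tensor was loaded.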

* Check existence of the QKVO bias tensors while loading LLaMA models

Tested on LLaMA-2, with both the CUDA and CPU backends.
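The load-time check can be sketched as a by-name lookup that treats each bias tensor as optional instead of failing when it is absent. A minimal sketch under assumptions: the `TensorMap` store and helper names are invented for the example, and the `blk.N.attn_*.bias` naming mirrors GGUF-style tensor names.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical tensor store keyed by GGUF-style tensor names.
using TensorMap = std::map<std::string, std::vector<float>>;

// Return a pointer to the tensor if present, nullptr otherwise,
// so a missing bias is not an error.
const std::vector<float> *find_optional(const TensorMap &weights,
                                        const std::string &name) {
    auto it = weights.find(name);
    return it == weights.end() ? nullptr : &it->second;
}

// Per-layer attention bias slots; any of them may stay null.
struct LayerBias {
    const std::vector<float> *bq, *bk, *bv, *bo;
};

LayerBias load_layer_bias(const TensorMap &weights, int il) {
    const std::string p = "blk." + std::to_string(il) + ".";
    return {
        find_optional(weights, p + "attn_q.bias"),
        find_optional(weights, p + "attn_k.bias"),
        find_optional(weights, p + "attn_v.bias"),
        find_optional(weights, p + "attn_output.bias"),
    };
}
```

A null slot simply means the build step skips the corresponding bias add, which keeps plain LLaMA checkpoints loading unchanged.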

* Update llama.cpp