update: support Qwen2-57B-A14B (#7835)
author Ștefan-Gabriel Muscalu <redacted>
Mon, 17 Jun 2024 19:08:46 +0000 (22:08 +0300)
committer GitHub <redacted>
Mon, 17 Jun 2024 19:08:46 +0000 (21:08 +0200)
commit a94e6ff8774b7c9f950d9545baf0ce35e8d1ed2f
tree abfa71d6bf6b3743185ead9f9c337c80c49acc04
parent 5b6da187508f49a9fa9d95fa22ae804a0780d256
update: support Qwen2-57B-A14B (#7835)

* update: convert-hf-to-gguf.py to support Qwen2-57B-A14B

* fix: QWEN2MOE support for expert_feed_forward_length

previously, the expert FF length was taken from n_ff (the dense intermediate size), but it is now properly read from LLM_KV_EXPERT_FEED_FORWARD_LENGTH

n_ff_exp and n_ff_shexp are now calculated correctly
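The fix above amounts to preferring the MoE-specific hyperparameters over the dense intermediate size when exporting the model. A minimal Python sketch of that fallback logic, assuming HF Qwen2MoE-style config field names (`moe_intermediate_size` and `shared_expert_intermediate_size` are assumed key names; the numbers in the example are illustrative only, not taken from any real checkpoint):

```python
# Hypothetical sketch: derive the per-expert and shared-expert FFN widths
# from an HF-style hparams dict, falling back to the dense n_ff only when
# the MoE-specific keys are absent.

def expert_ff_lengths(hparams: dict) -> tuple[int, int]:
    n_ff = hparams["intermediate_size"]
    # per-expert FFN width -> written as LLM_KV_EXPERT_FEED_FORWARD_LENGTH
    n_ff_exp = hparams.get("moe_intermediate_size", n_ff)
    # shared-expert FFN width -> n_ff_shexp
    n_ff_shexp = hparams.get("shared_expert_intermediate_size", n_ff)
    return n_ff_exp, n_ff_shexp

# illustrative numbers only
print(expert_ff_lengths({"intermediate_size": 5632,
                         "moe_intermediate_size": 1408,
                         "shared_expert_intermediate_size": 5632}))
```

With the old behaviour, both widths would silently equal `intermediate_size`; the fallback only triggers now when the MoE keys are genuinely missing.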

* update: convert-hf-to-gguf.py cleanup for Qwen2MoeForCausalLM

convert-hf-to-gguf.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
llama.cpp