update: support Qwen2-57B-A14B (#7835)
author Ștefan-Gabriel Muscalu <redacted>
Mon, 17 Jun 2024 19:08:46 +0000 (22:08 +0300)
committer GitHub <redacted>
Mon, 17 Jun 2024 19:08:46 +0000 (21:08 +0200)
commit a94e6ff8774b7c9f950d9545baf0ce35e8d1ed2f
tree abfa71d6bf6b3743185ead9f9c337c80c49acc04
parent 5b6da187508f49a9fa9d95fa22ae804a0780d256
update: support Qwen2-57B-A14B (#7835)

* update: convert-hf-to-gguf.py to support Qwen2-57B-A14B

* fix: QWEN2MOE support for expert_feed_forward_length

previously, the expert FF length was taken from n_ff (the dense intermediate size), but it is now properly read from LLM_KV_EXPERT_FEED_FORWARD_LENGTH

n_ff_exp and n_ff_shexp are now calculated correctly
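The fix above amounts to preferring the MoE-specific hyperparameters over the dense intermediate size when exporting the model. A minimal Python sketch of that fallback logic, assuming HF Qwen2MoE-style config field names (`moe_intermediate_size` and `shared_expert_intermediate_size` are assumed key names; the numbers in the example are illustrative only, not taken from any real checkpoint):

```python
# Hypothetical sketch: derive the per-expert and shared-expert FFN widths
# from an HF-style hparams dict, falling back to the dense n_ff only when
# the MoE-specific keys are absent.

def expert_ff_lengths(hparams: dict) -> tuple[int, int]:
    n_ff = hparams["intermediate_size"]
    # per-expert FFN width -> written as LLM_KV_EXPERT_FEED_FORWARD_LENGTH
    n_ff_exp = hparams.get("moe_intermediate_size", n_ff)
    # shared-expert FFN width -> n_ff_shexp
    n_ff_shexp = hparams.get("shared_expert_intermediate_size", n_ff)
    return n_ff_exp, n_ff_shexp

# illustrative numbers only
print(expert_ff_lengths({"intermediate_size": 5632,
                         "moe_intermediate_size": 1408,
                         "shared_expert_intermediate_size": 5632}))
```

With the old behaviour, both widths would silently equal `intermediate_size`; the fallback only triggers now when the MoE keys are genuinely missing.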

* update: convert-hf-to-gguf.py cleanup for Qwen2MoeForCausalLM

convert-hf-to-gguf.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
llama.cpp