git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Xuan-Son Nguyen <redacted>
	Sun, 11 Jan 2026 11:53:33 +0000 (12:53 +0100)
committer	GitHub <redacted>
	Sun, 11 Jan 2026 11:53:33 +0000 (12:53 +0100)
commit	506bb6e01009058f35558474cf987eeb56361782
tree	08e7a99e6dbbc196061ddf4d91d19283fdfc73f2
parent	79456a690ae35cb2a75faf24d3b1926f716b0485

model: try to improve Qwen3 Next (#18683)

* qwen3next: simplify qkvz projection

* use ggml_swiglu_split

* revert swiglu_split, but remove redundant repeat()

* fix missing reshape

* rm 2 redundant transposes

* move mul_mat(k,q) to outside of chunking

* rm redundant cont

* improve g_cs_chunk

* add comments about no cont

* use std::pair instead of ggml_concat

* vectorize key_gdiff calculation

* rm unused tensor

* avoid ggml_concat inside loop

* bring back ggml_concat as it may not work on other backend

* nits

convert_hf_to_gguf.py
gguf-py/gguf/constants.py
src/llama-arch.cpp
src/llama-model.cpp
src/models/models.h
src/models/qwen3next.cpp