git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Optimization: Qwen3 next autoregressive pass (#17996)
author Piotr Wilkin (ilintar) <redacted>
Tue, 16 Dec 2025 10:59:53 +0000 (11:59 +0100)
committer GitHub <redacted>
Tue, 16 Dec 2025 10:59:53 +0000 (11:59 +0100)
commit a5251ca11d2317d93a7b6da4217483f4e83beb3d
tree 8074f45d8b28b1c0aa8a62030098726c5fbdee7c
parent fb644247de14c616b10deb5e6b17e6f4230f0601
Optimization: Qwen3 next autoregressive pass (#17996)

* It's Qwen3 Next, the lean mean token generation machine!

* Apply patches from thread

* Remove recurrent version, only keep chunked and autoregressive

* Remove unnecessary conts and asserts

* Remove more extra conts and asserts

* Cleanup masking
src/models/models.h
src/models/qwen3next.cpp