git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
models : optimize qwen3next graph (llama/19375)
author Georgi Gerganov <redacted>
Sat, 14 Feb 2026 10:57:36 +0000 (12:57 +0200)
committer Georgi Gerganov <redacted>
Sun, 15 Feb 2026 19:44:37 +0000 (21:44 +0200)
commit 4ac70ce791baedf27cd14f36313b8056a0fe45a8
tree 8bd720e57882215124cf8faa2ef46b0f47b12591
parent 226e8c041c720e1293b519ce928487c702fff5c7
models : optimize qwen3next graph (llama/19375)

* models : optimizing qwen3next graph

* cont

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* cont : remove redundant q, g chunking

* minor

* minor

* avoid passing masks around

* avoid concats during chunking

* naming + shapes

* update names and use prefix to disable CUDA graphs
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-metal/ggml-metal-common.cpp