author    forforever73 <redacted>
Fri, 6 Feb 2026 20:06:14 +0000 (04:06 +0800)
committer GitHub <redacted>
Fri, 6 Feb 2026 20:06:14 +0000 (21:06 +0100)
commit    b83111815e9a79949257e9d4b087206b320a3063
tree      bf4a72540cf16a01fb5d07c1c8884afaea329f04
parent    3228e7728789e0456d0458ce38d20d0b1d60a9aa
model : support Step3.5-Flash (#19283)

* Support Step3.5-Flash

* fix: norm.weight + 1 (HF zero_centered=true; see the conversion sketch after this list)

* step35: simplify GGUF conversion + drop redundant rope KVs

* Address review feedback

* rename limits -> clamp (the clamped SwiGLU is sketched after this list)

* Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <redacted>
* Apply suggestion from @CISC

Co-authored-by: Sigbjørn Skjæret <redacted>
* rename swiglu limits -> swiglu clamp in LLM_KV

* avoid CI failure

* Apply suggestions from code review

* Apply suggestions from code review

* disable KV shifting for LLM_ARCH_STEP35

* Apply suggestions from code review

* restore mistakenly removed cmath include

* add model size && apply missed suggestion

* assert partial_rotary_factors (see the RoPE sketch after this list)

* fix CI errors

* load freq_base_swa (see the sketch after this list)
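
The norm.weight + 1 fix above compensates for HF's zero-centered RMSNorm, which stores gamma - 1 as the weight. A minimal conversion sketch, assuming NumPy tensors; the function name is illustrative, not the actual convert_hf_to_gguf.py code:

    import numpy as np

    def un_zero_center_norm(weight: np.ndarray) -> np.ndarray:
        # HF zero_centered=true stores gamma - 1; llama.cpp expects the
        # plain gamma, so add 1 back when writing the GGUF tensor
        return weight + 1.0

    # an all-zero stored weight becomes the identity scale of all ones
    print(un_zero_center_norm(np.zeros(4, dtype=np.float32)))  # [1. 1. 1. 1.]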
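
The limits -> clamp renames above concern a clamped SwiGLU FFN. A hedged NumPy sketch of one plausible scheme, bounding the projections before the usual silu(gate) * up combine; the exact bounds Step3.5-Flash uses are defined in the PR, not assumed here:

    import numpy as np

    def swiglu_clamped(gate: np.ndarray, up: np.ndarray, clamp: float) -> np.ndarray:
        # bound both projections, then apply SiLU(x) = x * sigmoid(x)
        gate = np.clip(gate, -clamp, clamp)
        up = np.clip(up, -clamp, clamp)
        return (gate / (1.0 + np.exp(-gate))) * up

    x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0], dtype=np.float32)
    print(swiglu_clamped(x, x, clamp=7.0))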
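
The partial_rotary_factors assert guards partial RoPE, where only a leading slice of each head is rotated. An illustrative check, assuming n_rot is derived as head_dim * partial_rotary_factor (names are hypothetical):

    def n_rot_from_factor(head_dim: int, partial_rotary_factor: float) -> int:
        n_rot = int(head_dim * partial_rotary_factor)
        # RoPE rotates dimensions in pairs, so the slice must be even
        # and must fit inside the head; fail loudly at conversion time
        assert 0 < n_rot <= head_dim and n_rot % 2 == 0, \
            f"bad partial_rotary_factor {partial_rotary_factor} for head_dim {head_dim}"
        return n_rot

    print(n_rot_from_factor(128, 0.5))  # -> 64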
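
Loading freq_base_swa suggests the sliding-window layers carry their own RoPE base frequency, separate from the global one. A hypothetical per-layer selection, with all names illustrative:

    def rope_freq_base_for_layer(is_swa_layer: bool,
                                 freq_base: float,
                                 freq_base_swa: float) -> float:
        # iSWA models interleave full-attention and sliding-window layers;
        # the latter use their own base instead of the global one
        return freq_base_swa if is_swa_layer else freq_base

    print(rope_freq_base_for_layer(True, freq_base=1e6, freq_base_swa=1e4))  # 10000.0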

---------

Co-authored-by: lvyichen <redacted>
Co-authored-by: Sigbjørn Skjæret <redacted>
15 files changed:
convert_hf_to_gguf.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
gguf-py/gguf/tensor_mapping.py
src/CMakeLists.txt
src/llama-arch.cpp
src/llama-arch.h
src/llama-graph.cpp
src/llama-hparams.h
src/llama-kv-cache-iswa.cpp
src/llama-kv-cache.cpp
src/llama-model.cpp
src/llama-model.h
src/models/models.h
src/models/step35-iswa.cpp [new file with mode: 0644]