model : add support for SmallThinker series (#14898)
* support smallthinker
* support 20b softmax, 4b no sliding window
* add build_moe_ffn_from_probs; 4b now runs
* fix 4b rope bug
* fix python type check
* remove is_moe check
* remove set_dense_start_swa_pattern function and modify set_swa_pattern function
* trim trailing whitespace
* remove get_vocab_base of SmallThinkerModel in convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <redacted>
* better whitespace
Apply suggestions from code review
Co-authored-by: Sigbjørn Skjæret <redacted>
* use GGML_ASSERT for expert count validation
Co-authored-by: Sigbjørn Skjæret <redacted>
* Improve null pointer check for probs
Co-authored-by: Sigbjørn Skjæret <redacted>
* use template parameter for SWA attention logic
* better whitespace
Co-authored-by: Georgi Gerganov <redacted>
* move the creation of inp_out_ids before the layer loop
* remove redundant check for probs
---------
Co-authored-by: Sigbjørn Skjæret <redacted>
Co-authored-by: Georgi Gerganov <redacted>