git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Mikko Juola <redacted>
	Sun, 15 Jun 2025 07:52:06 +0000 (00:52 -0700)
committer	GitHub <redacted>
	Sun, 15 Jun 2025 07:52:06 +0000 (09:52 +0200)
commit	9ae4143bc6ecb4c2f0f0301578f619f6c201b857
tree	46d6bb1851d3f515508a208286764cd0f47fb39b	tree
parent	c311ac664d68d10781a3e7b9f02d9d9520837d80	commit \| diff

model : add dots.llm1 architecture support (#14044) (#14118)

Adds:

* Dots1Model to convert_hf_to_gguf.py

* Computation graph code to llama-model.cpp

* Chat template to llama-chat.cpp to detect this model's template.

---

The model is called "dots.llm1" (I decided to shorten it to dots1 or
DOTS1 in the code generally) architecture.

The only models that exist as of writing of this commit that follow this
architecture are "dots.llm1.inst" and "dots.llm1.base" from here:

* https://huggingface.co/rednote-hilab/dots.llm1.inst

* https://huggingface.co/rednote-hilab/dots.llm1.base

The model architecture is a combination of Qwen and Deepseek parts, as
seen here:

https://github.com/huggingface/transformers/blob/ffe12627b4e84489d2ab91dd0ec00614855edc79/src/transformers/models/dots1/modular_dots1.py

convert_hf_to_gguf.py		diff \| blob \| history
gguf-py/gguf/constants.py		diff \| blob \| history
gguf-py/gguf/tensor_mapping.py		diff \| blob \| history
src/llama-arch.cpp		diff \| blob \| history
src/llama-arch.h		diff \| blob \| history
src/llama-chat.cpp		diff \| blob \| history
src/llama-chat.h		diff \| blob \| history
src/llama-model.cpp		diff \| blob \| history
src/llama-model.h		diff \| blob \| history