git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
llama : Add support for DeepSeek V3 (#11049)
author fairydreaming <redacted>
Sat, 4 Jan 2025 20:06:11 +0000 (21:06 +0100)
committer GitHub <redacted>
Sat, 4 Jan 2025 20:06:11 +0000 (21:06 +0100)
commit 9394bbd484f802ce80d2858033583af3ef700d25
tree a4bcd1da4d11d3556d7f369f0d864d731445d55d
parent f922a9c542ee117550a168395c63ea79261f5c99

* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and FFN_EXP_PROBS_B tensor type

* vocab : add DeepSeek V3 pre-tokenizer regexes

* unicode : handle ACCENT_MARK and SYMBOL categories in regex

* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

---------

Co-authored-by: Stanisław Szymczyk <redacted>
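
For readers unfamiliar with the new parameters, the following is a minimal sketch of how DeepSeek V3 style expert routing is commonly described, not code taken from this commit. It assumes that EXPERT_GATING_FUNC selects sigmoid scoring, that FFN_EXP_PROBS_B is a per-expert bias used only when choosing experts, and that EXPERT_WEIGHTS_NORM turns on normalization of the selected weights. The function name `route_experts` and all of its parameters are hypothetical, and group-limited expert selection is left out for brevity.

```python
import math

def route_experts(logits, selection_bias, top_k, normalize=True, scale=1.0):
    """Pick top_k experts for one token; returns (indices, weights).

    Hypothetical illustration only: the names here do not come from llama.cpp.
    """
    # Sigmoid gating: each expert gets an independent score in (0, 1).
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # Assumption: the bias influences only WHICH experts are selected ...
    biased = [p + b for p, b in zip(probs, selection_bias)]
    chosen = sorted(range(len(probs)), key=lambda i: biased[i], reverse=True)[:top_k]
    # ... while the mixing weights are taken from the unbiased scores.
    weights = [probs[i] for i in chosen]
    if normalize:
        # Rescale the selected weights so they sum to 1.
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen, [w * scale for w in weights]

# Expert 2 has the lowest raw score but is still selected because of its bias.
indices, weights = route_experts(
    logits=[0.0, 2.0, -1.0, 1.0],
    selection_bias=[0.0, 0.0, 3.0, 0.0],
    top_k=2,
)
```

Keeping the bias out of the final weights means it can steer load across experts without changing how strongly each selected expert contributes to the output.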
16 files changed:
convert_hf_to_gguf.py
convert_hf_to_gguf_update.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
gguf-py/gguf/tensor_mapping.py
include/llama.h
src/llama-arch.cpp
src/llama-arch.h
src/llama-chat.cpp
src/llama-chat.h
src/llama-hparams.h
src/llama-model.cpp
src/llama-model.h
src/llama-vocab.cpp
src/llama.cpp
src/unicode.cpp