git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Add LLaDA 8b Diffusion model (#14771)
author Aman Gupta <redacted>
Thu, 31 Jul 2025 11:49:09 +0000 (19:49 +0800)
committer GitHub <redacted>
Thu, 31 Jul 2025 11:49:09 +0000 (19:49 +0800)
commit 8a4a85627702b569d7d2810f2de06a4321656e9d
tree b538eb466adbcb030d63483fde2739050ee91b8a
parent 11490b36723d511d75fb601995c79b5c363ba3a2
* Add support for LLaDA-8b: diffusion model

* Add README

* Fix README and convert_hf_to_gguf

* convert_hf_to_gguf.py: address review comments

* Consolidate everything into a single example

* Remove model-specific sampling

* Remove unused argmax

* Remove braced initializers, improve README.md a bit

* Add diffusion-specific gguf params in set_vocab, remove setting rope_theta and rms_norm_eps

* Remove adding the mask token

* Move add_add_bos_token to set_vocab

* Use add_bool in gguf_writer.py
12 files changed:
common/arg.cpp
common/common.h
convert_hf_to_gguf.py
examples/diffusion/README.md [new file with mode: 0644]
examples/diffusion/diffusion-cli.cpp
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
gguf-py/gguf/tensor_mapping.py
include/llama.h
src/llama-arch.cpp
src/llama-arch.h
src/llama-model.cpp