llama : Support llama 4 text-only (#12791)
author     Xuan-Son Nguyen <redacted>
Mon, 7 Apr 2025 21:06:44 +0000 (23:06 +0200)
committer  GitHub <redacted>
Mon, 7 Apr 2025 21:06:44 +0000 (23:06 +0200)
commit     1466621e738779eefe1bb672e17dc55d63d166bb
tree       414f66ff30d3c00e121c6db75dd7e009480b0dca
parent     82974011f312057b446c27267105bd7ad3810599
llama : Support llama 4 text-only (#12791)

* llama4 conversion

* initial support, no chat template

* clean up a bit

* fix tokenizer conversion

* correct hparams

* try this

* fix shexp

* ffn_inp_normed

* chat template

* clean up model conversion

* add_bos

* add scale_before_ffn

* fix order

* weight_before_ffn

* llm_graph_input_attn_temp

* add chunk attn mask

* build_inp_attn_scale()

* add comment about ggml_repeat

* clarify comments

* fix build
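
For context on the items above: llm_graph_input_attn_temp / build_inp_attn_scale() feed a per-position attention temperature into the graph, and the chunk attention mask restricts each token to its local chunk of the context. The following is a minimal standalone sketch of both ideas, not the code added by this commit; the floor_scale = 8192, attn_scale = 0.1, and chunk size = 8192 values are assumptions taken from Llama 4's published configuration.

// Minimal sketch (not the commit's code) of the two Llama 4 attention inputs
// referenced above. Hyperparameters are assumptions from the model config:
// floor_scale = 8192, attn_scale = 0.1, attention chunk size = 8192.
#include <cmath>
#include <cstdio>

// Per-position attention temperature: grows logarithmically with position.
static float attn_temp(int pos, float floor_scale = 8192.0f, float attn_scale = 0.1f) {
    return std::log(std::floor((pos + 1.0f) / floor_scale) + 1.0f) * attn_scale + 1.0f;
}

// Chunked causal mask: token i may attend to token j only if j is not in the
// future and both positions fall in the same chunk of n_chunk tokens.
static bool chunk_mask(int i, int j, int n_chunk = 8192) {
    return j <= i && (i / n_chunk) == (j / n_chunk);
}

int main() {
    for (int pos : {0, 8192, 65536, 1048576}) {
        std::printf("pos=%8d  attn_temp=%.4f\n", pos, attn_temp(pos));
    }
    std::printf("chunk_mask(10000,  100) = %d\n", chunk_mask(10000,  100)); // different chunks -> 0
    std::printf("chunk_mask(10000, 9000) = %d\n", chunk_mask(10000, 9000)); // same chunk      -> 1
    return 0;
}
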
17 files changed:
convert_hf_to_gguf.py
convert_hf_to_gguf_update.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
include/llama.h
models/ggml-vocab-llama4.gguf.inp [new file with mode: 0644]
models/ggml-vocab-llama4.gguf.out [new file with mode: 0644]
src/llama-arch.cpp
src/llama-arch.h
src/llama-chat.cpp
src/llama-chat.h
src/llama-graph.cpp
src/llama-graph.h
src/llama-hparams.h
src/llama-model.cpp
src/llama-model.h
src/llama-vocab.cpp