convert : support Qwen3.5/Qwen3.5 MoE NVFP4 and add input scales (#20505)
author Michael Wand <redacted>
Thu, 26 Mar 2026 15:52:06 +0000 (08:52 -0700)
committer GitHub <redacted>
Thu, 26 Mar 2026 15:52:06 +0000 (16:52 +0100)
commit f8d4abae86740bed849c1d2a664dc4f56e35ff0a
tree 68f8a36beece4d8b8e380c813a31d867cccbeb21
parent 3d5acab3e774c3d30748d1e60093f19f0c80506e

* convert : fix Qwen3.5 NVFP4 conversion

* Addressed Copilot review comments and rebased

* move into _LinearAttentionVReorderBase and simplify

* --flake

* new_name not needed

* Added input_scale to gguf

* Fixed input_scale addition as tensor

* Added input scale to loader and named _in_s

* Update convert_hf_to_gguf.py

Re-removed input_scale from aux cleanup

Co-authored-by: Sigbjørn Skjæret <redacted>
---------

Co-authored-by: Sigbjørn Skjæret <redacted>
convert_hf_to_gguf.py
src/llama-model.cpp
src/llama-model.h
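For context, NVFP4 stores weights as 4-bit E2M1 codes combined with per-block and per-tensor scales, and the input scales added by this commit are separate calibration factors applied to activations. A minimal sketch of the dequantization arithmetic, assuming the standard E2M1 value table; the function and variable names are illustrative only, not llama.cpp's actual implementation:

```python
# Magnitudes of the eight non-negative FP4 (E2M1) codes: 2 exponent bits, 1 mantissa bit.
FP4_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def dequantize_nvfp4(code, block_scale, tensor_scale):
    """Decode one 4-bit NVFP4 code: bit 3 is the sign, bits 0-2 index the
    E2M1 magnitude table. The result is scaled by a per-block scale and a
    per-tensor scale, both stored alongside the quantized weights.
    (Illustrative sketch; input_scale is a further per-tensor factor applied
    to activations, not to the stored weights.)"""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * FP4_VALUES[code & 0x7] * block_scale * tensor_scale

# Example: code 0x7 (magnitude 6.0) with block scale 2.0 yields 12.0;
# code 0xF is the same magnitude with the sign bit set.
print(dequantize_nvfp4(0x7, 2.0, 1.0))  # 12.0
print(dequantize_nvfp4(0xF, 1.0, 0.5))  # -3.0
```

This is why the conversion must carry the scales through to the GGUF file: without the block and tensor scales (and the input scales for activations), the raw 4-bit codes alone cannot be mapped back to usable weight values.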