git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
author Richard Davison <redacted>
Mon, 16 Mar 2026 08:18:47 +0000 (09:18 +0100)
committer GitHub <redacted>
Mon, 16 Mar 2026 08:18:47 +0000 (09:18 +0100)
commit 079e5a45f0b62a1fb4c012a1a38e3f1e821dfb9b
tree bdeec41dde1b44ade3641c8d4ae51e1d54471d5e
parent d3936498a3d8f41cdb35d9e7d04a19e704b4fc89
convert : support mixed-precision ModelOpt models with per-tensor NVFP4/FP8 quantization (#20539)

* support mixed-precision ModelOpt models with per-tensor NVFP4/FP8 quantization

* cleanup

* fallback

---------

Co-authored-by: Sigbjørn Skjæret <redacted>
convert_hf_to_gguf.py