git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Alfred <redacted>
	Fri, 19 Dec 2025 17:42:28 +0000 (12:42 -0500)
committer	GitHub <redacted>
	Fri, 19 Dec 2025 17:42:28 +0000 (09:42 -0800)
commit	ce734a8a2f9fb6eb4f0383ab1370a1b0014ab787
tree	ae0ec7dee029943ef6cfb3257e1bf40f08d7b1d7
parent	14931a826e4f5b4536f03f65c8d568d99bf64f0e

ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed-precision matmul operations (#17977)

* feat: implement real Q8_0

* feat: adding cmake option for configuring FP32 quantize group size

* typo: set() shall be used

---------

Co-authored-by: ngdxzy <redacted>
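For readers unfamiliar with the format named in the commit title, the following is a minimal reference sketch of Q8_0-style block quantization in plain C. It is not the Hexagon HTP code from this commit. It assumes the usual ggml convention of 32 values per block with one scale per block (scale = max |x| / 127). The names `block_q8_0_sketch`, `quantize_row_q8_0_sketch`, and `dequantize_row_q8_0_sketch` are hypothetical, and the scale is kept as a `float` for portability, whereas ggml stores it as fp16. The group size made configurable by the new CMake option is not modeled here; its name and default are not stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QK8_0 32 /* values per block, following ggml's Q8_0 convention */

/* Hypothetical simplified block layout: one scale plus 32 signed 8-bit values.
 * ggml stores the scale as fp16; float is used here to keep the sketch portable. */
typedef struct {
    float  d;          /* per-block scale */
    int8_t qs[QK8_0];  /* quantized values in [-127, 127] */
} block_q8_0_sketch;

/* Quantize n floats (n must be a multiple of QK8_0) into n / QK8_0 blocks. */
static void quantize_row_q8_0_sketch(const float *x, block_q8_0_sketch *y, size_t n) {
    assert(n % QK8_0 == 0);
    for (size_t b = 0; b < n / QK8_0; b++) {
        const float *xb = x + b * QK8_0;

        /* Find the largest magnitude in the block. */
        float amax = 0.0f;
        for (int i = 0; i < QK8_0; i++) {
            float a = xb[i] < 0.0f ? -xb[i] : xb[i];
            if (a > amax) amax = a;
        }

        /* The scale maps amax to 127; an all-zero block gets scale 0. */
        float d  = amax / 127.0f;
        float id = d != 0.0f ? 1.0f / d : 0.0f;
        y[b].d = d;

        /* Round to nearest, with halves rounded away from zero. */
        for (int i = 0; i < QK8_0; i++) {
            float v = xb[i] * id;
            y[b].qs[i] = (int8_t)(int)(v + (v >= 0.0f ? 0.5f : -0.5f));
        }
    }
}

/* Reconstruct floats from blocks: x = d * q. */
static void dequantize_row_q8_0_sketch(const block_q8_0_sketch *y, float *x, size_t n) {
    assert(n % QK8_0 == 0);
    for (size_t b = 0; b < n / QK8_0; b++) {
        for (int i = 0; i < QK8_0; i++) {
            x[b * QK8_0 + i] = y[b].d * (float)y[b].qs[i];
        }
    }
}
```

With this scheme the round-trip error of each value is bounded by about half the block scale, which is the accuracy property that 8-bit activation quantization relies on in mixed-precision matmul.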

Packaging of ggml-org/llama.cpp

docs/backend/hexagon/CMakeUserPresets.json
ggml/CMakeLists.txt
ggml/src/ggml-hexagon/CMakeLists.txt
ggml/src/ggml-hexagon/htp/CMakeLists.txt
ggml/src/ggml-hexagon/htp/matmul-ops.c