git.djapps.eu Git - pkg/ggml/sources/ggml/commit
ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed...
author Alfred <redacted>
Fri, 19 Dec 2025 17:42:28 +0000 (12:42 -0500)
committer Georgi Gerganov <redacted>
Wed, 31 Dec 2025 10:39:43 +0000 (12:39 +0200)
commit 1c915c86aa2124a8f042a49e8aa8c4a8ceb48a4a
tree 3b0161463ed3ac7a041e1553d0bfdfa8fe457a5d
parent 8faf01f18dd38daec7750613f95f82d574470133
ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed-precision matmul operations (llama/17977)

* feat: implement real Q8_0

* feat: adding cmake option for configuring FP32 quantize group size

* typo: set() shall be used

---------

Co-authored-by: ngdxzy <redacted>

CMakeLists.txt
src/ggml-hexagon/CMakeLists.txt
src/ggml-hexagon/htp/CMakeLists.txt
src/ggml-hexagon/htp/matmul-ops.c