git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
author Max Krasnyansky <redacted>
Thu, 15 Jan 2026 05:46:12 +0000 (21:46 -0800)
committer GitHub <redacted>
Thu, 15 Jan 2026 05:46:12 +0000 (21:46 -0800)
commit cff777f22614e3129203ddc93e78b5576c936b0c
tree fddeb22b7eefef033f59a064c075ab761023c499
parent 36f0132464096e49ed344cdeeee65e39e2b43b14
hexagon: support for OP_CPY, host buffers now optional, hvx-utils refactoring and optimizations (#18822)

* hexagon: disable repack buffers if host buffers are disabled, improved handling of env vars

* hexagon: add support for OP_CPY fp16/fp32 -> fp16/fp32

Factored out all hvx_copy functions into the hvx-copy.h header and reduced code duplication.
Updated HTP ops infra to support OP_CPY.
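
For illustration only, the four fp16/fp32 copy combinations can be sketched in portable scalar C. This is not the HVX code from hvx-copy.h or cpy-ops.c: every `sk_` name is hypothetical, the conversion is done one element at a time instead of per vector, and fp16 results below the smallest normal value are flushed to zero to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical element-type tag; the real backend uses ggml tensor types. */
typedef enum { SK_F32, SK_F16 } sk_type;

/* IEEE 754 binary16 bit pattern -> binary32 value. */
static float sk_f16_to_f32(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = ((uint32_t)h >> 10) & 0x1Fu;
    uint32_t man  = (uint32_t)h & 0x3FFu;
    uint32_t bits;
    if (exp == 0x1Fu) {                       /* inf or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {                    /* normal: rebias 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {                    /* signed zero */
        bits = sign;
    } else {                                  /* subnormal: renormalize */
        exp = 113u;
        while ((man & 0x400u) == 0) { man <<= 1; exp--; }
        bits = sign | (exp << 23) | ((man & 0x3FFu) << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* binary32 value -> binary16 bit pattern, round to nearest even.
 * Simplification (assumption of this sketch): results that would be
 * fp16 subnormals are flushed to signed zero. */
static uint16_t sk_f32_to_f16(float f) {
    uint32_t x;
    memcpy(&x, &f, sizeof x);
    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t absx = x & 0x7FFFFFFFu;
    if (absx > 0x7F800000u)  { return (uint16_t)(sign | 0x7E00u); } /* NaN */
    if (absx >= 0x477FF000u) { return (uint16_t)(sign | 0x7C00u); } /* overflow -> inf */
    if (absx < 0x38800000u)  { return (uint16_t)sign; }             /* below 2^-14 */
    uint32_t rebased = absx - 0x38000000u;    /* rebias 127 -> 15 */
    uint32_t half    = rebased >> 13;
    uint32_t rem     = rebased & 0x1FFFu;     /* the 13 bits being dropped */
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) { half++; }
    return (uint16_t)(sign | half);
}

/* Copy n elements, converting when source and destination types differ. */
static void sk_cpy(void *dst, sk_type dt, const void *src, sk_type st, size_t n) {
    assert(dst != NULL && src != NULL);
    if (dt == st) {                           /* f32 -> f32 or f16 -> f16 */
        memcpy(dst, src, n * (dt == SK_F32 ? sizeof(float) : sizeof(uint16_t)));
    } else if (st == SK_F32) {                /* f32 -> f16 */
        const float *s = src;
        uint16_t    *d = dst;
        for (size_t i = 0; i < n; i++) { d[i] = sk_f32_to_f16(s[i]); }
    } else {                                  /* f16 -> f32 */
        const uint16_t *s = src;
        float          *d = dst;
        for (size_t i = 0; i < n; i++) { d[i] = sk_f16_to_f32(s[i]); }
    }
}
```

Calling `sk_cpy(dst, SK_F16, src, SK_F32, n)` narrows a row of floats, and swapping the two type tags widens it again.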

* hexagon: cleanup and refactor hex/hvx/htp headers and helper libs

hex is basically all scalar/core platform stuff (L2, DMA, basic utils)
hvx is all hvx related utils, helpers, etc
htp is higher level stuff like Ops, etc

hvx-utils library got a nice round of cleanup and refactoring to reduce duplication

use hvx_vec_store_a where possible

* hexagon: refactor HVX sigmoid functions to hvx-sigmoid.h

Moved sigmoid and tanh vector functions from hvx-utils.h to a new header
hvx-sigmoid.h. Implemented aligned and unaligned variants for sigmoid
array processing using a macro pattern similar to hvx-copy.h. Updated
act-ops.c to use the new aligned variant hvx_sigmoid_f32_aa. Removed
unused hvx-sigmoid.c.

* hexagon: factor out hvx-sqrt.h

* hexagon: minor update to hvx-utils.h

* hexagon: remove spurious log

* hexagon: factor out and optimize hvx_add/sub/mul

* hexagon: remove _opt variants of add/sub/mul as they are simply fully aligned versions

* hexagon: refactor reduction functions to hvx-reduce.h

Moved `hvx_self_max_f32` and `hvx_self_sum_f32` from `hvx-utils.h`/`.c` to `hvx-reduce.h`.
Renamed them to `hvx_reduce_max_f32` and `hvx_reduce_sum_f32`.
Added aligned (`_a`) and unaligned (`_u`) variants and used macros to unify logic.
Updated `softmax-ops.c` to use the new functions.
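
A rough scalar sketch of the "one macro, aligned and unaligned variants" idea described above. It is not the code in `hvx-reduce.h`: the `sk_` names are hypothetical, the loop is scalar, and the only difference between the `_a` and `_u` variants here is an alignment assertion, where real HVX code would presumably pick aligned or unaligned vector loads. The 128-byte figure assumes the HVX vector length.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>
#include <stdint.h>

#define SK_VLEN 128 /* assumed HVX vector size in bytes; only used for the check */

static inline int sk_is_aligned(const void *p) {
    return ((uintptr_t)p % SK_VLEN) == 0;
}

#define SK_SUM(a, b) ((a) + (b))
#define SK_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Shared loop body: `check` is the only thing that differs per variant. */
#define SK_DEFINE_REDUCE(name, init, op, check)                   \
    static inline float name(const float *src, size_t n) {        \
        check;                                                    \
        float acc = (init);                                       \
        for (size_t i = 0; i < n; i++) { acc = op(acc, src[i]); } \
        return acc;                                               \
    }

/* `_a` requires a vector-aligned source, `_u` accepts any pointer. */
SK_DEFINE_REDUCE(sk_reduce_sum_f32_a, 0.0f,     SK_SUM, assert(sk_is_aligned(src)))
SK_DEFINE_REDUCE(sk_reduce_sum_f32_u, 0.0f,     SK_SUM, (void)0)
SK_DEFINE_REDUCE(sk_reduce_max_f32_a, -FLT_MAX, SK_MAX, assert(sk_is_aligned(src)))
SK_DEFINE_REDUCE(sk_reduce_max_f32_u, -FLT_MAX, SK_MAX, (void)0)
```

A caller that knows its buffer is vector-aligned uses the `_a` variant; anything that may start mid-buffer falls back to `_u`.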

* hexagon: refactor the rest of arithmetic functions to hvx-arith.h

Moved `hvx_sum_of_squares_f32`, `hvx_min_scalar_f32`, and `hvx_clamp_scalar_f32` from `hvx-utils.c/h` to `hvx-arith.h`. Implemented aligned/unaligned variants (`_aa`, `_au`, etc.) and used macros to reduce code duplication. Updated `hvx_min_scalar_f32` and `hvx_clamp_scalar_f32` to use `dst, src, ..., n` argument order. Updated call sites in `act-ops.c`.

Refactor Hexagon HVX arithmetic functions (min, clamp) to hvx-arith.h

Moved `hvx_min_scalar_f32` and `hvx_clamp_scalar_f32` from `hvx-utils.c/h` to `hvx-arith.h`. Implemented aligned/unaligned variants (`_aa`, `_au`, etc.) and used macros to reduce code duplication. Updated these functions to use `dst, src, ..., n` argument order and updated call sites in `act-ops.c`. `hvx_sum_of_squares_f32` remains in `hvx-utils.c` as requested.
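
As an illustration of the `dst, src, ..., n` argument order mentioned above, here is a scalar sketch with hypothetical `sk_` names. The real functions operate on HVX vectors, come in alignment variants, and may use different pointer and length types; only the ordering (destination first, element count last) is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* dst[i] = min(src[i], val); destination first, element count last. */
static inline void sk_min_scalar_f32(float *dst, const float *src, float val, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] < val ? src[i] : val;
    }
}

/* dst[i] = src[i] limited to the range [lo, hi]; same argument ordering. */
static inline void sk_clamp_scalar_f32(float *dst, const float *src, float lo, float hi, size_t n) {
    assert(lo <= hi);
    for (size_t i = 0; i < n; i++) {
        float v = src[i];
        dst[i] = v < lo ? lo : (v > hi ? hi : v);
    }
}
```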

* hexagon: refactor hvx_sum_of_squares_f32

- Modify `hvx_sum_of_squares_f32` in `ggml/src/ggml-hexagon/htp/hvx-reduce.h` to use `dst, src` signature.
- Implement `_a` (aligned) and `_u` (unaligned) variants for `hvx_sum_of_squares_f32`.
- Update `hvx_reduce_loop_body` macro to support both returning and storing results via `finalize_op`.
- Update existing reduction functions in `hvx-reduce.h` to use the updated macro.
- Update `rms_norm_htp_f32` in `ggml/src/ggml-hexagon/htp/unary-ops.c` to match the new signature.
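
The `finalize_op` idea from the list above can be sketched as follows: one loop-body macro whose last argument decides whether the accumulated value is returned or stored through `dst`. This is a simplified scalar stand-in, not the actual `hvx_reduce_loop_body` macro; the macro parameters and all `sk_` names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Shared reduction body. It expects `src` and `n` in the enclosing scope,
 * evaluates `step` once per element (it may refer to `acc` and `i`), and
 * ends with whatever statement the caller passes as `finalize_op`. */
#define SK_REDUCE_LOOP_BODY(init, step, finalize_op)  \
    float acc = (init);                               \
    for (size_t i = 0; i < n; i++) { acc = (step); }  \
    finalize_op

/* Returning variant: finalize_op hands the result back to the caller. */
static inline float sk_reduce_sum_f32(const float *src, size_t n) {
    SK_REDUCE_LOOP_BODY(0.0f, acc + src[i], return acc;)
}

/* Storing variant with a `dst, src` signature: finalize_op writes to *dst. */
static inline void sk_sum_of_squares_f32(float *dst, const float *src, size_t n) {
    assert(dst != NULL);
    SK_REDUCE_LOOP_BODY(0.0f, acc + src[i] * src[i], *dst = acc;)
}
```

Both functions share the same loop text, so a change to the accumulation strategy only has to be made in the macro.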

* hexagon: use hvx_splat instead of memset

* hexagon: consistent use of f32/f16 in all function names to match the rest of GGML

* hexagon: fix hvx_copy_f16_f32 on v75 and older

* hexagon: update readme to include GGML_HEXAGON_EXPERIMENTAL

* scripts: update snapdragon/adb scripts to enable host param
48 files changed:
docs/backend/hexagon/README.md
ggml/src/ggml-hexagon/ggml-hexagon.cpp
ggml/src/ggml-hexagon/htp/CMakeLists.txt
ggml/src/ggml-hexagon/htp/act-ops.c
ggml/src/ggml-hexagon/htp/binary-ops.c
ggml/src/ggml-hexagon/htp/cpy-ops.c [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/flash-attn-ops.c
ggml/src/ggml-hexagon/htp/get-rows-ops.c
ggml/src/ggml-hexagon/htp/hex-dma.c [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hex-dma.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hex-dump.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hex-fastdiv.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hex-utils.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/htp-ctx.h
ggml/src/ggml-hexagon/htp/htp-dma.c [deleted file]
ggml/src/ggml-hexagon/htp/htp-dma.h [deleted file]
ggml/src/ggml-hexagon/htp/htp-msg.h
ggml/src/ggml-hexagon/htp/htp-ops.h
ggml/src/ggml-hexagon/htp/hvx-arith.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-base.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-copy.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-dump.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-exp.c [deleted file]
ggml/src/ggml-hexagon/htp/hvx-exp.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-floor.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-inverse.c [deleted file]
ggml/src/ggml-hexagon/htp/hvx-inverse.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-reduce.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-scale.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-sigmoid.c [deleted file]
ggml/src/ggml-hexagon/htp/hvx-sigmoid.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-sqrt.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-types.h [new file with mode: 0644]
ggml/src/ggml-hexagon/htp/hvx-utils.c [deleted file]
ggml/src/ggml-hexagon/htp/hvx-utils.h
ggml/src/ggml-hexagon/htp/main.c
ggml/src/ggml-hexagon/htp/matmul-ops.c
ggml/src/ggml-hexagon/htp/ops-utils.h [deleted file]
ggml/src/ggml-hexagon/htp/rope-ops.c
ggml/src/ggml-hexagon/htp/set-rows-ops.c
ggml/src/ggml-hexagon/htp/softmax-ops.c
ggml/src/ggml-hexagon/htp/unary-ops.c
ggml/src/ggml-hexagon/htp/worker-pool.c
scripts/snapdragon/adb/run-bench.sh
scripts/snapdragon/adb/run-cli.sh
scripts/snapdragon/adb/run-completion.sh
scripts/snapdragon/adb/run-mtmd.sh
scripts/snapdragon/adb/run-tool.sh