git.djapps.eu Git - pkg/ggml/sources/ggml/commit
hexagon: add neg, exp, sigmoid, softplus ops, cont, repeat ops (llama/20701)
author Krishna Sridhar <redacted>
Tue, 17 Mar 2026 22:34:36 +0000 (15:34 -0700)
committer Georgi Gerganov <redacted>
Sat, 28 Mar 2026 11:39:09 +0000 (13:39 +0200)
commit 8864b2f63b29609017a8d404c2bbab2c240222a7
tree 45ee2393226ef43a7535563cf6fe745c0f93a4b5
parent dcd1429d83d317dd65dc38b5e4686eafbca57291

Add element-wise unary ops needed by Qwen 3.5's DeltaNet linear
attention layers. These ops follow the existing unary-ops pattern
with VTCM DMA double-buffering.

- neg: negates by scaling by -1.0
- exp: uses the existing hvx_exp_f32 HVX intrinsics
- sigmoid: uses the existing hvx_sigmoid_f32_aa HVX intrinsics
- softplus: computes log(1 + exp(x)) via a scalar fallback
- cont: reuses the existing CPY infrastructure, since making a tensor
  contiguous is equivalent to a same-type copy
- repeat: implements a tiled memory copy with multi-threaded execution
  via the worker pool, supporting f32 and f16 types. The kernel
  parallelizes across output rows and uses memcpy for each tile.

Co-authored-by: Max Krasnyansky <redacted>
src/ggml-hexagon/ggml-hexagon.cpp
src/ggml-hexagon/htp/CMakeLists.txt
src/ggml-hexagon/htp/htp-msg.h
src/ggml-hexagon/htp/htp-ops.h
src/ggml-hexagon/htp/hvx-base.h
src/ggml-hexagon/htp/hvx-exp.h
src/ggml-hexagon/htp/hvx-sigmoid.h
src/ggml-hexagon/htp/main.c
src/ggml-hexagon/htp/repeat-ops.c [new file with mode: 0644]
src/ggml-hexagon/htp/softmax-ops.c
src/ggml-hexagon/htp/unary-ops.c