git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CANN: Add ROPE sin/cos cache for reuse (llama/15912)
author Chenguang Li <redacted>
Wed, 10 Sep 2025 10:42:00 +0000 (18:42 +0800)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:33:50 +0000 (13:33 +0300)
commit 00323d2072521ad4d289336d4a5145bdd6860bf1
tree 21dad48d31de0268d0e257648a2ce7ff06d0617d
parent 711433a7e6607b12c0bbff6d108a63e963bb24fc

* CANN: Add ROPE sin/cos cache for reuse

Introduce a sin/cos caching mechanism in ROPE to avoid redundant
computation across layers. The cache is built on the first layer
per device and reused by subsequent layers if the parameters match.

- Added sin_cache / cos_cache pointers and position_length tracking
- Introduced cache validity flags and properties:
  (ext_factor, theta_scale, freq_scale, attn_factor, is_neox)
- Accelerates ROPE by eliminating repeated sin/cos generation

This change reduces overhead in multi-layer scenarios while
preserving correctness by verifying parameter consistency.
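
The reuse logic described above can be sketched roughly as follows. This is
a host-side illustration only, not the code from this commit: the struct
names RopeParams and RopeCache, the ensure() method, the builds counter and
the is_first_layer argument are all hypothetical, the real backend presumably
holds device buffers rather than std::vector, and the ext_factor (YaRN)
correction is left out of the table math and only used as part of the
cache key.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical parameter bundle; field names follow the commit message.
struct RopeParams {
    float ext_factor;
    float theta_scale;
    float freq_scale;
    float attn_factor;
    bool  is_neox;

    bool operator==(const RopeParams & o) const {
        return ext_factor == o.ext_factor && theta_scale == o.theta_scale &&
               freq_scale == o.freq_scale && attn_factor == o.attn_factor &&
               is_neox == o.is_neox;
    }
};

// Hypothetical per-device cache (host memory here, for illustration).
struct RopeCache {
    std::vector<float> sin_cache;
    std::vector<float> cos_cache;
    int64_t    position_length = 0;
    bool       cached          = false;
    RopeParams params{};
    int        builds          = 0;  // number of rebuilds, illustration only

    // Returns true if the tables were rebuilt, false if they were reused.
    bool ensure(const RopeParams & p, const std::vector<int32_t> & pos,
                int n_dims, bool is_first_layer) {
        assert(n_dims > 0 && n_dims % 2 == 0);
        const int64_t len = static_cast<int64_t>(pos.size());

        // Later layers reuse the tables only if every property matches.
        if (!is_first_layer && cached && params == p && position_length == len) {
            return false;
        }

        const int64_t half = n_dims / 2;
        sin_cache.assign(static_cast<size_t>(len * half), 0.0f);
        cos_cache.assign(static_cast<size_t>(len * half), 0.0f);
        for (int64_t t = 0; t < len; ++t) {
            // Simplified angle: ext_factor (YaRN) is deliberately ignored.
            float theta = static_cast<float>(pos[t]) * p.freq_scale;
            for (int64_t i = 0; i < half; ++i) {
                sin_cache[t * half + i] = std::sin(theta) * p.attn_factor;
                cos_cache[t * half + i] = std::cos(theta) * p.attn_factor;
                theta *= p.theta_scale;
            }
        }

        params          = p;
        position_length = len;
        cached          = true;
        ++builds;
        return true;
    }
};
```

In this sketch the first layer always rebuilds the tables, and any later
layer whose parameters or position length differ also falls back to a
rebuild, which is one way to get the parameter-consistency check that the
message mentions.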

Co-authored-by: hipudding <redacted>
* fix typo

Signed-off-by: noemotiovon <redacted>
---------

Signed-off-by: noemotiovon <redacted>
Co-authored-by: hipudding <redacted>
src/ggml-cann/aclnn_ops.cpp
src/ggml-cann/common.h
src/ggml-cann/ggml-cann.cpp