git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CANN: Add ROPE sin/cos cache for reuse (#15912)
author Chenguang Li <redacted>
Wed, 10 Sep 2025 10:42:00 +0000 (18:42 +0800)
committer GitHub <redacted>
Wed, 10 Sep 2025 10:42:00 +0000 (18:42 +0800)
commit 10d8b2b6b0ac2cae252a80b4daea5da55ab63c2f
tree 460190db6fadf17e81f76e6d4711353f8a67cc1b
parent 28b5f190ef1dbea5edf82dbc8b4407b721fadd13
CANN: Add ROPE sin/cos cache for reuse (#15912)

* CANN: Add ROPE sin/cos cache for reuse

Introduce a sin/cos caching mechanism in ROPE to avoid redundant
computation across layers. The cache is built on the first layer
per device and reused by subsequent layers if the parameters match.

- Add sin_cache / cos_cache pointers and position_length tracking
- Introduce cache validity flags and the cached properties
  (ext_factor, theta_scale, freq_scale, attn_factor, is_neox)
- Accelerate ROPE by eliminating repeated sin/cos generation

This change reduces overhead in multi-layer scenarios while
preserving correctness by verifying parameter consistency.
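
The caching idea described above can be sketched roughly as follows. This is a
host-side illustration only, not the code from this commit: the names
RopeParams, RopeCacheSketch, cache_matches, ensure_rope_cache and the rebuilds
counter are hypothetical, std::vector stands in for device buffers, positions
are assumed to be 0..n_pos-1, and ext_factor and is_neox are only compared,
not used in the simplified sin/cos formula.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical parameter bundle; field names follow the commit message.
struct RopeParams {
    float ext_factor;
    float theta_scale;
    float freq_scale;
    float attn_factor;
    bool  is_neox;
};

// Hypothetical host-side stand-in for the per-device cache.
struct RopeCacheSketch {
    std::vector<float> sin_cache;
    std::vector<float> cos_cache;
    int64_t    position_length = 0;
    bool       cached          = false;  // validity flag
    RopeParams params          = {};     // parameters the tables were built with
    int        rebuilds        = 0;      // illustration only: counts table builds
};

// True when the cached tables were built with exactly these inputs.
static bool cache_matches(const RopeCacheSketch & c, const RopeParams & p,
                          int64_t n_pos, int64_t half_dim) {
    return c.cached &&
           c.position_length == n_pos &&
           c.sin_cache.size() == static_cast<size_t>(n_pos * half_dim) &&
           c.params.ext_factor  == p.ext_factor &&
           c.params.theta_scale == p.theta_scale &&
           c.params.freq_scale  == p.freq_scale &&
           c.params.attn_factor == p.attn_factor &&
           c.params.is_neox     == p.is_neox;
}

// Build the sin/cos tables once; later calls with matching inputs reuse them.
void ensure_rope_cache(RopeCacheSketch & c, const RopeParams & p,
                       int64_t n_pos, int64_t half_dim) {
    if (cache_matches(c, p, n_pos, half_dim)) {
        return;  // subsequent layers take this path
    }
    const size_t n = static_cast<size_t>(n_pos * half_dim);
    c.sin_cache.assign(n, 0.0f);
    c.cos_cache.assign(n, 0.0f);
    for (int64_t pos = 0; pos < n_pos; ++pos) {
        // Simplified angle schedule: theta_i = pos * freq_scale * theta_scale^i.
        float theta = static_cast<float>(pos) * p.freq_scale;
        for (int64_t i = 0; i < half_dim; ++i) {
            const size_t idx = static_cast<size_t>(pos * half_dim + i);
            c.sin_cache[idx] = std::sin(theta) * p.attn_factor;
            c.cos_cache[idx] = std::cos(theta) * p.attn_factor;
            theta *= p.theta_scale;
        }
    }
    c.position_length = n_pos;
    c.params          = p;
    c.cached          = true;
    c.rebuilds++;
}
```

In this sketch, calling ensure_rope_cache once per layer with identical
arguments builds the tables only on the first call; changing any compared
parameter invalidates the cache and triggers a rebuild.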

Co-authored-by: hipudding <redacted>
* fix typo

Signed-off-by: noemotiovon <redacted>
---------

Signed-off-by: noemotiovon <redacted>
Co-authored-by: hipudding <redacted>
ggml/src/ggml-cann/aclnn_ops.cpp
ggml/src/ggml-cann/common.h
ggml/src/ggml-cann/ggml-cann.cpp