git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CANN: Add ROPE sin/cos cache for reuse (llama/15912)
author Chenguang Li <redacted>
Wed, 10 Sep 2025 10:42:00 +0000 (18:42 +0800)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:42:53 +0000 (13:42 +0300)
commit 4d453b14a99601c5b06b3ec4777c0015c845a602
tree a44b067db25dd799496d18bfb3502fce26830179
parent 9b773acac0c2597e8fe8318fb16f5f3d5bb5554a

* CANN: Add ROPE sin/cos cache for reuse

Introduce a sin/cos caching mechanism in ROPE to avoid redundant
computation across layers. The cache is built on the first layer
of each device and reused by subsequent layers if the parameters match.

- Added sin_cache / cos_cache pointers and position_length tracking
- Introduced cache validity flags and properties:
  (ext_factor, theta_scale, freq_scale, attn_factor, is_neox)
- Accelerates ROPE by eliminating repeated sin/cos generation

This change reduces overhead in multi-layer scenarios while
preserving correctness by verifying parameter consistency.
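
The reuse check described above can be sketched roughly as follows. This is
a minimal host-side illustration, not the code from this commit: the names
RopeParams, RopeCache, ensure, half_dim and rebuilds are hypothetical, the
tables live in std::vector rather than device buffers, positions are assumed
to be 0..n_pos-1, and ext_factor and is_neox are only compared, not applied
to the generated values.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical parameter set mirroring the properties listed in the message.
struct RopeParams {
    float ext_factor;
    float theta_scale;
    float freq_scale;
    float attn_factor;
    bool  is_neox;

    bool operator==(const RopeParams & o) const {
        return ext_factor == o.ext_factor && theta_scale == o.theta_scale &&
               freq_scale == o.freq_scale && attn_factor == o.attn_factor &&
               is_neox == o.is_neox;
    }
};

// Hypothetical per-device cache (assumption: one instance per device).
struct RopeCache {
    std::vector<float> sin_cache;
    std::vector<float> cos_cache;
    int64_t    position_length = 0;
    int64_t    half_dim        = 0;
    RopeParams params{};
    bool       valid    = false;
    int        rebuilds = 0;  // how many times the tables were regenerated

    // Returns true if the tables were rebuilt, false if they were reused.
    bool ensure(int64_t n_pos, int64_t n_half, const RopeParams & p) {
        if (valid && n_pos == position_length && n_half == half_dim && p == params) {
            return false;  // same shape and parameters: later layers reuse the tables
        }
        const size_t n = static_cast<size_t>(n_pos * n_half);
        sin_cache.assign(n, 0.0f);
        cos_cache.assign(n, 0.0f);
        for (int64_t pos = 0; pos < n_pos; ++pos) {
            // Simplified angle schedule: theta_i = freq_scale * pos * theta_scale^i.
            float theta = p.freq_scale * static_cast<float>(pos);
            for (int64_t i = 0; i < n_half; ++i) {
                const size_t idx = static_cast<size_t>(pos * n_half + i);
                sin_cache[idx] = std::sin(theta) * p.attn_factor;
                cos_cache[idx] = std::cos(theta) * p.attn_factor;
                theta *= p.theta_scale;
            }
        }
        position_length = n_pos;
        half_dim        = n_half;
        params          = p;
        valid           = true;
        ++rebuilds;
        return true;
    }
};
```

In this sketch the first call to ensure() for a given shape and parameter set
builds the tables, and later calls with identical arguments return without
recomputing. Any parameter mismatch triggers a rebuild, which is how
correctness is preserved.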

Co-authored-by: hipudding <redacted>
* fix typo

Signed-off-by: noemotiovon <redacted>
---------

Signed-off-by: noemotiovon <redacted>
Co-authored-by: hipudding <redacted>
ggml/src/ggml-cann/aclnn_ops.cpp
ggml/src/ggml-cann/common.h
ggml/src/ggml-cann/ggml-cann.cpp