git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CANN: implement LRU cache for ACL graphs (llama/15814)
author Chenguang Li <redacted>
Wed, 10 Sep 2025 07:29:12 +0000 (15:29 +0800)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:42:53 +0000 (13:42 +0300)
commit 9b773acac0c2597e8fe8318fb16f5f3d5bb5554a
tree 2c010e57e510d435e38c9c59af07b19395d76666
parent 7abe187860c0e23ae426bcf301099d945771cbe2

* CANN: implement LRU cache for ACL graphs in CANN backend

- Introduce ggml_cann_graph_lru_cache to store multiple ggml_cann_graph objects.
- Load graphs on demand and evict them with an LRU policy when capacity is exceeded.
- Update the push, move_to_front, and clear methods to manage cached graphs efficiently.
- Reuse cached graphs to reduce graph reconstruction overhead in the CANN backend.
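
The caching policy described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `example_graph`, `example_graph_lru_cache`, the string `key`, and the `find` helper are hypothetical names chosen for the sketch, and the real ggml_cann_graph holds ACL-specific state that is not shown on this page. Only the method names push, move_to_front, and clear come from the commit message.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a captured graph (assumption, not the real type).
struct example_graph {
    std::string key;  // illustrative identity used to match a cached graph
};

// LRU cache sketch: the most recently used graph sits at the front of the list.
struct example_graph_lru_cache {
    using graph_list = std::list<std::unique_ptr<example_graph>>;

    size_t     capacity;
    graph_list cache_list;

    explicit example_graph_lru_cache(size_t cap) : capacity(cap) {}

    // Insert a new graph at the front, evicting least recently used
    // entries from the back so the size never exceeds the capacity.
    void push(std::unique_ptr<example_graph> graph) {
        if (capacity == 0) {
            return;  // caching disabled in this sketch
        }
        while (cache_list.size() >= capacity) {
            cache_list.pop_back();
        }
        cache_list.push_front(std::move(graph));
    }

    // Mark an existing entry as most recently used without copying it.
    void move_to_front(graph_list::iterator it) {
        cache_list.splice(cache_list.begin(), cache_list, it);
    }

    // Look up a graph by key; a hit is promoted to the front.
    example_graph * find(const std::string & key) {
        for (auto it = cache_list.begin(); it != cache_list.end(); ++it) {
            if ((*it)->key == key) {
                move_to_front(it);
                return cache_list.front().get();
            }
        }
        return nullptr;
    }

    // Drop every cached graph.
    void clear() { cache_list.clear(); }
};
```

With a capacity of 2, pushing graphs "a" and "b", looking up "a", and then pushing "c" evicts "b", since "a" was touched more recently. A linked list keeps promotion and eviction O(1) once an entry is located; the linear lookup is acceptable only because such a cache is expected to stay small.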

* fix typo

* The LRU cache capacity can be configured via an env variable
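
One way such a setting could be read is sketched below. The commit page does not name the variable or its default, so the function name `example_capacity_from_env`, the variable name passed to it, and the fallback value are all placeholders for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <exception>
#include <string>

// Read a cache capacity from the environment variable `name`.
// Returns `fallback` when the variable is unset, empty, or not a
// plain non-negative integer. (Hypothetical helper, not from the commit.)
size_t example_capacity_from_env(const char * name, size_t fallback) {
    const char * value = std::getenv(name);
    if (value == nullptr || *value == '\0') {
        return fallback;
    }
    // Reject signs and leading whitespace, which std::stoull would accept.
    if (!std::isdigit(static_cast<unsigned char>(value[0]))) {
        return fallback;
    }
    try {
        const std::string text(value);
        size_t pos = 0;
        const unsigned long long parsed = std::stoull(text, &pos);
        if (pos != text.size()) {
            return fallback;  // trailing garbage such as "8abc"
        }
        return static_cast<size_t>(parsed);
    } catch (const std::exception &) {
        return fallback;  // out-of-range value
    }
}
```

A caller might use `example_capacity_from_env("EXAMPLE_GRAPH_CACHE_CAPACITY", 12)` once at backend initialization and pass the result to the cache constructor; both the variable name and the value 12 are assumptions made for this example.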

Signed-off-by: noemotiovon <redacted>

* refactor ACL graph

* refactor and address review comments

Signed-off-by: noemotiovon <redacted>
---------

Signed-off-by: noemotiovon <redacted>
ggml/src/ggml-cann/common.h
ggml/src/ggml-cann/ggml-cann.cpp