git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CANN: Add support for async operator submission (llama/12864)
author hipudding <redacted>
Thu, 17 Apr 2025 12:34:16 +0000 (20:34 +0800)
committer Georgi Gerganov <redacted>
Thu, 24 Apr 2025 15:36:25 +0000 (18:36 +0300)
commit 3269ecd2a5bd4a9d0fbb0792b9bfe7eff595b652
tree dae25e874b6cd48c609f6f9c0e721e2ebc6c6627
parent 2f3bf4a5c1aff154223e8f14089ade455319b4b0

Submit operators using asynchronous threads to improve performance.
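
As a rough illustration of the idea, the sketch below queues work items and
runs them in order on a single background thread, so the caller can keep
preparing the next operator while earlier ones are being launched. It is not
the code from this commit: the class name AsyncSubmitter and its methods
submit() and wait() are hypothetical, and the choice of a single worker with a
FIFO queue is an assumption made to keep the example small.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-worker task queue (illustrative only, not the CANN backend API).
class AsyncSubmitter {
public:
    AsyncSubmitter() : worker_([this] { run(); }) {}

    // Drains any remaining tasks, then stops and joins the worker thread.
    ~AsyncSubmitter() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_all();
        worker_.join();
    }

    // Enqueue a task and return immediately; tasks run in submission order.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++pending_;
        }
        cv_.notify_all();
    }

    // Block until every task submitted so far has finished.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_cv_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                if (tasks_.empty()) {
                    return;  // stop was requested and the queue is drained
                }
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so submit() is never blocked by a task
            {
                std::lock_guard<std::mutex> lock(mutex_);
                --pending_;
            }
            idle_cv_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;       // signals new work or shutdown
    std::condition_variable idle_cv_;  // signals that a task has completed
    std::queue<std::function<void()>> tasks_;
    std::size_t pending_ = 0;
    bool stop_ = false;
    std::thread worker_;  // declared last so all other members exist before it starts
};
```

A caller would submit() each operator launch as a closure and call wait()
before reading results back. Keeping a single worker preserves submission
order, which matters when later operators depend on earlier ones.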

Use the environment variable GGML_CANN_ASYNC_MODE to control whether
asynchronous submission is enabled. It is disabled by default.
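
A minimal sketch of how such a switch could be read at startup is shown below.
Only the variable name GGML_CANN_ASYNC_MODE comes from this commit message;
the helper names parse_async_mode and async_mode_enabled are hypothetical, and
the set of accepted values ("1", "on", "yes", "enabled", case-insensitive) is
an assumption for illustration. An unset variable leaves the feature off,
matching the stated default.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: interpret a raw environment value as an on/off switch.
// The accepted spellings below are an assumption, not taken from the commit.
bool parse_async_mode(const char * raw) {
    if (raw == nullptr) {
        return false;  // variable unset: async submission stays disabled
    }
    std::string value(raw);
    // Lower-case the value so that "ON", "On" and "on" are treated alike.
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value == "1" || value == "on" || value == "yes" || value == "enabled";
}

// Hypothetical wrapper that reads the variable named in the commit message.
bool async_mode_enabled() {
    return parse_async_mode(std::getenv("GGML_CANN_ASYNC_MODE"));
}
```

With a parser like this, running a program with GGML_CANN_ASYNC_MODE=1 in its
environment would turn the feature on, and any unrecognized value would leave
it off.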

Testing shows a 10%–20% performance improvement in scenarios with
small parameter sizes, especially in quantized models.
src/ggml-cann/aclnn_ops.cpp
src/ggml-cann/aclnn_ops.h
src/ggml-cann/common.h
src/ggml-cann/ggml-cann.cpp