git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
Avoid unnecessarily disabling CUDA graphs (llama/7302)
author agray3 <redacted>
Wed, 15 May 2024 13:44:49 +0000 (14:44 +0100)
committer Georgi Gerganov <redacted>
Sun, 16 Jun 2024 15:19:48 +0000 (18:19 +0300)
commit 8d55ccdb8cafe5a5e9b5b8ed5b6ce0e9a6a642af
tree a0f9f5fb0ea4fc07932b57939fd87111c0743e4d
parent 37a72cb1703edf9bdc97d3873f7fec72f542edc0

As discussed in PR #6766, CUDA graphs were being disabled in the presence of long prompts.
This fixes the issue by preventing the consecutive update counter from incrementing unnecessarily
for tokens in which CUDA graphs are disabled due to batch size > 1.
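
The counter behaviour described above can be illustrated with a minimal sketch. This is not the actual change to ggml-cuda.cu: the struct, the function name `step`, and the threshold value are hypothetical and chosen only for illustration. The assumption is that the counter should advance only when a graph is actually in use for the current token and needs an update, and should reset otherwise.

```cpp
// Hypothetical, simplified model of the counter logic; the names and the
// threshold below are assumptions, not the real ggml-cuda.cu definitions.
struct GraphState {
    int  consecutive_updates  = 0;     // graph updates seen back to back
    bool disabled_permanently = false; // set once too many updates occur
};

constexpr int kMaxConsecutiveUpdates = 4; // assumed threshold

// Models one evaluation. Returns true if a CUDA graph is used for it.
bool step(GraphState &s, int batch_size, bool update_required) {
    if (s.disabled_permanently) {
        return false;
    }

    // Graphs are skipped for batch size > 1 (e.g. prompt processing).
    const bool use_graph = (batch_size <= 1);

    // Count an update only when a graph is actually in use. Without the
    // use_graph check, every prompt batch would advance the counter and a
    // long prompt could disable graphs before generation starts.
    if (use_graph && update_required) {
        s.consecutive_updates++;
    } else {
        s.consecutive_updates = 0;
    }

    // Takes effect from the next evaluation onwards.
    if (s.consecutive_updates >= kMaxConsecutiveUpdates) {
        s.disabled_permanently = true;
    }
    return use_graph;
}
```

In this model, calling `step(s, 512, true)` any number of times leaves `s.disabled_permanently` false, so single-token calls made afterwards can still use the graph. Only repeated updates while a graph is in use reach the threshold.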
ggml-cuda.cu