Avoid unnecessarily disabling CUDA graphs (llama/7302)
author    agray3 <redacted>
Wed, 15 May 2024 13:44:49 +0000 (14:44 +0100)
committer Georgi Gerganov <redacted>
Tue, 28 May 2024 11:41:08 +0000 (14:41 +0300)
commit ed040d5b1daf9b1a3fd06529e70e6ab810bd7fbf
tree   6f9f21a2dd371422e3204819be330cd2bbee8ac5
parent 207773272ede4676956143475a9b80f2fbe2eafb

As discussed in PR #6766, CUDA graphs were being disabled in the presence of long prompts.
This change fixes the issue by preventing the consecutive-update counter from incrementing
unnecessarily for tokens in which CUDA graphs are disabled due to batch size > 1.
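
Below is a minimal sketch of the counter logic the commit message describes, not the actual code in src/ggml-cuda.cu; the field and constant names (cuda_graph_state, number_consecutive_updates, disable_due_to_too_many_updates, MAX_CONSECUTIVE_UPDATES) are illustrative assumptions. The point is that graphs are permanently disabled once too many consecutive graph updates occur, so batched tokens that skip graphs anyway must not advance that counter.

```cpp
// Illustrative sketch (assumed names, not the real ggml-cuda.cu identifiers):
// a CUDA-graph context tracks consecutive graph updates and gives up on graphs
// once a threshold is exceeded. The fix: skip the counter entirely for tokens
// where graphs are not used because batch size > 1.
#include <cstdio>

struct cuda_graph_state {
    int  number_consecutive_updates      = 0;      // hypothetical field
    bool disable_due_to_too_many_updates = false;  // hypothetical field
};

constexpr int MAX_CONSECUTIVE_UPDATES = 4;         // illustrative threshold

// Called once per token/graph evaluation.
void track_graph_updates(cuda_graph_state & g, bool use_cuda_graph, bool graph_update_required) {
    if (!use_cuda_graph) {
        // Batch size > 1 (e.g. prompt processing): graphs are skipped for this
        // token, so leave the consecutive-update counter untouched.
        return;
    }
    if (graph_update_required) {
        if (++g.number_consecutive_updates >= MAX_CONSECUTIVE_UPDATES) {
            g.disable_due_to_too_many_updates = true;  // stop using graphs
        }
    } else {
        g.number_consecutive_updates = 0;  // a reused graph resets the streak
    }
}

int main() {
    cuda_graph_state g;
    // A long prompt: many tokens processed with batch size > 1 (no graphs).
    for (int i = 0; i < 100; ++i) {
        track_graph_updates(g, /*use_cuda_graph=*/false, /*graph_update_required=*/true);
    }
    // Subsequent single-token generation can still benefit from CUDA graphs.
    std::printf("graphs disabled: %s\n", g.disable_due_to_too_many_updates ? "yes" : "no");
    return 0;
}
```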
src/ggml-cuda.cu