git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Jeff Bolz <redacted>
	Fri, 29 Nov 2024 06:18:02 +0000 (00:18 -0600)
committer	GitHub <redacted>
	Fri, 29 Nov 2024 06:18:02 +0000 (07:18 +0100)
commit	f095a649ec390e04dfab1b04e646ae8549dafaef
tree	424c3d46d6844c6d173d3bbd454767d7b61f8514
parent	678d7994f4da0af3d29046be99950ac999ee9762

vulkan: get the first command buffer submitted sooner (#10499)

This is an incremental improvement over #9118 to get work to the GPU a bit
sooner. The first part is to start with a smaller number of nodes before
the first submit, and ramp it up to the current 100 nodes/submit. The
second part is to reduce the dryrun overhead for all the nodes that just
need to request descriptor space.
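
The ramp-up described above can be sketched as follows. This is a minimal
illustration, not the code from ggml-vulkan.cpp: the helper name
plan_submit_batches, its parameters, and the ramp sizes used in the usage note
are all hypothetical. Only the steady-state value of 100 nodes/submit comes
from the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ggml): split `total_nodes` graph nodes
// into command-buffer submits. The first submits use the small sizes listed
// in `ramp` so the GPU gets work early; later submits use `steady`.
std::vector<std::size_t> plan_submit_batches(std::size_t total_nodes,
                                             const std::vector<std::size_t>& ramp,
                                             std::size_t steady) {
    assert(steady > 0);  // a zero-sized steady batch would never make progress

    std::vector<std::size_t> batches;
    std::size_t remaining = total_nodes;
    std::size_t submit_count = 0;

    while (remaining > 0) {
        // Pick the limit for this submit: ramp entry first, then steady state.
        std::size_t limit = submit_count < ramp.size() ? ramp[submit_count] : steady;
        if (limit == 0) {
            limit = steady;  // ignore a zero ramp entry instead of looping forever
        }
        // The last batch may be smaller than the limit.
        const std::size_t n = remaining < limit ? remaining : limit;
        batches.push_back(n);
        remaining -= n;
        ++submit_count;
    }
    return batches;
}
```

With an assumed ramp of {20, 50} and a steady size of 100, a 300-node graph
would be submitted as batches of 20, 50, 100, 100 and 30 nodes, so the first
command buffer reaches the GPU after 20 nodes instead of 100.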

With these changes I get around a 1-2% speedup on an RTX 4070 combined with
my old Haswell-era CPU.

ggml/src/ggml-vulkan/ggml-vulkan.cpp
