author	Jeff Bolz <redacted>
	Wed, 13 Nov 2024 06:58:57 +0000 (00:58 -0600)
committer	GitHub <redacted>
	Wed, 13 Nov 2024 06:58:57 +0000 (07:58 +0100)
commit	80dd7ff22fd050fed58b552cc8001aaf968b7ebf
tree	f134f6626bb07632baa88dfa292d45f5d6ca4aa8
parent	54ef9cfc726a799e6f454ac22c4815d037716eda

vulkan: Optimize contiguous copies (#10254)

* tests: Fix memory bandwidth calculation for perf tests

Add a flops calculation for flash attention.

Add one GGML_OP_CPY perf test.
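
As a rough illustration of what a memory bandwidth figure for a copy perf test measures, here is a minimal sketch. It assumes the figure counts both the bytes read from the source and the bytes written to the destination; the helper name and parameters are hypothetical and are not taken from tests/test-backend-ops.cpp.

```cpp
#include <cstddef>

// Hypothetical helper (not the function used in test-backend-ops.cpp).
// Assumption: a copy touches every source element once (read) and every
// destination element once (write), so both sizes count toward traffic.
double copy_bandwidth_gbps(std::size_t n_elements,
                           std::size_t src_elem_size,
                           std::size_t dst_elem_size,
                           std::size_t runs,
                           double seconds) {
    // Bytes moved by a single run of the op.
    const double bytes_per_run =
        static_cast<double>(n_elements) *
        static_cast<double>(src_elem_size + dst_elem_size);
    // Total traffic over all timed runs, divided by elapsed time, in GB/s.
    return bytes_per_run * static_cast<double>(runs) / seconds / 1e9;
}
```

For example, copying 1,000,000 f32 elements to f32 moves 8 MB per run under this assumption, so 100 runs in 0.5 seconds would be reported as 1.6 GB/s.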

* vulkan: Optimize contiguous copies

Add a variant of the copy shader for when the tensors are contiguous. Avoid
the complex addressing calculations, and do four elements per invocation
to hide some other overhead.
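
The idea can be sketched on the CPU as follows. This is an illustrative emulation in C++, not the contents of contig_copy.comp: the function names, the constant, and the tail handling are assumptions. Each emulated invocation derives a flat base index from its invocation id and copies four consecutive elements, with no per-dimension stride arithmetic.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical constant: elements handled by one emulated invocation.
constexpr std::size_t ELEMS_PER_INVOCATION = 4;

// Number of invocations needed to cover n elements (rounded up).
std::size_t num_invocations(std::size_t n) {
    return (n + ELEMS_PER_INVOCATION - 1) / ELEMS_PER_INVOCATION;
}

// One emulated invocation: because both tensors are contiguous, the flat
// index is enough and no multi-dimensional address calculation is needed.
void contig_copy_invocation(const float* src, float* dst,
                            std::size_t n, std::size_t invocation_id) {
    const std::size_t base = invocation_id * ELEMS_PER_INVOCATION;
    for (std::size_t i = 0; i < ELEMS_PER_INVOCATION; ++i) {
        const std::size_t idx = base + i;
        if (idx < n) {  // guard the tail when n is not a multiple of 4
            dst[idx] = src[idx];
        }
    }
}

// Emulated dispatch: run every invocation in sequence.
void contig_copy(const std::vector<float>& src, std::vector<float>& dst) {
    dst.resize(src.size());
    const std::size_t count = num_invocations(src.size());
    for (std::size_t id = 0; id < count; ++id) {
        contig_copy_invocation(src.data(), dst.data(), src.size(), id);
    }
}
```

With seven elements this emulation needs two invocations, the second of which copies only three elements because of the bounds check.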

Apply similar changes to the scale shader, since scale is always contiguous.

Add a "progress bar" for shader compiles.

ggml/src/ggml-vulkan.cpp
ggml/src/vulkan-shaders/clamp.comp
ggml/src/vulkan-shaders/contig_copy.comp	[new file with mode: 0644]
ggml/src/vulkan-shaders/copy.comp
ggml/src/vulkan-shaders/cos.comp
ggml/src/vulkan-shaders/generic_unary_head.comp
ggml/src/vulkan-shaders/pad.comp
ggml/src/vulkan-shaders/repeat.comp
ggml/src/vulkan-shaders/scale.comp
ggml/src/vulkan-shaders/sin.comp
ggml/src/vulkan-shaders/square.comp
ggml/src/vulkan-shaders/vulkan-shaders-gen.cpp
tests/test-backend-ops.cpp