git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: Optimize contiguous copies (llama/10254)
author Jeff Bolz <redacted>
Wed, 13 Nov 2024 06:58:57 +0000 (00:58 -0600)
committer Georgi Gerganov <redacted>
Wed, 13 Nov 2024 17:03:32 +0000 (19:03 +0200)
commit 688752ec02743d60309a760f21550607c34e3baf
tree 52ed55fe36b1b7c4f6e3411d34e7d7bbb4d31247
parent 67a320b69efde90383531018fca7e8c7562b3e59

* tests: Fix memory bandwidth calculation for perf tests

Add a flops calculation for flash attention.

Add one GGML_OP_CPY perf test.
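The commit message does not say what was wrong with the bandwidth calculation, so the following is only a generic sketch of how effective memory bandwidth for a copy-like op could be computed: bytes read plus bytes written, divided by elapsed time. The helper names `copy_bytes_moved` and `bandwidth_gb_per_s` are hypothetical and are not taken from tests/test-backend-ops.cpp; GB is assumed to mean 1e9 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the ggml sources): total memory traffic of a
// copy-like op, counting both the source read and the destination write.
inline double copy_bytes_moved(std::size_t n_elements,
                               std::size_t src_elem_size,
                               std::size_t dst_elem_size) {
    return static_cast<double>(n_elements) *
           static_cast<double>(src_elem_size + dst_elem_size);
}

// Hypothetical helper: effective bandwidth in GB/s, assuming 1 GB = 1e9 bytes.
inline double bandwidth_gb_per_s(double bytes_moved, double seconds) {
    assert(seconds > 0.0);  // a zero or negative duration is a measurement bug
    return bytes_moved / seconds / 1e9;
}
```

For an F32 to F16 copy of 1000 elements, for example, this counts 4 bytes read and 2 bytes written per element. Counting only one side of the transfer would understate the traffic of such an op.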

* vulkan: Optimize contiguous copies

Add a variant of the copy shader for when the tensors are contiguous. Avoid
the complex addressing calculations, and do four elements per invocation
to hide some other overhead.
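The difference between the two paths can be illustrated on the CPU. This is a minimal sketch, not the shader code from this commit: `TensorView`, `copy_strided`, and `copy_contiguous` are hypothetical names, strides are assumed to be in elements, and each loop iteration stands in for one shader invocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 4-D tensor descriptor: extents (ne) and strides in elements (nb).
struct TensorView {
    uint32_t ne[4];
    uint32_t nb[4];
};

// Build a densely packed view for the given extents.
inline TensorView make_contiguous_view(uint32_t ne0, uint32_t ne1,
                                       uint32_t ne2, uint32_t ne3) {
    return TensorView{{ne0, ne1, ne2, ne3},
                      {1, ne0, ne0 * ne1, ne0 * ne1 * ne2}};
}

// General path: each element splits its flat index into 4-D coordinates and
// re-linearizes them with the source and destination strides.
// Assumes src and dst views have the same extents.
inline void copy_strided(const std::vector<float>& src, const TensorView& s,
                         std::vector<float>& dst, const TensorView& d) {
    const uint32_t total = s.ne[0] * s.ne[1] * s.ne[2] * s.ne[3];
    for (uint32_t idx = 0; idx < total; ++idx) {
        const uint32_t i0 = idx % s.ne[0];
        const uint32_t i1 = (idx / s.ne[0]) % s.ne[1];
        const uint32_t i2 = (idx / (s.ne[0] * s.ne[1])) % s.ne[2];
        const uint32_t i3 = idx / (s.ne[0] * s.ne[1] * s.ne[2]);
        const uint32_t so = i0 * s.nb[0] + i1 * s.nb[1] + i2 * s.nb[2] + i3 * s.nb[3];
        const uint32_t dO = i0 * d.nb[0] + i1 * d.nb[1] + i2 * d.nb[2] + i3 * d.nb[3];
        dst[dO] = src[so];
    }
}

// Contiguous path: no coordinate math, and each simulated invocation handles
// four consecutive elements, with a bounds check for the tail.
inline void copy_contiguous(const std::vector<float>& src,
                            std::vector<float>& dst, std::size_t n) {
    assert(src.size() >= n && dst.size() >= n);
    constexpr std::size_t kPerInvocation = 4;
    const std::size_t invocations = (n + kPerInvocation - 1) / kPerInvocation;
    for (std::size_t inv = 0; inv < invocations; ++inv) {
        const std::size_t base = inv * kPerInvocation;
        for (std::size_t k = 0; k < kPerInvocation; ++k) {
            const std::size_t i = base + k;
            if (i < n) {
                dst[i] = src[i];
            }
        }
    }
}
```

When both tensors are densely packed, the two functions produce the same result, but the contiguous version skips the per-element divisions and modulo operations and amortizes fixed per-invocation cost over four elements.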

Apply similar changes to the scale shader, since scale is always contiguous.

Add a "progress bar" for shader compiles.
13 files changed:
src/ggml-vulkan.cpp
src/vulkan-shaders/clamp.comp
src/vulkan-shaders/contig_copy.comp [new file with mode: 0644]
src/vulkan-shaders/copy.comp
src/vulkan-shaders/cos.comp
src/vulkan-shaders/generic_unary_head.comp
src/vulkan-shaders/pad.comp
src/vulkan-shaders/repeat.comp
src/vulkan-shaders/scale.comp
src/vulkan-shaders/sin.comp
src/vulkan-shaders/square.comp
src/vulkan-shaders/vulkan-shaders-gen.cpp
tests/test-backend-ops.cpp