author	Jeff Bolz <redacted>
	Sun, 29 Jun 2025 07:43:36 +0000 (02:43 -0500)
committer	Georgi Gerganov <redacted>
	Tue, 1 Jul 2025 08:52:14 +0000 (11:52 +0300)
commit	6f5b10218b2c51be00be392d2fd34ee388674517
tree	4feaca85a6b6735c65e46b43e8294ebcd886fbe3
parent	67fc917af3e5b3ec6144f7a27e82deaa889c7078

vulkan: Add fusion support for RMS_NORM+MUL (llama/14366)

* vulkan: Add fusion support for RMS_NORM+MUL

- Add a use_count to ggml_tensor, so we can detect if an output is used more than once.
- Change the ggml-vulkan rms_norm shader to optionally multiply by another tensor.
- Add detection logic and basic fusion logic in ggml-vulkan.
- Add some testing support for fusion. Rather than computing one node at a time, allow
for computing the whole graph and just testing one node's results. Add rms_norm_mul tests
and enable a llama test.
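
To illustrate what the fusion in the list above buys, here is a minimal CPU-side sketch in C++. It is not the Vulkan shader from this commit. The function names, the `std::vector` interface, and the single-row layout are assumptions made for illustration. The unfused path materializes the normalized row before multiplying, while the fused path applies the second tensor in the same loop that applies the normalization scale.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference code, not taken from ggml.
// Unfused: RMS_NORM writes an intermediate row, then MUL reads it back.
static std::vector<float> rms_norm_then_mul(const std::vector<float> & x,
                                            const std::vector<float> & w,
                                            float eps) {
    assert(!x.empty() && x.size() == w.size());

    float sum_sq = 0.0f;
    for (float v : x) {
        sum_sq += v * v;
    }
    const float scale = 1.0f / std::sqrt(sum_sq / (float) x.size() + eps);

    std::vector<float> normed(x.size());  // intermediate RMS_NORM output
    for (std::size_t i = 0; i < x.size(); ++i) {
        normed[i] = x[i] * scale;
    }

    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = normed[i] * w[i];
    }
    return out;
}

// Fused: same reduction, but the multiply by w happens in the same pass,
// so the intermediate row is never stored.
static std::vector<float> rms_norm_mul_fused(const std::vector<float> & x,
                                             const std::vector<float> & w,
                                             float eps) {
    assert(!x.empty() && x.size() == w.size());

    float sum_sq = 0.0f;
    for (float v : x) {
        sum_sq += v * v;
    }
    const float scale = 1.0f / std::sqrt(sum_sq / (float) x.size() + eps);

    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = x[i] * scale * w[i];
    }
    return out;
}
```

Both functions should agree to within floating-point tolerance on the same inputs. The fused form is only valid when nothing else reads the intermediate RMS_NORM output, which is why the commit tracks how many times each output is used.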

* extract some common fusion logic

* fix -Winconsistent-missing-override

* move ggml_can_fuse to a common function

* build fix

* C and C++ versions of can_fuse

* move use count to the graph to avoid data races and double increments when used in multiple threads

* use hash table lookup to find node index

* change use_counts to be indexed by hash table slot

* minimize hash lookups

style fixes

* last node doesn't need single use.
fix type.
handle mul operands being swapped.

* remove redundant parameter
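
The detection rules described in the items above (use counts stored on the graph, the last node not needing a single use, and MUL operands possibly being swapped) can be sketched as follows. This is a simplified illustration under stated assumptions, not the `ggml_can_fuse` implementation. `Op`, `Node`, `Graph`, `count_uses`, and `can_fuse` are hypothetical stand-ins, operands are referenced by node index, and the per-node use counts live in a plain vector instead of being indexed by hash table slot. Any shape or type checks a real backend would need are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not ggml's types.
enum class Op { NONE, RMS_NORM, MUL, ADD };

struct Node {
    Op  op;
    int src0;  // index of the first operand node, or -1 if none
    int src1;  // index of the second operand node, or -1 if none
};

struct Graph {
    std::vector<Node> nodes;
    std::vector<int>  use_counts;  // kept on the graph, not on the nodes
};

// Count how many times each node's output is consumed by another node.
static void count_uses(Graph & g) {
    g.use_counts.assign(g.nodes.size(), 0);
    for (const Node & n : g.nodes) {
        if (n.src0 >= 0) g.use_counts[n.src0]++;
        if (n.src1 >= 0) g.use_counts[n.src1]++;
    }
}

// Return true if the nodes starting at idx match ops and can be fused.
static bool can_fuse(const Graph & g, std::size_t idx, const std::vector<Op> & ops) {
    if (idx + ops.size() > g.nodes.size()) {
        return false;
    }
    for (std::size_t i = 0; i < ops.size(); ++i) {
        const Node & node = g.nodes[idx + i];
        if (node.op != ops[i]) {
            return false;
        }
        // Every node except the last must have exactly one consumer,
        // otherwise its intermediate result is still needed elsewhere.
        if (i + 1 < ops.size() && g.use_counts[idx + i] != 1) {
            return false;
        }
        // Each node after the first must consume the previous node,
        // as either operand (handles swapped MUL operands).
        if (i > 0) {
            const int prev = (int) (idx + i - 1);
            if (node.src0 != prev && node.src1 != prev) {
                return false;
            }
        }
    }
    return true;
}
```

Keeping the counts on the graph rather than on the tensors means that several threads or graphs referencing the same tensor do not race on, or double-increment, a shared counter.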

---------

Co-authored-by: slaren <redacted>

include/ggml-backend.h
src/ggml-backend.cpp
src/ggml-impl.h
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/rms_norm.comp
src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp
src/ggml.c
tests/test-backend-ops.cpp