git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: fuse adds, fuse add with rms norm (#15631)
author Aman Gupta <redacted>
Fri, 29 Aug 2025 03:35:58 +0000 (11:35 +0800)
committer GitHub <redacted>
Fri, 29 Aug 2025 03:35:58 +0000 (11:35 +0800)
commit 009b709d6efd24820ac67765ed339a72dc797814
tree 7f5f4e132634d71960c9729204e6c0e947170409
parent e8d99dd0b67f2ecc1e45fca8074a3a18c3e036d2

* CUDA: fused add with rms_norm_mul

* Non-broadcast fuse works

* Add fused adds

* format

* Remove n_fuse from template params

* Address review comments

* Move template inside binbcast
ggml/src/ggml-cuda/binbcast.cu
ggml/src/ggml-cuda/binbcast.cuh
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/norm.cu
ggml/src/ggml-cuda/norm.cuh
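
For readers unfamiliar with operator fusion, the idea named in the commit message can be sketched on the CPU. This is not the code from the commit: it is a minimal C++ illustration that assumes the fused pattern is an RMS norm, followed by an element-wise multiply by a weight, followed by an element-wise add, and that "fused adds" means collapsing a chain of element-wise adds into one pass. The function names (`rms_norm_mul_add_unfused`, `rms_norm_mul_add_fused`, `add_chain_fused`) and the single-row, no-broadcast layout are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: 1 / sqrt(mean(x^2) + eps) for one row.
static float rms_scale(const std::vector<float>& x, float eps) {
    float sum_sq = 0.0f;
    for (float v : x) {
        sum_sq += v * v;
    }
    return 1.0f / std::sqrt(sum_sq / static_cast<float>(x.size()) + eps);
}

// Unfused reference: three separate passes, each writing a full intermediate row.
std::vector<float> rms_norm_mul_add_unfused(const std::vector<float>& x,
                                            const std::vector<float>& w,
                                            const std::vector<float>& b,
                                            float eps) {
    assert(!x.empty() && x.size() == w.size() && x.size() == b.size());
    const std::size_t n = x.size();
    const float scale = rms_scale(x, eps);

    std::vector<float> normed(n), scaled(n), out(n);
    for (std::size_t i = 0; i < n; ++i) normed[i] = x[i] * scale;      // rms_norm
    for (std::size_t i = 0; i < n; ++i) scaled[i] = normed[i] * w[i];  // mul
    for (std::size_t i = 0; i < n; ++i) out[i] = scaled[i] + b[i];     // add
    return out;
}

// Fused sketch: the same arithmetic, but a single output pass and no intermediates.
std::vector<float> rms_norm_mul_add_fused(const std::vector<float>& x,
                                          const std::vector<float>& w,
                                          const std::vector<float>& b,
                                          float eps) {
    assert(!x.empty() && x.size() == w.size() && x.size() == b.size());
    const std::size_t n = x.size();
    const float scale = rms_scale(x, eps);

    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = x[i] * scale * w[i] + b[i];
    }
    return out;
}

// Fused chain of adds: out = x + addends[0] + addends[1] + ... in one pass,
// instead of one pass (and one temporary) per add.
std::vector<float> add_chain_fused(const std::vector<float>& x,
                                   const std::vector<std::vector<float>>& addends) {
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        float acc = x[i];
        for (const auto& a : addends) {
            assert(a.size() == x.size());
            acc += a[i];
        }
        out[i] = acc;
    }
    return out;
}
```

The fused and unfused variants should agree up to floating-point rounding. The fused form's potential benefit is fewer passes over memory and, on a GPU, fewer kernel launches. A real CUDA kernel would additionally need a parallel reduction for the sum of squares and would have to handle broadcasting, neither of which this sketch attempts.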