git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: add softmax broadcast (llama/14475)
author Aman Gupta <redacted>
Wed, 2 Jul 2025 12:34:24 +0000 (20:34 +0800)
committer Georgi Gerganov <redacted>
Sat, 12 Jul 2025 13:05:00 +0000 (16:05 +0300)
commit e689b63cccf05d4a8d1f77914c5007a36efb023f
tree 9f03b92a3da3e94243a4b7da11c17c094ba7cc9f
parent a2bacc13e578a4a5aa6fa3c11312d513ca8d8d16

* CUDA: add softmax broadcast

* Pass by const ref

* Review: Use blockDims for indexing, remove designated initializers

* Add TODO for noncontiguous input/output
src/ggml-cuda/ggml-cuda.cu
src/ggml-cuda/softmax.cu
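
For readers unfamiliar with the term, the following is a minimal CPU reference sketch in plain C++ of what "softmax broadcast" can mean: a row-wise softmax over a contiguous 4D tensor where a smaller mask is reused across the two outer dimensions. It is not the code from this commit. The names `Shape4` and `softmax_broadcast_ref`, the modulo-based broadcast rule, and the contiguous layout are assumptions made for illustration only; the actual kernel in src/ggml-cuda/softmax.cu may index differently.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical shape descriptor (not a ggml type).
// ne0 is the softmax axis; ne1, ne2, ne3 are the outer dimensions.
struct Shape4 {
    int ne0;
    int ne1;
    int ne2;
    int ne3;
};

// Reference for out = softmax(x * scale + mask), computed per row of length ne0.
// Assumption: mask has shape [xs.ne0, xs.ne1, mne2, mne3] and is reused across
// the outer dimensions of x by modulo indexing (xs.ne2 % mne2 == 0, xs.ne3 % mne3 == 0).
// Both tensors are assumed contiguous, with ne0 varying fastest.
std::vector<float> softmax_broadcast_ref(const std::vector<float>& x, const Shape4& xs,
                                         const std::vector<float>& mask, int mne2, int mne3,
                                         float scale) {
    std::vector<float> out(x.size());
    for (int i3 = 0; i3 < xs.ne3; ++i3) {
        for (int i2 = 0; i2 < xs.ne2; ++i2) {
            for (int i1 = 0; i1 < xs.ne1; ++i1) {
                // Offset of the current row in x and of the matching (broadcast) row in mask.
                const std::size_t xrow =
                    ((static_cast<std::size_t>(i3) * xs.ne2 + i2) * xs.ne1 + i1) * xs.ne0;
                const std::size_t mrow =
                    ((static_cast<std::size_t>(i3 % mne3) * mne2 + (i2 % mne2)) * xs.ne1 + i1) * xs.ne0;

                // Pass 1: apply scale and mask, track the row maximum for numerical stability.
                float max_val = -INFINITY;
                for (int i0 = 0; i0 < xs.ne0; ++i0) {
                    const float v = x[xrow + i0] * scale + mask[mrow + i0];
                    out[xrow + i0] = v;
                    max_val = std::max(max_val, v);
                }

                // Pass 2: exponentiate relative to the maximum and accumulate the sum.
                float sum = 0.0f;
                for (int i0 = 0; i0 < xs.ne0; ++i0) {
                    const float e = std::exp(out[xrow + i0] - max_val);
                    out[xrow + i0] = e;
                    sum += e;
                }

                // Pass 3: normalize so the row sums to 1.
                for (int i0 = 0; i0 < xs.ne0; ++i0) {
                    out[xrow + i0] /= sum;
                }
            }
        }
    }
    return out;
}
```

In this sketch, a mask with `mne2 == 1` and `mne3 == 1` is shared by every slice of `x`, while a mask whose outer dimensions equal those of `x` reduces to the non-broadcast case. A row that is fully masked with negative infinity is not handled specially here.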