git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: GEMM for FP32/FP16/BF16 and ne11 <= 16 (llama/15131)
author Johannes Gäßler <redacted>
Thu, 7 Aug 2025 08:53:21 +0000 (10:53 +0200)
committer Georgi Gerganov <redacted>
Thu, 14 Aug 2025 11:17:28 +0000 (14:17 +0300)
commit e61603e8896043ab1088649d51cf5986b3b6fe40
tree 8b62a3c23c5e43aad5e09a2da6ba4a7dbffb8fc1
parent 4eb04d0ef87537f8f9ed6bca026962d5ede6e06f

* CUDA: GEMM for FP32/FP16/BF16 and ne11 <= 16
13 files changed:
src/ggml-cuda/common.cuh
src/ggml-cuda/fattn-mma-f16.cuh
src/ggml-cuda/fattn.cu
src/ggml-cuda/ggml-cuda.cu
src/ggml-cuda/mma.cuh
src/ggml-cuda/mmf.cu [new file with mode: 0644]
src/ggml-cuda/mmf.cuh [new file with mode: 0644]
src/ggml-cuda/mmq.cu
src/ggml-cuda/mmq.cuh
src/ggml-cuda/mmvf.cu [new file with mode: 0644]
src/ggml-cuda/mmvf.cuh [new file with mode: 0644]
src/ggml-cuda/vendors/hip.h
src/ggml-cuda/vendors/musa.h