git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA/HIP: Fix fattn-vec-* when device warp size is not 32 (llama/12315)
author uvos <redacted>
Wed, 12 Mar 2025 09:14:11 +0000 (10:14 +0100)
committer Georgi Gerganov <redacted>
Thu, 27 Mar 2025 09:06:03 +0000 (11:06 +0200)
commit 96ab3b2465e8dfacc2fb77b5fa5cda16b824a245
tree 8249012b4f48fc3e89b7aaf1434143ff527a7a3e
parent 08f32992d0508f664348786e1f256d94a699dbf8

When fattn-wmma was ported over to warp64, various bits that also touch fattn-vec were converted to
a selectable warp size. However, the fattn-vec kernels don't work with 64-wide warps for now, so we need
to avoid launching them with parameters for warp64.
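
The idea can be illustrated with a minimal host-side sketch. This is not the code from this commit: `FattnKernel`, `kDefaultWarpSize`, `launch_warp_size` and `block_threads` are hypothetical names invented for illustration, and the sketch only assumes that launch parameters are derived from a warp width chosen per kernel family.

```cpp
#include <cassert>

// Hypothetical kernel families; these names are illustrative, not ggml identifiers.
enum class FattnKernel { Vec, Wmma };

// Assumption: the vec kernels are written for 32-wide warps.
constexpr int kDefaultWarpSize = 32;

// Pick the warp width used to compute launch parameters.
// Only the wmma path follows the device's physical warp size;
// the vec path always gets the 32-wide default.
int launch_warp_size(FattnKernel kernel, int device_warp_size) {
    assert(device_warp_size > 0);
    if (kernel == FattnKernel::Wmma) {
        return device_warp_size;
    }
    return kDefaultWarpSize;
}

// Threads per block for a launch that uses `nwarps` warps.
int block_threads(FattnKernel kernel, int device_warp_size, int nwarps) {
    return launch_warp_size(kernel, device_warp_size) * nwarps;
}
```

With this sketch, on a device whose physical warp size is 64, `block_threads(FattnKernel::Vec, 64, 4)` yields 128 threads while `block_threads(FattnKernel::Wmma, 64, 4)` yields 256, so the vec kernels are never launched with warp64 parameters.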
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-wmma-f16.cu