git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: generalize FP16 fattn vec kernel (llama/7061)
author Johannes Gäßler <redacted>
Thu, 9 May 2024 12:32:02 +0000 (14:32 +0200)
committer Georgi Gerganov <redacted>
Sat, 11 May 2024 18:30:08 +0000 (21:30 +0300)
commit bb6ab7168ffbc525a6936860102299990dfb03c1
tree 471fb63a2892109249c1499e5667c1c20d61481b
parent 9b5e9be833921e20895645424274a213f533f43b

* CUDA: generalize FP16 fattn vec kernel

* disable unsupported head sizes for AMD in test

* try AMD fix

* fix batch size 2-8

* partially revert changes
src/ggml-cuda/common.cuh
src/ggml-cuda/fattn.cu
tests/test-backend-ops.cpp