git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: generalize FP16 fattn vec kernel (#7061)
author    Johannes Gäßler <redacted>
Thu, 9 May 2024 12:32:02 +0000 (14:32 +0200)
committer GitHub <redacted>
Thu, 9 May 2024 12:32:02 +0000 (14:32 +0200)
commit    a743d76a01f23038b2c85af1e9048ee836767b44
tree      8182fc85cb9fd055bc9c8268d5d4a05bcf87f57a
parent    f31ec120bc36c6270e4948e6a065a7c4cfa0c404

* CUDA: generalize FP16 fattn vec kernel

* disable unsupported head sizes for AMD in test

* try AMD fix

* fix batch size 2-8

* partially revert changes
ggml-cuda/common.cuh
ggml-cuda/fattn.cu
llama.cpp
tests/test-backend-ops.cpp