Add ReLU and SQR CUDA ops to (partially) fix Persimmon offloading (#4041)
author Kerfuffle <redacted>
Mon, 13 Nov 2023 08:58:15 +0000 (01:58 -0700)
committer GitHub <redacted>
Mon, 13 Nov 2023 08:58:15 +0000 (01:58 -0700)
commit bb50a792ec2a49944470c82694fa364345e95170
tree 1ad53a7f00d4cc76a91943a51729806db16988db
parent 21fd874c8d2a14dea2d56724e4357c0824aee6a8

* Add ReLU and SQR CUDA ops to fix Persimmon offloading

* Persimmon loader: More helpful error on CUDA/ROCM when offloading too many layers
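
For readers unfamiliar with what an element-wise CUDA op involves, here is a minimal sketch of ReLU and SQR kernels. It is an illustration only, not the diff from this commit: the kernel names, the block size of 256, and the `run_unary` host helper are assumptions made for the example, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Element-wise ReLU: dst[i] = max(x[i], 0). One thread handles one element.
__global__ void relu_f32(const float * x, float * dst, const int k) {
    const int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i >= k) {
        return; // threads past the end of the array in the last block do nothing
    }
    dst[i] = fmaxf(x[i], 0.0f);
}

// Element-wise square: dst[i] = x[i] * x[i].
__global__ void sqr_f32(const float * x, float * dst, const int k) {
    const int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i >= k) {
        return;
    }
    dst[i] = x[i] * x[i];
}

using unary_kernel_t = void (*)(const float *, float *, int);

// Hypothetical host helper (not part of llama.cpp): copies the input to the
// device, launches the given kernel, and copies the result back.
// CUDA return codes are not checked here; real code should check them.
std::vector<float> run_unary(unary_kernel_t kernel, const std::vector<float> & in) {
    const int k = (int) in.size();
    std::vector<float> out(k);
    if (k == 0) {
        return out;
    }

    float * d_x   = nullptr;
    float * d_dst = nullptr;
    cudaMalloc(&d_x,   k * sizeof(float));
    cudaMalloc(&d_dst, k * sizeof(float));
    cudaMemcpy(d_x, in.data(), k * sizeof(float), cudaMemcpyHostToDevice);

    const int block_size = 256;                              // assumed block size
    const int num_blocks = (k + block_size - 1) / block_size; // round up
    kernel<<<num_blocks, block_size>>>(d_x, d_dst, k);

    // This blocking copy on the default stream waits for the kernel to finish.
    cudaMemcpy(out.data(), d_dst, k * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_x);
    cudaFree(d_dst);
    return out;
}
```

Calling `run_unary(relu_f32, values)` or `run_unary(sqr_f32, values)` on a host vector returns the transformed values. An op with no GPU kernel cannot run on the device, which is presumably why the missing ops interfered with offloading Persimmon layers.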
ggml-cuda.cu
llama.cpp