git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml-cuda : perform cublas mat mul of quantized types as f16 (#3412)
author slaren <redacted>
Sat, 30 Sep 2023 16:12:57 +0000 (18:12 +0200)
committer GitHub <redacted>
Sat, 30 Sep 2023 16:12:57 +0000 (18:12 +0200)
commit f5ef5cfb18148131fcf45bdd2331f0db5ab7c3d0
tree 97465215d07603cfca34daf8adf8280078e0bf5e
parent 40e07a60f9ce06e79f3ccd4c903eba300fb31b5e

* ggml-cuda : perform cublas matrix multiplication of quantized types as fp16

* rename CC_TURING to CC_VOLTA

* disable fp16 mat mul completely with multi GPU
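
The three points above amount to a gating decision: take the fp16 cuBLAS path only for f16 or quantized weights, only on Volta-class or newer hardware, and only when a single GPU is in use. The following is a minimal host-side sketch of that decision, not the code from this commit. The names `WeightType`, `kCcVolta`, and `use_fp16_cublas` are hypothetical, and encoding the compute capability as 100 * major + 10 * minor (so Volta, compute capability 7.0, becomes 700) is an assumption made for illustration.

```cpp
// Hypothetical sketch of the fp16 cuBLAS gating logic described in the
// commit message. None of these names are taken from ggml-cuda.cu.

// Assumed encoding: 100 * major + 10 * minor, so Volta (7.0) is 700.
constexpr int kCcVolta = 700;

// Simplified stand-in for the tensor type of the weight matrix.
enum class WeightType { F32, F16, Quantized };

// Returns true when this sketch would route the matrix multiplication
// through an fp16 cuBLAS call:
//   - the weights are f16 or quantized (quantized weights would first be
//     dequantized to f16),
//   - the device is Volta or newer,
//   - exactly one GPU is in use (fp16 mat mul is disabled with multi GPU).
constexpr bool use_fp16_cublas(WeightType type, int compute_capability, int device_count) {
    return (type == WeightType::F16 || type == WeightType::Quantized)
        && compute_capability >= kCcVolta
        && device_count == 1;
}
```

Under these assumptions, `use_fp16_cublas(WeightType::Quantized, 750, 1)` selects the fp16 path, while the same call with a device count of 2 or a compute capability of 610 falls back to the existing path.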
ggml-cuda.cu