git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: don't convert BF16 weights to FP32 (#1174)
author Sigbjørn Skjæret <redacted>
Fri, 4 Apr 2025 19:05:12 +0000 (21:05 +0200)
committer GitHub <redacted>
Fri, 4 Apr 2025 19:05:12 +0000 (21:05 +0200)
commit ab9ed73d40965d7e4b25a4adf2230b9a19bffbf9
tree f99a0b6f769c7eda1389f50c76778e68fde724ab
parent dddef738b2d5a95323188ed019877d4e20568b7e
* add bf16 support

* use convert_from_bf16_cuda instead of convert_unary_cuda for f32

* revert 7ec5085

* move functionality into convert_unary with constexpr
src/ggml-cuda/convert.cu
src/ggml-cuda/convert.cuh
src/ggml-cuda/ggml-cuda.cu
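
The last item in the commit message ("move functionality into convert_unary with constexpr") describes folding the BF16 case into one templated conversion routine instead of keeping a separate code path. The following is a minimal host-side C++17 sketch of that general pattern, not the code from this commit: `bf16_t`, `bf16_to_f32`, `f32_to_bf16_trunc`, and `convert_unary_sketch` are hypothetical names chosen for illustration, and the real CUDA kernel in `src/ggml-cuda/convert.cu` may differ in signature and details.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a bf16 storage type: bf16 keeps the upper
// 16 bits of an IEEE-754 binary32 value (sign, 8 exponent bits, 7 mantissa bits).
struct bf16_t {
    uint16_t bits;
};

// Widen bf16 to float by placing its bits in the upper half of a 32-bit word.
static inline float bf16_to_f32(bf16_t v) {
    uint32_t u = static_cast<uint32_t>(v.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Narrow float to bf16 by truncation (no rounding); used here only to build test data.
static inline bf16_t f32_to_bf16_trunc(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return bf16_t{static_cast<uint16_t>(u >> 16)};
}

// One generic element-wise conversion. The bf16 source case is selected at
// compile time with `if constexpr`, so other source types take the plain cast
// and no runtime branch is emitted.
template <typename src_t, typename dst_t>
void convert_unary_sketch(const src_t * x, dst_t * y, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if constexpr (std::is_same_v<src_t, bf16_t>) {
            y[i] = static_cast<dst_t>(bf16_to_f32(x[i]));
        } else {
            y[i] = static_cast<dst_t>(x[i]);
        }
    }
}
```

In a CUDA version of this idea, the loop would presumably be replaced by a per-thread index, and the widening step by the toolkit's own bf16 conversion intrinsic; both are assumptions about the shape of the real code rather than something this page shows.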