CUDA: don't convert BF16 weights to FP32 (ggml/1174)

author    Sigbjørn Skjæret <redacted>    Fri, 4 Apr 2025 19:05:12 +0000 (21:05 +0200)
committer Georgi Gerganov <redacted>     Thu, 24 Apr 2025 17:39:16 +0000 (20:39 +0300)

commit 06ce8f83e6026efe2349c25cf8525496ef42b129
tree   115f8c46840b3bcf09d44c74af4421f7a0715146
parent 8b92060a10a89cd3e8ec6b4bb22cdc1af67c5667
CUDA: don't convert BF16 weights to FP32 (ggml/1174)

* add bf16 support

* use convert_from_bf16_cuda instead of convert_unary_cuda for f32

* revert 7ec5085

* move functionality into convert_unary with constexpr
ggml/src/ggml-cuda/convert.cu
ggml/src/ggml-cuda/convert.cuh
ggml/src/ggml-cuda/ggml-cuda.cu
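
The commit message describes routing BF16 tensors through the generic convert_unary path with a constexpr type branch, so BF16 weights no longer need to be converted to FP32 before use. Below is a minimal sketch of that idea, not the code from ggml/src/ggml-cuda/convert.cu: the kernel and launcher names, the block size, and the exact signatures are illustrative assumptions.

    #include <cuda_bf16.h>
    #include <cuda_fp16.h>
    #include <type_traits>

    // Sketch of a templated element-wise conversion kernel: `if constexpr`
    // selects the load path per source type, so __nv_bfloat16 data is read
    // directly on the GPU instead of being pre-converted to FP32.
    template <typename src_t, typename dst_t>
    __global__ void convert_unary_sketch(const void * __restrict__ vx, dst_t * __restrict__ y, const int64_t k) {
        const int64_t i = (int64_t) blockDim.x * blockIdx.x + threadIdx.x;
        if (i >= k) {
            return;
        }

        const src_t * x = (const src_t *) vx;

        float v;
        if constexpr (std::is_same_v<src_t, __nv_bfloat16>) {
            v = __bfloat162float(x[i]);   // direct BF16 -> float load, no FP32 staging buffer
        } else if constexpr (std::is_same_v<src_t, half>) {
            v = __half2float(x[i]);
        } else {
            v = (float) x[i];
        }

        y[i] = (dst_t) v;
    }

    // Host-side launcher; the block size of 256 is an arbitrary illustrative choice.
    template <typename src_t, typename dst_t>
    static void convert_unary_cuda_sketch(const void * x, dst_t * y, const int64_t k, cudaStream_t stream) {
        const int64_t num_blocks = (k + 255) / 256;
        convert_unary_sketch<src_t, dst_t><<<num_blocks, 256, 0, stream>>>(x, y, k);
    }

Instantiating this with src_t = __nv_bfloat16 would, for example, let a BF16 weight tensor feed the rest of the CUDA pipeline without an intermediate FP32 copy, which is the behavior the commit title refers to.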