git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml-cuda: Add NVFP4 dp4a kernel (#20644)
author Michael Wand <redacted>
Thu, 26 Mar 2026 08:54:03 +0000 (01:54 -0700)
committer GitHub <redacted>
Thu, 26 Mar 2026 08:54:03 +0000 (09:54 +0100)
commit 112c78159f917c88ca08f74e67359599c3311829
tree b8492137cd61f5a3eeb6c93346f2d500e6eafad3
parent 0fac87b157305eb82a70902327abffbbce25bd3e
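
For readers unfamiliar with the term in the subject line: "dp4a" refers to a dot product of four packed 8-bit integers accumulated into a 32-bit integer, which CUDA exposes as the `__dp4a` intrinsic. The following is a minimal host-side sketch of that arithmetic for reference only. The names `dp4a_ref` and `pack4` are hypothetical and do not come from this commit, and the sketch says nothing about how the NVFP4 kernel itself arranges its data.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: pack four signed 8-bit values into one 32-bit word,
// in memory order, the way a kernel would load them as a single int.
static int32_t pack4(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    const int8_t v[4] = {x0, x1, x2, x3};
    int32_t out;
    std::memcpy(&out, v, sizeof(out));
    return out;
}

// Hypothetical host reference for a signed dp4a:
// treats a and b as four int8 lanes each and returns c + sum(a[i] * b[i]).
static int32_t dp4a_ref(int32_t a, int32_t b, int32_t c) {
    int8_t va[4];
    int8_t vb[4];
    std::memcpy(va, &a, sizeof(va));
    std::memcpy(vb, &b, sizeof(vb));
    for (int i = 0; i < 4; ++i) {
        c += static_cast<int32_t>(va[i]) * static_cast<int32_t>(vb[i]);
    }
    return c;
}
```

For example, `dp4a_ref(pack4(1, 2, 3, 4), pack4(5, 6, 7, 8), 10)` evaluates 1*5 + 2*6 + 3*7 + 4*8 + 10. A reference like this can be handy when checking a quantized vector-dot kernel against CPU results.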

Added a dst_t check for float to the cuda_cast template
Restored ggml_cuda_ue4m3_to_fp32; changed vecdot ints to int32_t
Added CUDART/HIP check and HIP fp8 include
Added NVFP4 to test-backend-ops
Added hip_fp8_e4m3 to __nv_fp8_e4m3 typedef
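
As background for the `ggml_cuda_ue4m3_to_fp32` item above, here is a minimal sketch of decoding an unsigned E4M3 byte to a float. It assumes the common FP8 E4M3 layout (4 exponent bits, 3 mantissa bits, exponent bias 7, the all-ones pattern reserved for NaN) with the top bit ignored. The function name `ue4m3_to_float_sketch` is hypothetical; this is not the code from this commit, and the real helper may treat special values or scaling differently.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical reference decoder for an unsigned E4M3 value.
// Assumed layout: bits 6..3 = exponent (bias 7), bits 2..0 = mantissa.
// Bit 7 is ignored here because the value is treated as unsigned.
static float ue4m3_to_float_sketch(uint8_t x) {
    const int exp = (x >> 3) & 0x0F;
    const int man = x & 0x07;

    // In the assumed E4M3 variant, exponent 15 with mantissa 7 encodes NaN.
    if (exp == 0x0F && man == 0x07) {
        return NAN;
    }
    // Subnormal: (man / 8) * 2^-6, which equals man * 2^-9.
    if (exp == 0) {
        return std::ldexp(static_cast<float>(man), -9);
    }
    // Normal: (1 + man / 8) * 2^(exp - 7).
    return std::ldexp(1.0f + static_cast<float>(man) / 8.0f, exp - 7);
}
```

Under these assumptions, `0x38` decodes to 1.0 and `0x7E` to 448, the largest finite value of the format.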

---------

Co-authored-by: Johannes Gäßler <redacted>
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/convert.cu
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/mmvq.cu
ggml/src/ggml-cuda/vecdotq.cuh
ggml/src/ggml-cuda/vendors/cuda.h
ggml/src/ggml-cuda/vendors/hip.h
tests/test-backend-ops.cpp