git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml-cuda: Add generic NVFP4 MMQ kernel (#21074)
author Michael Wand <redacted>
Wed, 1 Apr 2026 10:04:58 +0000 (03:04 -0700)
committer GitHub <redacted>
Wed, 1 Apr 2026 10:04:58 +0000 (12:04 +0200)
commit 84f82e846caaf871012143a2d658a974edc410c5
tree 5b8b7b20968aecb1ee4a797d6376b5dc0e753cf3
parent e1cb817483ebda4e3ebc30fd07f4292c654f4339

* Introduced NVFP4 generic MMQ kernel

* Added extra FP8 guard, hope to solve ci HIP failure

* Rename tiles and use HIP_FP8_AVAILABLE

* Removed remaining FP8 straggler and added const int

* Const

* Removed DECL_MMQ_CASE artifact

* Removed newline

* Removed space after else

* Changed HIP FP8 NVFP4 conversion gate

* Added new line to bottom of mmq.cu 270

* Removed extra spaces

* Removed single space in front of else on line 814

* Added NVFP4 to generate cu script so HIP can see it, further tightened logic

* Include generated mmq-instance-nvfp4.cu

* Added NVFP4 mmq to HIP Check ignore list

* Update ggml/src/ggml-cuda/mmq.cuh

Changed Q3_K tile to read MMQ_MMA_TILE_X_K_NVFP4

Co-authored-by: Johannes Gäßler <redacted>

* Update ggml/src/ggml-cuda/mmq.cuh

Changed Q3_K tile to read MMQ_MMA_TILE_X_K_NVFP4 in tile assert

Co-authored-by: Johannes Gäßler <redacted>

* Update ggml/src/ggml-cuda/mmq.cuh

Added function name ending for end if

Co-authored-by: Johannes Gäßler <redacted>

* Added function names to closing endif

Co-authored-by: Johannes Gäßler <redacted>
---------

Co-authored-by: Johannes Gäßler <redacted>
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/mmq.cu
ggml/src/ggml-cuda/mmq.cuh
ggml/src/ggml-cuda/template-instances/generate_cu_files.py
ggml/src/ggml-cuda/template-instances/mmq-instance-nvfp4.cu [new file with mode: 0644]
scripts/hip/gcn-cdna-vgpr-check.py