git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360)
author Rémy Oudompheng <redacted>
Wed, 29 Jan 2025 17:29:39 +0000 (18:29 +0100)
committer GitHub <redacted>
Wed, 29 Jan 2025 17:29:39 +0000 (18:29 +0100)
commit 66ee4f297cff3c7ce98b31dbc0ce909d41b9e408
tree d3019c85b71f808a9ea5211249ccd2ab16998478
parent e51c47b401f8cb5f21630a05171e2529cde4d186

* vulkan: initial support for IQ3_S

* vulkan: initial support for IQ3_XXS

* vulkan: initial support for IQ2_XXS

* vulkan: initial support for IQ2_XS

* vulkan: optimize Q3_K by removing branches

* vulkan: implement dequantize variants for coopmat2

* vulkan: initial support for IQ2_S

* vulkan: vertically realign code

* port failing dequant callbacks from mul_mm

* Fix array length mismatches

* vulkan: avoid using workgroup size before it is referenced

* tests: increase timeout for Vulkan llvmpipe backend

---------

Co-authored-by: Jeff Bolz <redacted>
19 files changed:
.github/workflows/build.yml
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/copy_from_quant.comp
ggml/src/ggml-vulkan/vulkan-shaders/copy_to_quant.comp
ggml/src/ggml-vulkan/vulkan-shaders/dequant_funcs.comp
ggml/src/ggml-vulkan/vulkan-shaders/dequant_funcs_cm2.comp
ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq2_s.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq2_xs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq2_xxs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq3_s.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq3_xxs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq4_nl.comp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
ggml/src/ggml-vulkan/vulkan-shaders/get_rows_quant.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm_cm2.comp
ggml/src/ggml-vulkan/vulkan-shaders/types.comp
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp