git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (llama/12858)
author Alberto Cabrera Pérez <redacted>
Fri, 9 May 2025 15:34:08 +0000 (16:34 +0100)
committer Georgi Gerganov <redacted>
Tue, 13 May 2025 10:59:21 +0000 (13:59 +0300)
commit 45d8b2352e7decfd814d36fb94d7947181ff9ca5
tree ea6e63e806da5876f23475d4911bdbdf056cd21e
parent 2d436bfbfbf17e55848b8d9ef458169262cc939e
sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (llama/12858)

* sycl : Implemented reorder Q4_0 mmvq

Signed-off-by: Alberto Cabrera <redacted>
* sycl : Fixed mmvq being called when reorder is disabled

* sycl : Improved comments in the quants header

Signed-off-by: Alberto Cabrera <redacted>
* Use static_assert

* safe_div -> ceil_div

* Clarify qi comment

* change the reorder tensor from init to execute OP

* dbg

* Undo changes to test-backend-ops

* Refactor changes on top of q4_0 reorder fix

* Missing Reverts

* Refactored opt_for_reorder logic to simplify code path

* Explicit inlining and unroll

* Renamed mul_mat_algo enum for consistency

---------

Signed-off-by: Alberto Cabrera <redacted>
Co-authored-by: romain.biessy <redacted>
ggml/src/ggml-sycl/backend.hpp
ggml/src/ggml-sycl/common.hpp
ggml/src/ggml-sycl/ggml-sycl.cpp
ggml/src/ggml-sycl/mmvq.cpp
ggml/src/ggml-sycl/quants.hpp [new file with mode: 0644]
ggml/src/ggml-sycl/vecdotq.hpp