sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (llama/12858)
author Alberto Cabrera Pérez <redacted>
Fri, 9 May 2025 15:34:08 +0000 (16:34 +0100)
committer Georgi Gerganov <redacted>
Tue, 13 May 2025 10:02:19 +0000 (13:02 +0300)
commit b767cce0efdf360d9cbe9dfe6cda52c98685f2fd
tree 5dad77aa4f21e41c38e6034f80dd714a5f320122
parent 878a093fe0cec5ed25abe80281b957fcf03ff6eb

* sycl : Implemented reorder Q4_0 mmvq

Signed-off-by: Alberto Cabrera <redacted>
* sycl : Fixed mmvq being called when reorder is disabled

* sycl : Improved comments in the quants header

Signed-off-by: Alberto Cabrera <redacted>
* Use static_assert

* safe_div -> ceil_div

* Clarify qi comment

* change the reorder tensor from init to execute OP

* dbg

* Undo changes to test-backend-ops

* Refactor changes on top of q4_0 reorder fix

* Missing Reverts

* Refactored opt_for_reorder logic to simplify code path

* Explicit inlining and unroll

* Renamed mul_mat_algo enum for consistency
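
The "safe_div -> ceil_div" and "Use static_assert" items above suggest a round-up integer division checked at compile time. The sketch below is only an illustration of that idea: the body of `ceil_div` and the example sizes are assumptions, not the code from this commit. `QK4_0 = 32` is the number of weights per Q4_0 block in ggml.

```cpp
#include <type_traits>

// Hypothetical helper: the name comes from the commit message,
// the body is an assumed, conventional round-up division.
template <typename T>
constexpr T ceil_div(T a, T b) {
    static_assert(std::is_integral<T>::value, "ceil_div expects an integral type");
    return (a + b - 1) / b;  // valid for a >= 0 and b > 0
}

// Q4_0 stores 32 quantized weights per block.
constexpr int QK4_0 = 32;

// Compile-time checks: a row that is not a multiple of the block size
// still needs one extra (partial) block.
static_assert(ceil_div(4096, QK4_0) == 128, "exact multiple of the block size");
static_assert(ceil_div(4097, QK4_0) == 129, "partial block rounds up");
```

A round-up division of this kind is commonly used to size work-groups or block counts so that a trailing partial block is not dropped.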

---------

Signed-off-by: Alberto Cabrera <redacted>
Co-authored-by: romain.biessy <redacted>
src/ggml-sycl/backend.hpp
src/ggml-sycl/common.hpp
src/ggml-sycl/ggml-sycl.cpp
src/ggml-sycl/mmvq.cpp
src/ggml-sycl/quants.hpp [new file with mode: 0644]
src/ggml-sycl/vecdotq.hpp