git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (#12858)
author Alberto Cabrera Pérez <redacted>
Fri, 9 May 2025 15:34:08 +0000 (16:34 +0100)
committer GitHub <redacted>
Fri, 9 May 2025 15:34:08 +0000 (16:34 +0100)
commit 17512a94d636c4b6c1332370acb3e5af3ca70918
tree 46e26b7fe4707b603893a5c75b560e92c71dfbaf
parent 611aa914ef4231fab5d1ad04773c42e119ae2d2e

* sycl : Implemented reorder Q4_0 mmvq

Signed-off-by: Alberto Cabrera <redacted>
* sycl : Fixed mmvq being called when reorder is disabled

* sycl : Improved comments in the quants header

Signed-off-by: Alberto Cabrera <redacted>
* Use static_assert

* safe_div -> ceil_div

* Clarify qi comment

* change the reorder tensor from init to execute OP

* dbg

* Undo changes to test-backend-ops

* Refactor changes on top of q4_0 reorder fix

* Missing Reverts

* Refactored opt_for_reorder logic to simplify code path

* Explicit inlining and unroll

* Renamed mul_mat_algo enum for consistency

---------

Signed-off-by: Alberto Cabrera <redacted>
Co-authored-by: romain.biessy <redacted>
ggml/src/ggml-sycl/backend.hpp
ggml/src/ggml-sycl/common.hpp
ggml/src/ggml-sycl/ggml-sycl.cpp
ggml/src/ggml-sycl/mmvq.cpp
ggml/src/ggml-sycl/quants.hpp [new file with mode: 0644]
ggml/src/ggml-sycl/vecdotq.hpp
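
The commit message mentions a "reordered" Q4_0 layout, a `ceil_div` helper, and `static_assert`, but does not show them. The following is a minimal host-side sketch of what such a reorder could look like. It assumes the reorder moves the packed 4-bit quants of all blocks to the front of the buffer and the per-block scales to the back. The names `block_q4_0_sketch` and `reorder_q4_0_sketch` are hypothetical, and this `ceil_div` is a stand-in for the helper named in the message. None of this is taken from the SYCL backend's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative constant: number of weights per Q4_0 block (assumed to be 32).
constexpr int QK4_0 = 32;

// Hypothetical stand-in for a Q4_0 block: one fp16 scale plus 32 packed 4-bit quants.
struct block_q4_0_sketch {
    uint16_t d;              // scale, kept as raw fp16 bits to avoid a half type
    uint8_t  qs[QK4_0 / 2];  // two 4-bit quants per byte
};
static_assert(sizeof(block_q4_0_sketch) == 18, "unexpected padding in block layout");

// Integer division rounding up, e.g. to count the blocks needed for n weights.
constexpr size_t ceil_div(size_t a, size_t b) { return (a + b - 1) / b; }

// Assumed reorder: turn an array of blocks into [all qs | all d],
// so quants and scales are each contiguous in memory.
std::vector<uint8_t> reorder_q4_0_sketch(const std::vector<block_q4_0_sketch> & blocks) {
    const size_t n        = blocks.size();
    const size_t qs_bytes = n * (QK4_0 / 2);
    std::vector<uint8_t> out(qs_bytes + n * sizeof(uint16_t));
    assert(out.size() == n * sizeof(block_q4_0_sketch));  // same total size, different order
    for (size_t i = 0; i < n; ++i) {
        std::memcpy(out.data() + i * (QK4_0 / 2), blocks[i].qs, QK4_0 / 2);
        std::memcpy(out.data() + qs_bytes + i * sizeof(uint16_t), &blocks[i].d, sizeof(uint16_t));
    }
    return out;
}
```

With a layout like this, a matrix-vector kernel can read quants from consecutive blocks as one contiguous stream and fetch the scales separately, which is presumably the motivation for a dedicated MMVQ path. The real kernels and their memory access pattern are in `mmvq.cpp` and `vecdotq.hpp`.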