git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Faster Q3_K implementation on Metal (#2307)
author Kawrakow <redacted>
Fri, 21 Jul 2023 14:05:30 +0000 (17:05 +0300)
committer GitHub <redacted>
Fri, 21 Jul 2023 14:05:30 +0000 (17:05 +0300)
commit 4d76a5f49b9b5382dba5d13d92edb9159536c225
tree 7bb4a3231985d1fb254cb5c38b65daba53cdbe4b
parent 0db14fef06836caaa13cc123c0a24dc598bdb9f0
* Faster Q3_K on Metal

* Additional Q3_K speedup on Metal

* Q3_K for QK_K = 64

* Better Q3_K for QK_K = 64

21.6 ms/t -> 21.1 ms/t

---------

Co-authored-by: Iwan Kawrakow <redacted>
ggml-metal.m
ggml-metal.metal