git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
metal : new q4_0 matrix-vector kernel (#2188)
author Shouzheng Liu <redacted>
Wed, 12 Jul 2023 20:10:55 +0000 (16:10 -0400)
committer GitHub <redacted>
Wed, 12 Jul 2023 20:10:55 +0000 (23:10 +0300)
commit 1cbf561466e957b25f0e8163c2386683f8674369
tree 4d796b3189de81bd3a32dde500d1d2f46d06eb07
parent 975221e9548ef6d9f4af8d39cdffc4811c050beb

Prefetch data to improve GPU utilization. ~48% faster for 33B model.
ggml-metal.m
ggml-metal.metal