git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
metal : add Q4_1 implementation (#1785)
author Kawrakow <redacted>
Sat, 10 Jun 2023 08:28:11 +0000 (11:28 +0300)
committer GitHub <redacted>
Sat, 10 Jun 2023 08:28:11 +0000 (11:28 +0300)
commit e9b66ee9829039d4ab54550d6222e42a0b31e52a
tree d0dbe2408722095b5ba9aa3cb28692dd1b4f7bd1
parent 4f0154b0bad775ac4651bf73b5c216eb43c45cdc

23.3 ms / token, so just ~1% slower than q4_0.
Achieves 290 GB/s memory throughput.
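
For orientation, here is a minimal C sketch of what dequantizing one Q4_1 block involves. It is not the Metal kernel from this commit. It assumes the usual ggml Q4_1 layout (32 weights per block, a scale d and a minimum m, two 4-bit quants packed per byte, low nibbles holding the first half of the block). The names block_q4_1_sketch and dequantize_q4_1_sketch are hypothetical, and d and m are kept as float for simplicity, whereas ggml stores them as fp16.

```c
#include <stdint.h>

#define QK4_1 32  /* weights per block (assumed to match ggml's constant) */

/* Hypothetical, simplified stand-in for ggml's Q4_1 block.
 * Assumption: the real struct stores d and m as fp16, not float. */
typedef struct {
    float   d;              /* scale */
    float   m;              /* minimum (offset) */
    uint8_t qs[QK4_1 / 2];  /* 32 4-bit quants, two per byte */
} block_q4_1_sketch;

/* Reconstruct the 32 weights of one block: x = d * q + m, q in [0, 15].
 * Assumed packing: the low nibble of byte j is element j,
 * the high nibble is element j + 16. */
static void dequantize_q4_1_sketch(const block_q4_1_sketch *b, float *out) {
    for (int j = 0; j < QK4_1 / 2; ++j) {
        const int lo = b->qs[j] & 0x0F;
        const int hi = b->qs[j] >> 4;
        out[j]             = b->d * (float)lo + b->m;
        out[j + QK4_1 / 2] = b->d * (float)hi + b->m;
    }
}
```

Compared with Q4_0, the only extra work per weight in this sketch is adding the per-block minimum m, which fits with the small slowdown reported above.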

Co-authored-by: Iwan Kawrakow <redacted>
ggml-metal.m
ggml-metal.metal