git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Kawrakow <redacted>
	Sat, 10 Jun 2023 08:28:11 +0000 (11:28 +0300)
committer	GitHub <redacted>
	Sat, 10 Jun 2023 08:28:11 +0000 (11:28 +0300)
commit	e9b66ee9829039d4ab54550d6222e42a0b31e52a
tree	d0dbe2408722095b5ba9aa3cb28692dd1b4f7bd1
parent	4f0154b0bad775ac4651bf73b5c216eb43c45cdc

metal : add Q4_1 implementation (#1785)

23.3 ms / token, so just ~1% slower than q4_0.
Achieves 290 GB/s memory throughput.

Co-authored-by: Iwan Kawrakow <redacted>

ggml-metal.m
ggml-metal.metal

Packaging of ggml-org/llama.cpp
