git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
metal : more optimizations (#2959)
author Kawrakow <redacted>
Sun, 3 Sep 2023 08:06:22 +0000 (11:06 +0300)
committer GitHub <redacted>
Sun, 3 Sep 2023 08:06:22 +0000 (11:06 +0300)
commit ca82cf7bac0c91d03e3d320b3a865dd006f854ac
tree 02b91ac7d85eba9234fb0d0d4152218909135bcb
parent 6a31a3bd9806c85ed08266f6ab65181da0f30d03

* Very minor speedup via simd-group synchronization in f16 x f32

* Another very minor speedup on metal

* Quite significant PP speedup on metal

* Another attempt

* Minor

* Massive improvement for TG for fp16

* ~4-5% improvement for Q8_0 TG on metal

---------

Co-authored-by: Iwan Kawrakow <redacted>
Co-authored-by: Georgi Gerganov <redacted>
ggml-metal.m
ggml-metal.metal