metal : optimize FA vec for large sequences and BS <= 8 (#15566)
author Georgi Gerganov <redacted>
Tue, 26 Aug 2025 11:22:14 +0000 (14:22 +0300)
committer GitHub <redacted>
Tue, 26 Aug 2025 11:22:14 +0000 (14:22 +0300)
commit b3964c1e890ef8c947afb36a5124ce6fcb2136d4
tree ba7664d4ae07bda38f443673d34876d6400da612
parent 79a546220c719e6a70627b243a478ab8d84dc9e1

* metal : optimize FA vec for large heads and sequences

* metal : adjust small-batch mul mv kernels

ggml-ci

* batched-bench : fix total speed computation

ggml-ci

* cont : add comments

ggml-ci
ggml/src/ggml-metal/ggml-metal-impl.h
ggml/src/ggml-metal/ggml-metal.m
ggml/src/ggml-metal/ggml-metal.metal
tools/batched-bench/batched-bench.cpp