git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
metal : add gqa8 kernel to allow llama-2-70B on metal (#2459)
author Matteo Boschini <redacted>
Tue, 1 Aug 2023 07:43:12 +0000 (09:43 +0200)
committer GitHub <redacted>
Tue, 1 Aug 2023 07:43:12 +0000 (10:43 +0300)
commit 1873ff586bd8499a18f763632711bf15d253585e
tree f5c52d81b59d9044b2cd2b3b584e05be268ec278
parent 49e7cb5bb1f75c91dd5db7d2d88cbc11bd9ee0c5

* Added gqa8 kernel to allow llama-2-70B on metal

* Update ggml-metal.m

Co-authored-by: Cebtenzzre <redacted>

* Extend kernel_mul_mat_f16_f32 to handle gqa broadcast

* Added ne03==ne13 assertion

---------

Co-authored-by: Cebtenzzre <redacted>
ggml-metal.m
ggml-metal.metal
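
The commit message mentions a gqa broadcast in kernel_mul_mat_f16_f32 and an ne03==ne13 assertion but does not spell out the indexing. Below is a minimal sketch in C of one way such a broadcast can be indexed. It assumes ggml's naming, where ne02/ne03 are the outer dimensions of src0 and ne12/ne13 those of src1, and it assumes each src0 head is shared by ne12/ne02 consecutive src1 heads. The helper names are hypothetical and are not taken from ggml-metal.m or ggml-metal.metal.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many src1 heads share one src0 head.
 * Assumes ne12 is an exact multiple of ne02. */
static int64_t gqa_factor(int64_t ne02, int64_t ne12) {
    assert(ne02 > 0 && ne12 % ne02 == 0);
    return ne12 / ne02;
}

/* Hypothetical helper: map a src1 head index (dimension 2) to the
 * src0 head it reads from when src0 is broadcast along that dimension. */
static int64_t gqa_src0_head(int64_t i12, int64_t ne02, int64_t ne12) {
    assert(i12 >= 0 && i12 < ne12);
    return i12 / gqa_factor(ne02, ne12);
}

/* Hypothetical shape check: broadcast is allowed only along dimension 2,
 * so dimension 3 must match exactly (the ne03 == ne13 condition). */
static int gqa_shapes_ok(int64_t ne02, int64_t ne03,
                         int64_t ne12, int64_t ne13) {
    return ne03 == ne13 && ne02 > 0 && ne12 % ne02 == 0;
}
```

With 8 src0 heads and 64 src1 heads (a ratio of 8, which is presumably what "gqa8" in the title refers to), gqa_src0_head maps src1 heads 0..7 to src0 head 0 and src1 heads 56..63 to src0 head 7.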