git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
metal: somewhat faster f16 x f32 matrix multiply kernel (#2951)
author Kawrakow <redacted>
Fri, 1 Sep 2023 08:15:57 +0000 (11:15 +0300)
committer GitHub <redacted>
Fri, 1 Sep 2023 08:15:57 +0000 (11:15 +0300)
commit e8d91589258f9204397a7ac5f9b3c857835c98f8
tree 5909f71a59fc0822fd4310c8208655b43022e575
parent bce1fef328941499dc0acb76cc7fd7ac90449c2f

* Somewhat faster f16 x f32 matrix multiply kernel

* Better use 32 thread groups for f16 x f32

---------

Co-authored-by: Iwan Kawrakow <redacted>
ggml-metal.m
ggml-metal.metal