git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Georgi Gerganov <redacted>
	Tue, 2 Jan 2024 19:07:47 +0000 (21:07 +0200)
committer	GitHub <redacted>
	Tue, 2 Jan 2024 19:07:47 +0000 (21:07 +0200)
commit	f3f62f0d835d559e80714bbeb05d03125574e3dd
tree	f92c9a828a38a6f4c681635ba08a4e32926434f3
parent	0ef3ca2ac62016c0c545de1c89dc2e3e130f4a99

metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725)

* ggml : disable fast-math for Metal (cmake build only)

ggml-ci

* metal : fix Metal API debug warnings

* cmake : add -fno-inline for Metal build (#4545)

* metal : fix API debug warnings

* metal : fix compile warnings

* metal : use uint64_t for strides

* cmake : rename option to LLAMA_METAL_SHADER_DEBUG

* metal : fix mat-vec Q8_0 kernel for BS > 1

* metal : normalize mat-vec kernel signatures

* cmake : respect LLAMA_QKK_64 option

* metal : fix mat-vec Q4_K kernel for QK_K == 64

* metal : optimizing ggml_mul_mat_id (wip)

* metal : minor fix

* metal : opt mul_mm_id

ggml-metal.m
ggml-metal.metal