git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	slaren <redacted>
	Mon, 1 May 2023 16:11:07 +0000 (18:11 +0200)
committer	GitHub <redacted>
	Mon, 1 May 2023 16:11:07 +0000 (18:11 +0200)
commit	58b367c2d757c0ea12aec672382462b42204c724
tree	b2fa89daf71c08788c44e3fb9abf1747ec8ee65d
parent	ea3a0ad6b6b5ca4693b94acd4cb32e2803f66fae

cuBLAS: refactor and optimize f16 mat mul performance (#1259)

* cuBLAS: refactor, convert fp16 to fp32 on device

* cuBLAS: use multiple streams, choose smartly between mul_mat_q and mul_mat_f16

* fix build

* cuBLAS: update block_q5_1

ggml-cuda.cu
ggml-cuda.h
ggml.c
ggml.h

Packaging of ggml-org/llama.cpp
