git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	slaren <redacted>
	Sat, 29 Apr 2023 00:04:18 +0000 (02:04 +0200)
committer	GitHub <redacted>
	Sat, 29 Apr 2023 00:04:18 +0000 (02:04 +0200)
commit	7fc50c051ae8a78e9643fdf172d12e20f2dd9b6c
tree	cc017db2f3443a39221ad319ab51df0925012e84
parent	b1ee8f59b4101b46999a0995d9a34506f7285466

cuBLAS: use host pinned memory and dequantize while copying (#1207)

* cuBLAS: dequantize simultaneously while copying memory

* cuBLAS: use host pinned memory

* cuBLAS: improve ggml_compute_forward_mul_mat_f16_f32 with pinned memory

* cuBLAS: also pin kv cache

* fix rebase

Makefile
ggml-cuda.cu
ggml-cuda.h
ggml.c
llama.cpp
llama_util.h

Packaging of ggml-org/llama.cpp
