git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit


author	0cc4m <redacted>
	Sun, 4 Jun 2023 06:12:05 +0000 (08:12 +0200)
committer	GitHub <redacted>
	Sun, 4 Jun 2023 06:12:05 +0000 (08:12 +0200)
commit	dcb2ed48268e421baf25adc00d602dad0f415564
tree	261ef84fe660d06fce90c58fc01a16ae0e69be52
parent	d8bd0013e8768aaa3dc9cfc1ff01499419d5348e

OpenCL: Fix duplication of layers in VRAM and RAM, add GPU mul kernel (#1653)

* Use events instead of clFinish, where possible

* OpenCL: Don't load gpu layers into RAM, add mul_f32 kernel

* Reduce queueing overhead for contiguous tensors by using single mul kernel call

* Adapt to #1612 cl_mem malloc changes

* Reduce code duplication between cuda and opencl branches

* Improve implementation
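
For orientation, an element-wise f32 multiply of the kind the mul_f32 bullet refers to can be sketched on the CPU as below. This is a hypothetical reference, not the kernel from this commit: the name mul_f32_reference and the assumption that the second operand repeats cyclically when it is shorter than the first are illustrative only. Handling a contiguous tensor in one flat pass mirrors the idea of a single kernel call instead of one call per row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the OpenCL kernel from the commit.
// Multiplies x element-wise by y; if y is shorter than x, y is reused
// cyclically (an assumed broadcast rule, chosen for illustration).
std::vector<float> mul_f32_reference(const std::vector<float>& x,
                                     const std::vector<float>& y) {
    assert(!y.empty());  // the modulo below requires a non-empty y
    std::vector<float> dst(x.size());
    // One flat loop over the whole contiguous buffer.
    for (std::size_t i = 0; i < x.size(); ++i) {
        dst[i] = x[i] * y[i % y.size()];
    }
    return dst;
}
```

Calling mul_f32_reference with x of length 4 and y of length 2 applies y[0] to the even indices of x and y[1] to the odd ones.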

ggml-opencl.cpp
ggml-opencl.h
ggml.c
llama.cpp

Packaging of ggml-org/llama.cpp
