backend : offload large batches to GPU (#6083)
author    slaren <redacted>
Mon, 18 Mar 2024 10:03:04 +0000 (11:03 +0100)
committer GitHub <redacted>
Mon, 18 Mar 2024 10:03:04 +0000 (11:03 +0100)
commit    2bf8d0f7c4cc1235755ad06961ca761e458c5e55
tree      d2a462deb3c0e34cfb26eab6881a65bfb9fc3b28
parent    496bc79bc2b79bfd6124b8687a8dbd6a646e9b06
backend : offload large batches to GPU (#6083)

* backend : offload large batches to GPU
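
The motivation: prompt processing with a large batch is compute-bound, so uploading the weights to the GPU once and evaluating the whole batch there can pay off even when the model is otherwise kept on the CPU. A minimal sketch of the per-op decision, assuming the CUDA backend keys off the batch dimension of the op; the helper name and threshold here are illustrative, not the committed constants:

    #include <stdbool.h>
    #include "ggml.h"

    // sketch: offloading pays off only when the one-time weight upload is
    // amortized over many tokens, so gate on the batch dimension of the op
    static bool cuda_should_offload_op(const struct ggml_tensor * op) {
        const int min_batch_size = 32; // assumed cutoff for "large batch"
        // for matmul-style ops, ne[1] is the number of tokens in the batch
        return op->ne[1] >= min_batch_size;
    }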

* fix hip

* code cleanup

* fix CUDA split buffers

* Update ggml-backend-impl.h

Co-authored-by: Johannes Gäßler <redacted>
* cuda : fix memset without set_device
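
Context for the fix: CUDA runtime calls such as cudaMemset act on the currently selected device, so clearing a buffer that was allocated on a different GPU must be preceded by cudaSetDevice in multi-GPU builds. A self-contained sketch of the pattern; the macro and function names are illustrative:

    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    // minimal error-checking macro, analogous to the one used in ggml-cuda.cu
    #define CUDA_CHECK(call)                                            \
        do {                                                            \
            cudaError_t err_ = (call);                                  \
            if (err_ != cudaSuccess) {                                  \
                fprintf(stderr, "CUDA error %s at %s:%d\n",             \
                        cudaGetErrorString(err_), __FILE__, __LINE__);  \
                exit(1);                                                \
            }                                                           \
        } while (0)

    // sketch: select the device that owns the allocation before memset,
    // since cudaMemset operates on the currently active device
    static void clear_device_buffer(int device, void * ptr, size_t size) {
        CUDA_CHECK(cudaSetDevice(device));
        CUDA_CHECK(cudaMemset(ptr, 0, size));
    }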

* imatrix : remove sched affix from weight names
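
Background: the scheduler names the tensor copies it creates after the backend and split they belong to (something like "CUDA0#blk.0.ffn_up.weight#0"), so imatrix has to strip that decoration to recover the plain weight name. A sketch, with the decoration format assumed from the commit description:

    #include <stdio.h>
    #include <string.h>

    // sketch: recover "blk.0.ffn_up.weight" from a scheduler-decorated name
    // such as "CUDA0#blk.0.ffn_up.weight#0"
    static void strip_sched_affix(const char * name, char * out, size_t out_size) {
        if (out_size == 0) {
            return;
        }
        const char * p = strchr(name, '#');
        if (p == NULL) {
            snprintf(out, out_size, "%s", name); // no affix present
            return;
        }
        p += 1;                          // skip the "BACKEND#" prefix
        const char * q = strchr(p, '#'); // optional trailing "#<split index>"
        size_t n = q ? (size_t)(q - p) : strlen(p);
        if (n >= out_size) {
            n = out_size - 1;
        }
        memcpy(out, p, n);
        out[n] = '\0';
    }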

* sched : add a new split if the current one has too many inputs
reduce max inputs per split
more cleanup
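
For context: a split is a run of graph nodes executed on one backend, and every input coming from another backend costs a copy, so splits are capped at a small number of inputs; when the cap would be exceeded, the scheduler now closes the current split and opens a new one instead of overflowing a fixed-size input array. A sketch with hypothetical names for the bookkeeping:

    #include <stdbool.h>

    #define SCHED_MAX_SPLIT_INPUTS 4 // assumed cap, reduced by this commit

    // sketch of one split's bookkeeping inside the scheduler
    struct sched_split {
        int n_inputs; // tensors that must be copied in from other backends
        int i_start;  // first graph node in this split
        int i_end;    // one past the last graph node
    };

    // sketch: start a new split when adding a node would exceed the cap on
    // inputs, rather than failing once the input array is full
    static bool split_needs_new(const struct sched_split * split, int n_new_inputs) {
        return split->n_inputs + n_new_inputs > SCHED_MAX_SPLIT_INPUTS;
    }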

* update backends

ggml-ci

---------

Co-authored-by: Johannes Gäßler <redacted>
14 files changed:
examples/imatrix/imatrix.cpp
examples/llama-bench/llama-bench.cpp
ggml-alloc.c
ggml-backend-impl.h
ggml-backend.c
ggml-backend.h
ggml-cuda.cu
ggml-cuda.h
ggml-kompute.cpp
ggml-metal.m
ggml-sycl.cpp
ggml-vulkan.cpp
ggml.c
llama.cpp