git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	slaren <redacted>
	Thu, 21 Dec 2023 20:07:46 +0000 (21:07 +0100)
committer	GitHub <redacted>
	Thu, 21 Dec 2023 20:07:46 +0000 (21:07 +0100)
commit	d232aca5a73b290e218a2e48b91023d5e994203f
tree	e763648880fad8ef44be54c9cb59c9c7dbda4168
parent	31f27758faf4a4bd08101a57c7ec3a473f771f86

llama : initial ggml-backend integration (#4520)

* llama : initial ggml-backend integration

* add ggml-metal

* cuda backend can be used through ggml-backend with LLAMA_GGML_BACKEND_CUDA_TEST
access all tensor data with ggml_backend_tensor_get/set

* add ggml_backend_buffer_clear
zero-init KV cache buffer

* add ggml_backend_buffer_is_host, used to avoid copies if possible when accessing tensor data

* disable gpu backends with ngl 0

* more accurate mlock

* unmap offloaded part of the model

* use posix_fadvise64(.., POSIX_FADV_SEQUENTIAL) to improve performance with mmap

* update quantize and lora

* update session copy/set to use ggml-backend

ggml-ci

* use posix_fadvise instead of posix_fadvise64

* ggml_backend_alloc_ctx_tensors_from_buft : remove old print

* llama_mmap::align_offset : use pointers instead of references for out parameters

* restore progress_callback behavior

* move final progress_callback call to load_all_data

* cuda : fix fprintf format string (minor)

* do not offload scales

* llama_mmap : avoid unmapping the same fragments again in the destructor

* remove unnecessary unmap

* metal : add default log function that prints to stderr, cleanup code

ggml-ci

---------

Co-authored-by: Georgi Gerganov <redacted>
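
Note on the posix_fadvise item above: the snippet below is a minimal sketch of how a sequential-read hint can be attached to a file descriptor before the file is mapped or read. It is not the code from this commit. The helper names advise_sequential and open_for_sequential_read are hypothetical, and the sketch assumes a Linux-like system where posix_fadvise is available (it is not provided on every platform).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* request the posix_fadvise declaration */
#endif

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: tell the kernel that fd will be read sequentially.
 * offset = 0 and len = 0 apply the advice to the whole file.
 * posix_fadvise returns 0 on success or an error number on failure
 * (it does not set errno), and that value is passed through unchanged. */
static int advise_sequential(int fd) {
    return posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}

/* Hypothetical helper: open a file read-only and apply the hint.
 * The hint is only advisory, so a failure is reported as a warning
 * and the descriptor is still returned. Returns -1 if open() fails. */
static int open_for_sequential_read(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    int err = advise_sequential(fd);
    if (err != 0) {
        fprintf(stderr, "warning: posix_fadvise failed: %s\n", strerror(err));
    }
    return fd;
}
```

Because the call is purely a hint, treating a failure as non-fatal keeps loading functional on filesystems or descriptors that do not support it.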

Makefile
ggml-alloc.c
ggml-backend-impl.h
ggml-backend.c
ggml-backend.h
ggml-cuda.cu
ggml-metal.h
ggml-metal.m
ggml.c
ggml.h
llama.cpp