cuda : improve cuda pool efficiency using virtual memory (#4606)
author slaren <redacted>
Sun, 24 Dec 2023 13:34:22 +0000 (14:34 +0100)
committer GitHub <redacted>
Sun, 24 Dec 2023 13:34:22 +0000 (14:34 +0100)
commit 5bf3953d7e9831ea22b0bc017ce97409b801ccf1
tree   48c0136d9943fb9cca22209894464970549c24b5
parent 708e179e8562c2604240df95a2241dea17fd808b
cuda : improve cuda pool efficiency using virtual memory (#4606)

* cuda : improve cuda pool efficiency using virtual memory
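
In broad strokes, the new pool reserves a large virtual address range once and maps physical memory into it only as demand grows, so the pool can expand in place without reallocating and copying, and pointers handed out earlier stay valid. The sketch below illustrates the idea with the CUDA driver API; the names, the size cap, and the omitted error checking are illustrative, not the code from this commit.

    // Growable device pool built on CUDA virtual memory management (sketch).
    #include <cuda.h>

    static CUdeviceptr g_pool_addr = 0;  // start of the reserved VA range
    static size_t      g_pool_size = 0;  // bytes physically mapped so far

    // Illustrative cap on the reserved address space; reserving costs no RAM.
    static const size_t POOL_VMM_MAX_SIZE = 32ull << 30; // 32 GiB

    static void pool_grow(int device, size_t min_size, size_t granularity) {
        if (g_pool_addr == 0) {
            // Reserve the whole range up front; no physical memory is used yet.
            cuMemAddressReserve(&g_pool_addr, POOL_VMM_MAX_SIZE, 0, 0, 0);
        }

        // Physical allocations must be a multiple of the device granularity.
        size_t inc = ((min_size + granularity - 1) / granularity) * granularity;

        // Create a physical allocation and map it at the end of the range.
        CUmemAllocationProp prop = {};
        prop.type          = CU_MEM_ALLOCATION_TYPE_PINNED;
        prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
        prop.location.id   = device;

        CUmemGenericAllocationHandle handle;
        cuMemCreate(&handle, inc, &prop, 0);
        cuMemMap(g_pool_addr + g_pool_size, inc, 0, handle, 0);

        // Make the newly mapped region accessible from the device.
        CUmemAccessDesc access = {};
        access.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
        access.location.id   = device;
        access.flags         = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
        cuMemSetAccess(g_pool_addr + g_pool_size, inc, &access, 1);

        g_pool_size += inc;
    }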

* fix mixtral

* fix cmake build

* check for vmm support, disable for hip
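
Not every device and driver exposes the virtual memory management API, so support is detected at runtime and the pool falls back to the legacy cudaMalloc-based path otherwise; HIP builds take the fallback unconditionally. A minimal sketch of such a check, assuming the device ordinal is already known:

    #include <cuda.h>

    // Returns true if the device supports the CUDA virtual memory management API.
    static bool device_supports_vmm(int device_ordinal) {
        CUdevice dev;
        cuDeviceGet(&dev, device_ordinal);

        int supported = 0;
        cuDeviceGetAttribute(&supported,
            CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED, dev);
        return supported != 0;
    }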

ggml-ci

* fix hip build

* clarify granularity
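
The granularity here is the size multiple that cuMemCreate/cuMemMap allocations must obey; every mapped chunk is rounded up to it. A sketch of how it can be queried per device (the RECOMMENDED flag is one option; MINIMUM also exists):

    #include <cuda.h>

    static size_t device_alloc_granularity(int device_ordinal) {
        CUmemAllocationProp prop = {};
        prop.type          = CU_MEM_ALLOCATION_TYPE_PINNED;
        prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
        prop.location.id   = device_ordinal;

        size_t granularity = 0;
        cuMemGetAllocationGranularity(&granularity, &prop,
            CU_MEM_ALLOC_GRANULARITY_RECOMMENDED);
        return granularity;
    }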

* move all caps to g_device_caps

* refactor error checking

* add cuda_pool_alloc, refactor most pool allocations
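
cuda_pool_alloc is an RAII-style helper: the buffer returns to the pool when the object goes out of scope, so every early-return and error path frees correctly. A minimal sketch of the pattern, assuming pool entry points named ggml_cuda_pool_malloc/ggml_cuda_pool_free:

    #include <cstddef>

    // Pool entry points assumed by this sketch (live in ggml-cuda.cu).
    void * ggml_cuda_pool_malloc(size_t size, size_t * actual_size);
    void   ggml_cuda_pool_free(void * ptr, size_t size);

    template<typename T>
    struct cuda_pool_alloc {
        T *    ptr         = nullptr;
        size_t actual_size = 0; // the pool may hand back more than requested

        T * alloc(size_t count) {
            ptr = (T *) ggml_cuda_pool_malloc(count * sizeof(T), &actual_size);
            return ptr;
        }

        ~cuda_pool_alloc() {
            if (ptr != nullptr) {
                ggml_cuda_pool_free(ptr, actual_size);
            }
        }
    };

    // Usage: cuda_pool_alloc<float> tmp; float * buf = tmp.alloc(n);
    // buf is released back to the pool automatically when tmp is destroyed.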

ggml-ci

* fix hip build

* CUBLAS_TF32_TENSOR_OP_MATH is not a macro

* more hip crap

* llama : fix msvc warnings

* ggml : fix msvc warnings

* minor

* minor

* cuda : fallback to CPU on host buffer alloc fail
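
Pinned host buffers can fail to allocate, since the OS limits page-locked memory; rather than aborting, the allocation falls back to ordinary pageable memory. A sketch of the pattern; the wrapper name is illustrative:

    #include <cuda_runtime.h>
    #include <cstdio>
    #include <cstdlib>

    // Try pinned (page-locked) host memory first; fall back to plain malloc.
    static void * host_buffer_alloc(size_t size) {
        void * buf = nullptr;
        cudaError_t err = cudaMallocHost(&buf, size);
        if (err != cudaSuccess) {
            cudaGetLastError(); // clear the error so it does not poison later calls
            fprintf(stderr, "WARNING: failed to allocate %zu bytes of pinned memory, "
                            "falling back to CPU memory\n", size);
            buf = malloc(size); // unpinned: host<->device copies will be slower
        }
        return buf;
    }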

* Update ggml-cuda.cu

Co-authored-by: Johannes Gäßler <redacted>

* Update ggml-cuda.cu

Co-authored-by: Johannes Gäßler <redacted>
* ensure allocations are always aligned
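
Keeping every pool offset aligned avoids misaligned accesses when one buffer is carved out right after an oddly sized one. The usual rounding trick, with an illustrative 128-byte alignment:

    // Round a request up to a power-of-two alignment so pool offsets stay aligned.
    static size_t pool_aligned_size(size_t size) {
        const size_t align = 128; // illustrative; must be a power of two
        return (size + align - 1) & ~(align - 1);
    }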

* act_size -> actual_size

---------

Co-authored-by: Johannes Gäßler <redacted>
CMakeLists.txt
Makefile
ggml-backend.c
ggml-cuda.cu
ggml.c
ggml.h
llama.cpp
tests/test-grad0.cpp