git.djapps.eu Git - pkg/ggml/sources/ggml/commit
cuda : improve cuda pool efficiency using virtual memory (llama/4606)
author slaren <redacted>
Sun, 24 Dec 2023 13:34:22 +0000 (14:34 +0100)
committer Georgi Gerganov <redacted>
Wed, 27 Dec 2023 09:40:50 +0000 (11:40 +0200)
commit 0a476f7b13d981403523a10ef5746877f3d962dd
tree f30e1f92fe23dc9c3ecd5c6752a3fc75dc0540cd
parent fa13de7541216b824ed0b12f8edf9a7717a503a4

* cuda : improve cuda pool efficiency using virtual memory

* fix mixtral

* fix cmake build

* check for vmm support, disable for hip

ggml-ci

* fix hip build

* clarify granularity

* move all caps to g_device_caps

* refactor error checking

* add cuda_pool_alloc, refactor most pool allocations

ggml-ci

* fix hip build

* CUBLAS_TF32_TENSOR_OP_MATH is not a macro

* more hip crap

* llama : fix msvc warnings

* ggml : fix msvc warnings

* minor

* minor

* cuda : fallback to CPU on host buffer alloc fail

* Update ggml-cuda.cu

Co-authored-by: Johannes Gäßler <redacted>

* Update ggml-cuda.cu

Co-authored-by: Johannes Gäßler <redacted>

* ensure allocations are always aligned

* act_size -> actual_size

---------

Co-authored-by: Johannes Gäßler <redacted>
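
The items "add cuda_pool_alloc", "ensure allocations are always aligned" and "act_size -> actual_size" can be illustrated with a rough host-only sketch. This is not the code from this commit: `stack_pool`, `pool_alloc`, the 128-byte alignment and the 1 MiB reserved range are all hypothetical, and a plain byte counter stands in for the reserved virtual address range that a real virtual-memory pool would map physical memory into on demand. It also assumes a stack-like pool, where blocks are released in reverse order of allocation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-only stand-in for a device memory pool (not ggml code).
struct stack_pool {
    static constexpr std::size_t alignment = 128;      // assumed alignment
    static constexpr std::size_t reserved  = 1u << 20; // assumed reserved range

    std::size_t used = 0; // bytes currently handed out

    // Round a request up so that every returned offset stays aligned.
    static std::size_t align_up(std::size_t n) {
        return (n + alignment - 1) / alignment * alignment;
    }

    // Returns the offset of the new block and reports its padded size.
    std::size_t alloc(std::size_t size, std::size_t * actual_size) {
        *actual_size = align_up(size);
        assert(used + *actual_size <= reserved);
        std::size_t offset = used;
        used += *actual_size;
        return offset;
    }

    // Blocks must be released in reverse order of allocation.
    void release(std::size_t offset, std::size_t actual_size) {
        used -= actual_size;
        assert(used == offset);
    }
};

// RAII handle: gives the block back to the pool when it leaves scope.
struct pool_alloc {
    stack_pool & pool;
    std::size_t offset      = 0;
    std::size_t actual_size = 0;

    pool_alloc(stack_pool & p, std::size_t size) : pool(p) {
        offset = pool.alloc(size, &actual_size);
    }
    ~pool_alloc() { pool.release(offset, actual_size); }

    pool_alloc(const pool_alloc &) = delete;
    pool_alloc & operator=(const pool_alloc &) = delete;
};
```

Because C++ destroys locals in reverse order of construction, scoped `pool_alloc` handles release their blocks in the order a stack-like pool expects, which is one plausible reason to prefer an RAII wrapper over manual malloc/free pairs.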
include/ggml/ggml.h
src/ggml-backend.c
src/ggml-cuda.cu
src/ggml.c
tests/test-grad0.cpp