git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
cuda : use CUDA memory pool with async memory allocation/deallocation when available...
author Oleksii Maryshchenko <redacted>
Thu, 2 Nov 2023 17:10:39 +0000 (18:10 +0100)
committer GitHub <redacted>
Thu, 2 Nov 2023 17:10:39 +0000 (19:10 +0200)
commit d6069051de7165a4e06662c89257f5d2905bb156
tree 5dd3a4c2bb9293bd82e87a91a59a17dc446568d3
parent 4ff1046d75e64f0e556d8dcd930ea25c23eb8b18
cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)

* Using CUDA memory pools for async alloc/dealloc.

* If the CUDA device doesn't support memory pools, then use the old implementation.

* Removed redundant cublasSetStream

---------

Co-authored-by: Oleksii Maryshchenko <redacted>
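
The approach described in the commit message (stream-ordered allocation from a CUDA memory pool, with a fallback when the device lacks pool support) can be sketched roughly as follows. This is a minimal illustration and not the code from ggml-cuda.cu: the helper names `pool_supported`, `device_alloc`, and `device_free` are hypothetical, the fallback here is a plain `cudaMalloc`/`cudaFree` pair standing in for whatever the "old implementation" does, and it assumes CUDA 11.2 or newer, where `cudaMallocAsync` and `cudaFreeAsync` are available.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// Hypothetical helper: ask the runtime whether the device supports memory pools.
static bool pool_supported(int device) {
    int supported = 0;
    cudaError_t err = cudaDeviceGetAttribute(
        &supported, cudaDevAttrMemoryPoolsSupported, device);
    return err == cudaSuccess && supported != 0;
}

// Hypothetical helper: stream-ordered allocation from the device's current
// pool when available, otherwise a plain synchronous cudaMalloc.
static cudaError_t device_alloc(void ** ptr, size_t size, bool use_pool, cudaStream_t stream) {
    if (use_pool) {
        return cudaMallocAsync(ptr, size, stream);
    }
    return cudaMalloc(ptr, size);
}

// Hypothetical helper: free with the same mechanism that allocated the buffer.
static cudaError_t device_free(void * ptr, bool use_pool, cudaStream_t stream) {
    if (use_pool) {
        return cudaFreeAsync(ptr, stream);
    }
    return cudaFree(ptr);
}

int main() {
    const int    device = 0;
    const size_t size   = 1 << 20; // 1 MiB scratch buffer

    CUDA_CHECK(cudaSetDevice(device));

    cudaStream_t stream;
    CUDA_CHECK(cudaStreamCreate(&stream));

    const bool use_pool = pool_supported(device);
    printf("memory pools supported: %s\n", use_pool ? "yes" : "no");

    void * buf = nullptr;
    CUDA_CHECK(device_alloc(&buf, size, use_pool, stream));

    // Work queued on the same stream is ordered after the allocation
    // and before the free, so no extra synchronization is needed here.
    CUDA_CHECK(cudaMemsetAsync(buf, 0, size, stream));

    CUDA_CHECK(device_free(buf, use_pool, stream));
    CUDA_CHECK(cudaStreamSynchronize(stream));
    CUDA_CHECK(cudaStreamDestroy(stream));
    return 0;
}
```

The support check is done once and the result passed to both helpers, so a buffer is always released by the same mechanism that allocated it. On the pool path the free is only enqueued on the stream, which is why the sketch synchronizes the stream before destroying it.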
ggml-cuda.cu