ggml-cuda : use graph allocator (#2684)
author slaren <redacted>
Tue, 22 Aug 2023 13:25:19 +0000 (15:25 +0200)
committer GitHub <redacted>
Tue, 22 Aug 2023 13:25:19 +0000 (15:25 +0200)
commit 1123f7fbdfb8012e46f05e903e6f675922916378
tree 27f3700a672e8f0d09d86797ce1c199ff72a4d51
parent ef3f333d3775600d1646a9fa249aca532d15fb89

use a different function for no_alloc to avoid breaking backwards compatibility; fixes LoRA

remove the 512 n_batch limit

fix 2048 batch size

cleanup

Co-authored-by: Johannes Gäßler <redacted>
common/common.cpp
ggml-cuda.cu
ggml-cuda.h
llama.cpp