git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
cuda : performance optimizations (#1530)
author Johannes Gäßler <redacted>
Thu, 25 May 2023 21:07:29 +0000 (23:07 +0200)
committer GitHub <redacted>
Thu, 25 May 2023 21:07:29 +0000 (00:07 +0300)
commit 1fcdcc28b119a6608774d52de905931bd5f8a43d
tree a28504b1f2b0ed7d4b550316c37a9b7e25de889c
parent ac7876ac20124a15a44fd6317721ff1aa2538806

* xor hack

* block y dim

* loop unrolling

* Fixed cmake LLAMA_CUDA_BY option

* Removed hipblas compatibility code

* Define GGML_CUDA_DMMV_BLOCK_Y if not defined

* Fewer iters, more ops per iter

* Renamed DMMV X/Y compilation options
CMakeLists.txt
Makefile
ggml-cuda.cu