git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml : parallelize FP32 conversion when using BLAS (#5045)
author Reinforce-II <redacted>
Mon, 22 Jan 2024 13:15:08 +0000 (21:15 +0800)
committer GitHub <redacted>
Mon, 22 Jan 2024 13:15:08 +0000 (15:15 +0200)
commit 780e24a22eb595b705cbe8284771e9ceff1c4dd2
tree 0c11cc73dc7bf38e3978428eebf6ca224c551d56
parent 3ce7e8f8e7ccfce07e5947ac5f1f3f4628cf68ea
ggml : parallelize FP32 conversion when using BLAS (#5045)

* allow the GGML_TASK_INIT phase to run in multiple threads

* multithreaded dequantization in mul_mat when using a BLAS library

* minor fixes

* update outdated comment

* fix coding style

* simplify code

---------

Co-authored-by: Georgi Gerganov <redacted>
ggml.c
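
The idea named in the commit message (converting the source matrix to FP32 across several threads before the BLAS call) can be illustrated with a minimal sketch. This is not the code from ggml.c. The int8-with-per-row-scale format and the names `convert_job`, `convert_rows`, `convert_to_f32_parallel` and `MAX_THREADS` are hypothetical, and the interleaved row split is an assumption chosen for simplicity.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 16  /* hypothetical upper bound for this sketch */

/* Hypothetical quantized layout: int8 values with one float scale per row. */
typedef struct {
    const int8_t *q;      /* n_rows * n_cols quantized values, row-major */
    const float  *scale;  /* one scale factor per row */
    float        *out;    /* n_rows * n_cols FP32 output, row-major */
    int n_rows, n_cols;
    int ith, nth;         /* index of this thread and total thread count */
} convert_job;

/* Worker: thread ith converts rows ith, ith + nth, ith + 2*nth, ... */
static void *convert_rows(void *arg) {
    const convert_job *job = arg;
    for (int r = job->ith; r < job->n_rows; r += job->nth) {
        const int8_t *src = job->q   + (size_t) r * job->n_cols;
        float        *dst = job->out + (size_t) r * job->n_cols;
        for (int c = 0; c < job->n_cols; c++) {
            dst[c] = job->scale[r] * (float) src[c];
        }
    }
    return NULL;
}

/* Converts the whole matrix to FP32 using n_threads workers.
 * Returns 0 on success, -1 if a thread could not be started
 * (in that case the output is only partially filled). */
static int convert_to_f32_parallel(const int8_t *q, const float *scale,
                                   float *out, int n_rows, int n_cols,
                                   int n_threads) {
    assert(n_threads > 0 && n_threads <= MAX_THREADS);

    pthread_t   tid[MAX_THREADS];
    convert_job jobs[MAX_THREADS];
    int started = 0;
    int rc = 0;

    for (int i = 0; i < n_threads; i++) {
        jobs[i] = (convert_job){ q, scale, out, n_rows, n_cols, i, n_threads };
        if (pthread_create(&tid[i], NULL, convert_rows, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* Wait for every worker that was started. Only after this point is the
     * FP32 buffer complete and safe to hand to a BLAS routine. */
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
    }
    return rc;
}
```

Each row is written by exactly one thread, so the workers need no locking. The only synchronization point is the join before the FP32 buffer is consumed.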