git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
cuda : optimize argmax (llama/10441)
author Diego Devesa <redacted>
Thu, 21 Nov 2024 17:18:50 +0000 (18:18 +0100)
committer Georgi Gerganov <redacted>
Sun, 8 Dec 2024 18:14:35 +0000 (20:14 +0200)
commit 2a4b5c9d7eb06f7b84ab555e4b14c812a68eacdc
tree a8066e62bdfa6fe15c645f47ee5c65da6dd5cf46
parent 04662748aadc589be83a8dfdaedce2bc4d91fae0
cuda : optimize argmax (llama/10441)

* cuda : optimize argmax

* remove unused parameter

ggml-ci

* fixup : use full warps

ggml-ci

* Apply suggestions from code review

Co-authored-by: Johannes Gäßler <redacted>

* fix ub

* ggml : check ne00 <= INT32_MAX in argmax and argsort

---------

Co-authored-by: Johannes Gäßler <redacted>
ggml/src/ggml-cuda/argmax.cu
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/quantize.cu
ggml/src/ggml.c
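
The `ne00 <= INT32_MAX` check named in the commit message can be illustrated with a small CPU-side sketch. It assumes the argmax result is stored as one 32-bit integer index per row, so the row length has to fit in an `int32_t`. The function name `argmax_rows_i32` and its signature are hypothetical and are not taken from the files changed by this commit.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not ggml's API): argmax of each row of a row-major
// nrows x ncols float matrix, writing one int32 index per row.
static void argmax_rows_i32(const float * src, int64_t nrows, int64_t ncols, int32_t * dst) {
    // Indices are stored as int32, so the row length must be representable.
    // This mirrors the idea behind the ne00 <= INT32_MAX check.
    assert(ncols > 0 && ncols <= INT32_MAX);

    for (int64_t r = 0; r < nrows; r++) {
        const float * row = src + r * ncols;
        int32_t best = 0;
        for (int64_t c = 1; c < ncols; c++) {
            // Strict comparison keeps the first index when values tie.
            if (row[c] > row[best]) {
                best = (int32_t) c;
            }
        }
        dst[r] = best;
    }
}
```

Without such a check, a row longer than `INT32_MAX` elements could produce an index that does not fit in the output type.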