git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit


author	Diego Devesa <redacted>
	Thu, 21 Nov 2024 17:18:50 +0000 (18:18 +0100)
committer	GitHub <redacted>
	Thu, 21 Nov 2024 17:18:50 +0000 (18:18 +0100)
commit	a5e47592b6171ae21f3eaa1aba6fb2b707875063
tree	e31c54c85493203c8b8a9ccd7bd41ac0aa8c3dda
parent	1bb30bf28cb5a7adf111bc41c935bdaf128397e7

cuda : optimize argmax (#10441)

* cuda : optimize argmax

* remove unused parameter

ggml-ci

* fixup : use full warps

ggml-ci

* Apply suggestions from code review

Co-authored-by: Johannes Gäßler <redacted>

* fix ub

* ggml : check ne00 <= INT32_MAX in argmax and argsort

---------

Co-authored-by: Johannes Gäßler <redacted>
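
The ne00 <= INT32_MAX check mentioned above can be illustrated with a minimal CPU-side sketch. It assumes the guard exists because argmax returns row indices as 32-bit integers, so a row longer than INT32_MAX could not be indexed safely. The helper name row_argmax_i32 is hypothetical and is not taken from ggml or from this commit's diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of ggml): returns the index of the largest
// element in a row of ne00 floats, stored as a 32-bit integer.
static int32_t row_argmax_i32(const float * row, int64_t ne00) {
    // The result is an int32, so every valid index must fit in that type.
    assert(ne00 > 0 && ne00 <= INT32_MAX);

    int32_t best_idx = 0;
    float   best_val = row[0];
    for (int64_t i = 1; i < ne00; ++i) {
        // Strict '>' keeps the first occurrence when values tie.
        if (row[i] > best_val) {
            best_val = row[i];
            best_idx = (int32_t) i;
        }
    }
    return best_idx;
}
```

Without such a guard, the cast to int32_t would silently truncate indices on oversized rows, which is why failing early is the safer choice in this sketch.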

Packaging of ggml-org/llama.cpp

ggml/src/ggml-cuda/argmax.cu
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/quantize.cu
ggml/src/ggml.c
tests/test-backend-ops.cpp