
author	slaren <redacted>
	Thu, 7 Dec 2023 08:51:46 +0000 (09:51 +0100)
committer	GitHub <redacted>
	Thu, 7 Dec 2023 08:51:46 +0000 (09:51 +0100)
commit	990f931f674ab8e9735ed2693faf20495be58604
tree	113cab939036805f4b2e94da3d2195ef8f930163
parent	3f669426fdf2fdc64c999c1768ef8b9fc0823c7b

test-backend-ops : add performance eval mode + improve CUDA repeat and binary broadcast ops performance (#636)

* ggml-cuda : implement repeat with bin_bcast

* ggml-cuda : change supports_op for mul_mat to match compute_forward

* test-backend-ops : add performance eval mode

* improve formatting

* add sd test cases

* fix test case

* ggml-cuda : bin_bcast: better block sizes, two elements per thread

* metal : add dim3 broadcast support for mul mat

* cleanup

* typo

* metal : enable mul mat-vec for dim2 > 1

* metal : mul mat-vec support dim3 broadcasts

ggml-ci

* ggml-cuda : fix bin_bcast for ne0=1
ggml-ci

* ggml-cuda : limit block size z dim to 64

* test-backend-ops : add test cases

* test-backend-ops : add warmup run, print test type before trying to compute

* ggml-cuda : bin_bcast: collapse dimensions when possible, add fallback kernel for large tensors
ggml-ci

* test-backend-ops : avoid division by zero

---------

Co-authored-by: Georgi Gerganov <redacted>

src/ggml-cuda.cu
src/ggml-metal.m
src/ggml-metal.metal
tests/test-backend-ops.cpp
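
The "bin_bcast: collapse dimensions when possible" entry in the commit message can be illustrated with a small host-side sketch. This is not the code from this commit. It assumes contiguous 4-D tensors and a conservative rule: the two innermost dimensions are merged only while the broadcast operand has the same extent as the destination in both of them. The names `Shape`, `Collapsed`, `merge_front`, and `collapse_dims` are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4-D shape type; index 0 is the innermost (fastest-varying) dimension.
using Shape = std::array<int64_t, 4>;

struct Collapsed {
    Shape dst;   // collapsed shape of the destination tensor
    Shape src1;  // collapsed shape of the broadcast operand
};

// Fold dimension 1 into dimension 0 and shift the remaining dimensions down.
static void merge_front(Shape & s) {
    s[0] *= s[1];
    s[1] = s[2];
    s[2] = s[3];
    s[3] = 1;
}

// Assumption: both tensors are contiguous, so dimensions 0 and 1 can be
// treated as one longer dimension whenever src1 is not broadcast along
// either of them (its extents match dst in both).
Collapsed collapse_dims(Shape dst, Shape src1) {
    for (int pass = 0; pass < 3; ++pass) {
        if (dst[0] != src1[0] || dst[1] != src1[1]) {
            break;  // src1 is broadcast here; stop collapsing
        }
        merge_front(dst);
        merge_front(src1);
    }
    return Collapsed{dst, src1};
}
```

With a destination of shape {64, 32, 4, 2} and an operand of shape {64, 32, 1, 1}, the sketch yields {2048, 4, 2, 1} and {2048, 1, 1, 1}, so a kernel would index one long row and broadcast over fewer dimensions.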