CUDA: optimize MMQ int8 tensor core performance (#8062)
author    Johannes Gäßler <redacted>
          Mon, 24 Jun 2024 10:41:23 +0000 (12:41 +0200)
committer GitHub <redacted>
          Mon, 24 Jun 2024 10:41:23 +0000 (12:41 +0200)
commit    9a590c82262dd518137f85406e65e452fdf2aca3
tree      f722351d4e9c0435351723122df3f7f1d203ed1d
parent    52fc8705a0617452df08333e1161838726c322b4

* CUDA: optimize MMQ int8 tensor core performance

* only a single get_mma_tile_x_k function

* simplify code, make functions constexpr
ggml-cuda/common.cuh
ggml-cuda/mma.cuh
ggml-cuda/mmq.cuh