git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit


author	Johannes Gäßler <redacted>
	Sat, 29 Jul 2023 21:04:44 +0000 (23:04 +0200)
committer	GitHub <redacted>
	Sat, 29 Jul 2023 21:04:44 +0000 (23:04 +0200)
commit	11f3ca06b8c66b0427aab0a472479da22553b472
tree	8e934ff0d93a78447d996b00561f7ff826c3533f
parent	9baf9ef304f330009d5a93b7390280a0fd27c9a1

CUDA: Quantized matrix matrix multiplication (#2160)

* mmq implementation for non k-quants

* q6_K

* q2_K

* q3_k

* q4_K

* vdr

* q5_K

* faster q8_1 loading

* loop unrolling

* add __restrict__

* q2_K sc_high

* GGML_CUDA_MMQ_Y

* Updated Makefile

* Update Makefile

* DMMV_F16 -> F16

* Updated README, CMakeLists

* Fix CMakeLists.txt

* Fix CMakeLists.txt

* Fix multi GPU out-of-bounds

CMakeLists.txt
Makefile
README.md
ggml-cuda.cu

Packaging of ggml-org/llama.cpp
