cuda: add q8_0->f32 cpy operation (#9571)
author    Ivan <redacted>    Tue, 24 Sep 2024 00:14:24 +0000 (03:14 +0300)
committer GitHub <redacted>  Tue, 24 Sep 2024 00:14:24 +0000 (02:14 +0200)
commit 116efee0eef09d8c3c4c60b52fa01b56ddeb432c
tree   cb5f9f85e27749fdf6559580546f6f08bf991aa3
parent 0b3bf966f47bf2ba88e5d4e3ed429602008c7e63

llama: enable K-shift for quantized KV cache
The K-shift will fail on unsupported backends or quant types.
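For context: ggml stores q8_0 data in 32-element blocks, each holding one fp16 scale d and 32 int8 quants, and dequantization is simply dst[j] = d * qs[j]. Below is a minimal, self-contained sketch of what a q8_0->f32 dequantize-copy kernel can look like; the kernel name and launch geometry are illustrative assumptions, not the exact code added to ggml/src/ggml-cuda/cpy.cu.

    // Sketch of a q8_0 -> f32 dequantize-copy kernel (illustrative; not the
    // exact implementation in ggml/src/ggml-cuda/cpy.cu).
    #include <cuda_fp16.h>
    #include <stdint.h>

    #define QK8_0 32                 // values per q8_0 block (ggml layout)

    typedef struct {
        half   d;                    // per-block fp16 scale
        int8_t qs[QK8_0];            // quantized values
    } block_q8_0;

    // One thread dequantizes one block: dst[j] = d * qs[j].
    __global__ void cpy_q8_0_f32(const block_q8_0 * __restrict__ src,
                                 float * __restrict__ dst, int nblocks) {
        const int ib = blockIdx.x*blockDim.x + threadIdx.x;
        if (ib >= nblocks) {
            return;
        }
        const float d = __half2float(src[ib].d);
        #pragma unroll
        for (int j = 0; j < QK8_0; ++j) {
            dst[ib*QK8_0 + j] = d * (float) src[ib].qs[j];
        }
    }

A launch would cover nblocks = n/QK8_0 blocks, e.g. cpy_q8_0_f32<<<(nblocks + 255)/256, 256>>>(src, dst, nblocks). The real cpy kernels additionally handle non-contiguous tensors via per-dimension byte strides.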
Files changed:
ggml/src/ggml-cuda.cu
ggml/src/ggml-cuda/cpy.cu
src/llama.cpp
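The src/llama.cpp side is what makes this copy useful: RoPE cannot be applied directly to a quantized K cache, so the K-shift dequantizes the cache view to f32, rotates it, and copies the result back (the f32->q8_0 direction already existed). A hedged sketch of that pattern against the public ggml API follows; the helper name and parameters are assumptions for illustration, and the actual graph-building code in src/llama.cpp differs in detail.

    #include "ggml.h"

    // Hypothetical helper showing the K-shift pattern for a quantized cache.
    static struct ggml_tensor * build_k_shift_quantized(
            struct ggml_context * ctx,
            struct ggml_tensor  * k_view,  // quantized (e.g. q8_0) K-cache view
            struct ggml_tensor  * pos,     // I32 shift deltas, one per cell
            int                   n_rot) {
        // 1) dequantize to f32 -- this needs a q8_0->f32 cpy on the backend,
        //    which is exactly what this commit adds for CUDA
        struct ggml_tensor * k_f32 = ggml_cpy(ctx, k_view,
            ggml_new_tensor_3d(ctx, GGML_TYPE_F32,
                               k_view->ne[0], k_view->ne[1], k_view->ne[2]));
        // 2) apply the rotary shift in f32
        k_f32 = ggml_rope(ctx, k_f32, pos, n_rot, 0 /* default rope mode */);
        // 3) quantize back into the cache (f32->q8_0 cpy already existed)
        return ggml_cpy(ctx, k_f32, k_view);
    }

On backends or quant types without the needed dequantize copy, building this graph fails, which matches the commit message's caveat.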