git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Georgi Gerganov <redacted>
	Wed, 5 Apr 2023 19:07:33 +0000 (22:07 +0300)
committer	GitHub <redacted>
	Wed, 5 Apr 2023 19:07:33 +0000 (22:07 +0300)
commit	986b6ce9f99503c51ec5afd8a10baa32359434c6
tree	f4655b45b130b908729eb1407ca9e016c05f21a4
parent	34162989297fdfe3ab7305451ce55bc87e3f4c9c

ggml, llama : avoid heavy V transpose + improvements (#775)

ggml :

- added ggml_view_3d()
- ggml_view_tensor() now inherits the stride too
- reimplement ggml_cpy() to account for dst stride
- no longer require tensor->data to be memory aligned
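
The stride-related items above can be illustrated with a small standalone sketch. It is not ggml's code: the `tensor2d` struct and the `tensor_new`, `tensor_view`, `tensor_transpose` and `tensor_cpy` helpers are hypothetical names made up for this example, and strides are counted in elements rather than bytes to keep it short. It shows a view that inherits its parent's strides and a copy that honours the destination stride, so a contiguous source can be written directly into a transposed destination.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-D strided tensor (NOT ggml's struct).
 * Strides are in elements, not bytes, to keep the sketch small. */
typedef struct {
    float *data;
    size_t ne0, ne1;   /* elements per row, number of rows          */
    size_t nb0, nb1;   /* step between elements, step between rows  */
} tensor2d;

/* Contiguous row-major tensor over an existing buffer. */
static tensor2d tensor_new(float *buf, size_t ne0, size_t ne1) {
    tensor2d t = { buf, ne0, ne1, 1, ne0 };
    return t;
}

/* Sub-block view: only the data pointer and sizes change,
 * the parent's strides are inherited unchanged. */
static tensor2d tensor_view(tensor2d t, size_t off0, size_t off1,
                            size_t ne0, size_t ne1) {
    assert(off0 + ne0 <= t.ne0 && off1 + ne1 <= t.ne1);
    tensor2d v = { t.data + off0 * t.nb0 + off1 * t.nb1,
                   ne0, ne1, t.nb0, t.nb1 };
    return v;
}

/* Transposed view: swap sizes and strides, no data is moved. */
static tensor2d tensor_transpose(tensor2d t) {
    tensor2d v = { t.data, t.ne1, t.ne0, t.nb1, t.nb0 };
    return v;
}

/* Element-wise copy that respects BOTH the source and the
 * destination strides, so dst may be a non-contiguous view. */
static void tensor_cpy(tensor2d src, tensor2d dst) {
    assert(src.ne0 == dst.ne0 && src.ne1 == dst.ne1);
    for (size_t i1 = 0; i1 < src.ne1; i1++) {
        for (size_t i0 = 0; i0 < src.ne0; i0++) {
            dst.data[i0 * dst.nb0 + i1 * dst.nb1] =
                src.data[i0 * src.nb0 + i1 * src.nb1];
        }
    }
}
```

Copying a 2x3 source into `tensor_transpose(tensor_new(buf, 2, 3))` leaves `buf` holding the 3x2 transpose, without a separate transpose pass over the data.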

llama :

- compute RoPE on 32-bit tensors (should be more accurate)
- store RoPE-ed K in the KV cache
- store transposed V in the KV cache (significant speed-up)
- avoid unnecessary Q copy

ggml.c
ggml.h
llama.cpp

Packaging of ggml-org/llama.cpp