git.djapps.eu Git - pkg/ggml/sources/ggml/commit

author	Yavor Ivanov <redacted>
	Thu, 12 Oct 2023 14:08:09 +0000 (17:08 +0300)
committer	GitHub <redacted>
	Thu, 12 Oct 2023 14:08:09 +0000 (17:08 +0300)
commit	8e8283253685b14114eb1c37231f378ca5e8c4cd
tree	031c258a5a778b643ed0aeb01ae6e88c17df1562
parent	159bdae6ed20e1d408238fc5850f765134ee7935

gpt-2 : add batched decoding example (#572)

* Initial attempt to make gpt2 do parallel decoding

* Fix crash on trying to use empty embd

* Make it work for n_parallel=1

* Add short way of passing n_parallel argument

* Move gpt-2 batched to a separate target and cpp file

* Add batched sample output to README and remove hardcoded model path and prompt

* gpt-2-batched : fix n_kv heuristic

* Free batch at end of example

* gpt-2-batched : simplify kv cache stuff (#574)

ggml-ci

* Fix not generating n_predict tokens and fix warn

* minor : readme

* Add check for end token and mark the stream as finished

---------

Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: YavorGIvanov <redacted>

examples/common.cpp
examples/common.h
examples/gpt-2/CMakeLists.txt
examples/gpt-2/README.md
examples/gpt-2/main-batched.cpp	[new file with mode: 0644]