git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Kerfuffle <redacted>
	Wed, 30 Aug 2023 08:25:50 +0000 (02:25 -0600)
committer	GitHub <redacted>
	Wed, 30 Aug 2023 08:25:50 +0000 (11:25 +0300)
commit	dc07dc492ef9640bbb82904d7c7679f7bdcf6d76
tree	f9d80bc6ee29067e8e72521d75dfa2b92d85540e
parent	ad9ddcff6ef322db5cf13785bd7c856b610d242e

convert : various script cleanups/fixes + merges and special token handling (#2842)

* convert: Fix permute calls and method/func definitions

* Cleanups for gguf-py

* Minor types cleanups.

* Initial implementation of handling merges and special tokens

* convert: Handle special tokens and merges in vocab only mode

convert: Vocab only mode no longer requires loading model tensors

* gguf: Refactor tensor name mapping

* convert: Fix type hint for special_token_types in SpecialVocab

* Use common special vocab handling in various conversion scripts

* First pass at implementing suggested changes

* Second pass

* gguf: SpecialVocab: Fix issue with special token content not in a dict

gguf: SpecialVocab: Allow skipping handling of merges

* convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json

* convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer

* gguf: SpecialVocab: Actually set load_merges in object

* Uniform args parsing and vocab only mode for convert examples

* convert.py: Set gpt2 as tokenizer model when using BPE

* Squish last type warning in gguf.py - yay!
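
The special token and merges handling described in the message above can be illustrated with a minimal sketch. It is not the code from this commit: the function names `find_special_token_ids` and `read_merges` are hypothetical, and the JSON layout (an `added_tokens` list and `model.merges` in `tokenizer.json`, `<type>_token` entries in `tokenizer_config.json`) is an assumption based on common Hugging Face tokenizer files. It shows a special token entry being accepted either as a plain string or as a dict with a `content` key, and merges loading being skippable.

```python
from typing import Any, Dict, List, Sequence


def find_special_token_ids(
    tokenizer: Dict[str, Any],
    tokenizer_config: Dict[str, Any],
    token_types: Sequence[str] = ("bos", "eos", "unk", "sep", "pad"),
) -> Dict[str, int]:
    """Map special token types to ids (hypothetical helper, not the gguf API)."""
    # Build a content -> id lookup from the added tokens, skipping odd entries.
    content_to_id: Dict[str, int] = {}
    for entry in tokenizer.get("added_tokens", []):
        if not isinstance(entry, dict):
            continue
        content, token_id = entry.get("content"), entry.get("id")
        if isinstance(content, str) and isinstance(token_id, int):
            content_to_id[content] = token_id

    ids: Dict[str, int] = {}
    for typ in token_types:
        entry = tokenizer_config.get(f"{typ}_token")
        # The entry may be a plain string or a dict carrying a "content" key.
        if isinstance(entry, str):
            content = entry
        elif isinstance(entry, dict):
            content = entry.get("content")
        else:
            continue
        if isinstance(content, str) and content in content_to_id:
            ids[typ] = content_to_id[content]
    return ids


def read_merges(tokenizer: Dict[str, Any], load_merges: bool = True) -> List[str]:
    """Return BPE merges as strings, or an empty list when skipped or absent."""
    if not load_merges:
        return []
    model = tokenizer.get("model")
    merges = model.get("merges") if isinstance(model, dict) else None
    if isinstance(merges, list) and all(isinstance(m, str) for m in merges):
        return merges
    return []


if __name__ == "__main__":
    # Tiny in-memory stand-ins for tokenizer.json and tokenizer_config.json.
    tok = {
        "added_tokens": [{"id": 1, "content": "<s>"}, {"id": 2, "content": "</s>"}],
        "model": {"merges": ["a b", "ab c"]},
    }
    cfg = {"bos_token": "<s>", "eos_token": {"content": "</s>"}}
    print(find_special_token_ids(tok, cfg))
    print(read_merges(tok))
```

Passing `load_merges=False` in this sketch mirrors the idea of skipping merges for non-BPE tokenizers; how the actual scripts decide this is not shown here.
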

convert-falcon-hf-to-gguf.py
convert-gptneox-hf-to-gguf.py
convert-llama-7b-pth-to-gguf.py
convert-llama-ggmlv3-to-gguf.py
convert-llama-hf-to-gguf.py
convert-lora-to-ggml.py
convert.py
gguf-py/gguf/gguf.py
gguf-py/gguf/py.typed	[new file with mode: 0644]
gguf-py/pyproject.toml