git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
gguf-py, convert-hf : model conversion support for T5 and FLAN-T5 model variants...
author fairydreaming <redacted>
Mon, 24 Jun 2024 05:06:05 +0000 (07:06 +0200)
committer GitHub <redacted>
Mon, 24 Jun 2024 05:06:05 +0000 (07:06 +0200)
commit de0d6a68ac99f307fe889c48e21124bc3b7ca29a
tree f8e2d764556465bbb0a27a5f95e3cc2853d4d9b7
parent 95f57bb5d5b18ef0beb2702a0d6c06e46804075c
gguf-py, convert-hf : model conversion support for T5 and FLAN-T5 model variants (#5763)

* gguf-py : add T5 model architecture

* gguf-py : add separate tensors for encoder and decoder

* gguf-py : add new model header parameters: decoder_start_token_id, attention.relative_buckets_count, tokenizer.ggml.remove_extra_whitespaces, tokenizer.ggml.precompiled_charsmap

* convert-hf : add model conversion support for T5ForConditionalGeneration and T5WithLMHeadModel
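
The new header parameters listed above can be pictured as plain key/value metadata. The following is a minimal sketch, not the code from this commit: the helper `t5_header_fields` is hypothetical, the `{arch}.` prefix on the first two keys is an assumption (the message lists them without a prefix), and the example values are illustrative only.

```python
# Hypothetical helper (not part of gguf-py): collects the four header
# parameters named in the commit message into a plain dict.
def t5_header_fields(arch, decoder_start_token_id, relative_buckets_count,
                     remove_extra_whitespaces, precompiled_charsmap):
    return {
        # Assumption: model-level keys are namespaced by architecture name.
        f"{arch}.decoder_start_token_id": int(decoder_start_token_id),
        f"{arch}.attention.relative_buckets_count": int(relative_buckets_count),
        # Tokenizer keys are given with their full names in the commit message.
        "tokenizer.ggml.remove_extra_whitespaces": bool(remove_extra_whitespaces),
        "tokenizer.ggml.precompiled_charsmap": bytes(precompiled_charsmap),
    }


if __name__ == "__main__":
    # Illustrative values only; real values come from the model's config
    # and tokenizer files.
    fields = t5_header_fields("t5", 0, 32, True, b"")
    for key, value in fields.items():
        print(key, "=", value)
```

In a real conversion these values would be passed to the GGUF writer rather than kept in a dict; the sketch only shows which keys the commit introduces and what kind of value each one holds.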

---------

Co-authored-by: Stanisław Szymczyk <redacted>
convert-hf-to-gguf.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
gguf-py/gguf/tensor_mapping.py
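
The change to tensor_mapping.py reflects the "separate tensors for encoder and decoder" item in the message. As a rough sketch of the idea only: the function `split_stack_prefix` is hypothetical, and the `enc`/`dec` prefixes and `blk` naming are assumptions about the target naming scheme, not taken from this page.

```python
import re

# Matches Hugging Face T5 tensor names such as
# "encoder.block.3.layer.0.SelfAttention.q.weight".
_BLOCK_RE = re.compile(r"^(encoder|decoder)\.block\.(\d+)\.(.+)$")


def split_stack_prefix(hf_name):
    """Return (per-stack block prefix, remaining name), or None if the
    name does not belong to an encoder or decoder block.

    Assumption: encoder tensors get an "enc" prefix and decoder tensors a
    "dec" prefix, so the two stacks never share a tensor name.
    """
    m = _BLOCK_RE.match(hf_name)
    if m is None:
        return None
    stack, block_id, rest = m.groups()
    prefix = "enc" if stack == "encoder" else "dec"
    return f"{prefix}.blk.{block_id}", rest


if __name__ == "__main__":
    print(split_stack_prefix("encoder.block.3.layer.0.SelfAttention.q.weight"))
    print(split_stack_prefix("decoder.block.0.layer.1.EncDecAttention.k.weight"))
    print(split_stack_prefix("shared.weight"))
```

Keeping the stack name in the prefix is what lets an encoder-decoder model store both sets of attention weights for the same block index without collisions.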