git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Option to split during conversion (#6942)
author Christian Zhou-Zheng <redacted>
Mon, 24 Jun 2024 09:42:03 +0000 (05:42 -0400)
committer GitHub <redacted>
Mon, 24 Jun 2024 09:42:03 +0000 (19:42 +1000)
commit 52fc8705a0617452df08333e1161838726c322b4
tree f0c9059e40b347b820e8653c2dfad425755a651c
parent 8cb508d0d5c024e12692370d85237b45469a004b

* support splits in convert.py

* Support split by size and dry run to write estimated shards/filesizes

* Move split functionality to new GGUFManager class

* fix improper function signature

* tentative push of convert-hf-to-gguf support

* resolve merge + SplitArguments for easier parsing

* Fix eager tensor memory leak and remove convert.py changes

Removed a memory leak caused by unexpected reference retention to eager tensors.

Also removed GGUFManager functionality in convert.py in favor of specializing for convert-hf-to-gguf.py.

* refactor SplitStrategy to be a deque

Instead of having SplitStrategy have a `data` field that is a deque, just have SplitStrategy be a subclass of deque itself.
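
The difference between the two designs can be sketched as follows. This is a minimal illustration, not the code from this commit: the class names `ShardPlanWithField` and `ShardPlan`, the `total_bytes` helper, and the `(name, n_bytes)` entries are all hypothetical.

```python
from collections import deque


class ShardPlanWithField:
    """Composition: the deque lives in a `data` attribute (the older shape)."""

    def __init__(self, items=()):
        self.data = deque(items)


class ShardPlan(deque):
    """Inheritance: the object itself is the deque (the refactored shape)."""

    def total_bytes(self) -> int:
        # Each entry is a hypothetical (tensor_name, n_bytes) pair.
        return sum(n_bytes for _, n_bytes in self)


plan = ShardPlan([("tok_embd", 4096), ("output", 8192)])
plan.append(("norm", 16))   # deque methods are available directly
first = plan.popleft()      # no `.data` indirection needed
```

With the subclass, callers use `append`, `popleft`, `len`, and iteration on the object itself, which removes the `.data` hop at every call site.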

* fix Q8 quantization

* remove unnecessary imports in gguf_manager

* fix final? merge issue

* fix gguf_writer placement and remove comments

* oops, actually fix gguf_writer placement

* reduce duplicated code from gguf_writer

* further simplify GGUFManager

* simplify even further and standardize with GGUFWriter

* reduce diffs with master

* form shards while adding tensors, SHA256 sums agree with master
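
One way to form shards while tensors are being added is a greedy assignment: start a new shard whenever the current one would exceed a tensor-count or byte limit. The sketch below is an assumption about the general idea, not the implementation in `gguf_writer.py`; `Shard`, `ShardingWriter`, `max_tensors`, and `max_bytes` are hypothetical names, and a limit of 0 is assumed to mean "no limit".

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    # Maps tensor name -> size in bytes (hypothetical bookkeeping only).
    tensors: dict = field(default_factory=dict)

    @property
    def n_bytes(self) -> int:
        return sum(self.tensors.values())


class ShardingWriter:
    def __init__(self, max_tensors: int = 0, max_bytes: int = 0):
        self.max_tensors = max_tensors
        self.max_bytes = max_bytes
        self.shards = [Shard()]

    def add_tensor(self, name: str, n_bytes: int) -> None:
        current = self.shards[-1]
        too_many = self.max_tensors > 0 and len(current.tensors) >= self.max_tensors
        too_big = self.max_bytes > 0 and current.n_bytes + n_bytes > self.max_bytes
        # Never open a new shard while the current one is still empty, so an
        # oversized tensor simply gets a shard of its own.
        if current.tensors and (too_many or too_big):
            current = Shard()
            self.shards.append(current)
        current.tensors[name] = n_bytes


writer = ShardingWriter(max_bytes=100)
for name, size in [("a", 60), ("b", 50), ("c", 30), ("d", 200)]:
    writer.add_tensor(name, size)
layout = [list(shard.tensors) for shard in writer.shards]
```

Because the plan exists before any bytes are written, a dry run can report the estimated shard count and sizes from `writer.shards` alone.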

* re-add type hint

Co-authored-by: compilade <redacted>
* GGUFWriter compatibility fix

Co-authored-by: compilade <redacted>
* Shard dataclass and un-negative dont_add_architecture

* type consistency in format_n_bytes_to_str

* move kv keys to constants.py

* make pathlib explicit

* base-1024 bytes to base-1000
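
For illustration, base-1000 formatting divides by 1000 per unit step rather than 1024. The function below is a hypothetical sketch (`format_size` and its unit suffixes are assumptions, not the helper touched by this commit).

```python
def format_size(n_bytes: int) -> str:
    """Format a byte count using decimal (base-1000) unit steps."""
    value = float(n_bytes)
    for unit in ("", "K", "M", "G"):
        if abs(value) < 1000.0:
            return f"{value:.1f}{unit}"
        value /= 1000.0
    return f"{value:.1f}T"


small = format_size(999)
kilo = format_size(1500)
giga = format_size(5_000_000_000)
```

Under this convention 1,000,000 bytes is exactly one "M" step, which matches how drive and download sizes are commonly quoted.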

* rename GGUFManager to GGUFWriterSplit

* Update gguf-py/gguf/constants.py

Co-authored-by: compilade <redacted>
* fix convert-hf-to-gguf.py permissions

* fix line endings

* Update gguf-py/gguf/gguf_writer_split.py

Co-authored-by: compilade <redacted>
* convert-hf : restore executable file permission

* examples/convert-legacy-llama.py: restore executable file permission

* reinstate original gguf package import and fix type annotation

* attempt to appease the linter

* attempt 2 to appease the linter

* attempt 3 to appease the linter

* comma consistency

* Update convert-hf-to-gguf.py

Co-authored-by: compilade <redacted>
* edit cmd line args

* use simplification from #7827

* kv/ti data are still wrong

* try to refactor kv data (still fails)

* fix ti data messiness

* tidy up

* fix linting

* actually make the linter happy

* cleanup round 1

* remove SplitStrategy, SplitArguments

* appease linter

* fix typing and clean up

* fix linting

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* progress bar, fix split logic

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* catch oversights

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* swap bar orders

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* compatibility fix

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update convert-hf-to-gguf.py

Co-authored-by: compilade <redacted>
---------

Co-authored-by: Brian <redacted>
Co-authored-by: compilade <redacted>
convert-hf-to-gguf.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py