git.djapps.eu Git - pkg/ggml/sources/llama.cpp/log

Christian Zhou-Zheng [Mon, 24 Jun 2024 09:42:03 +0000 (05:42 -0400)]

Option to split during conversion (#6942)

* support splits in convert.py

* Support split by size and dry run to write estimated shards/filesizes

* Move split functionality to new GGUFManager class

* fix improper function signature

* tentative push of convert-hf-to-gguf support

* resolve merge + SplitArguments for easier parsing

* Fix eager tensor memory leak and remove convert.py changes

Removed a memory leak caused by unexpected reference retention to eager tensors.

Also removed GGUFManager functionality in convert.py in favor of specializing for convert-hf-to-gguf.py.

* refactor SplitStrategy to be a deque

Instead of having SplitStrategy have a `data` field that is a deque, just have SplitStrategy be a subclass of deque itself.

* fix Q8 quantization

* remove unnecessary imports in gguf_manager

* fix final? merge issue

* fix gguf_writer placement and remove comments

* oops, actually fix gguf_writer placement

* reduce duplicated code from gguf_writer

* further simplify GGUFManager

* simplify even further and standardize with GGUFWriter

* reduce diffs with master

* form shards while adding tensors, SHA256 sums agree with master

* re-add type hint

Co-authored-by: compilade <redacted>
* GGUFWriter compatibility fix

Co-authored-by: compilade <redacted>
* Shard dataclass and un-negative dont_add_architecture

* type consistency in format_n_bytes_to_str

* move kv keys to constants.py

* make pathlib explicit

* base-1024 bytes to base-1000

* rename GGUFManager to GGUFWriterSplit

* Update gguf-py/gguf/constants.py

Co-authored-by: compilade <redacted>
* fix convert-hf-to-gguf.py permissions

* fix line endings

* Update gguf-py/gguf/gguf_writer_split.py

Co-authored-by: compilade <redacted>
* convert-hf : restore executable file permission

* examples/convert-legacy-llama.py: restore executable file permission

* reinstate original gguf package import and fix type annotation

* attempt to appease the linter

* attempt 2 to appease the linter

* attempt 3 to appease the linter

* comma consistency

* Update convert-hf-to-gguf.py

Co-authored-by: compilade <redacted>
* edit cmd line args

* use simplification from #7827

* kv/ti data are still wrong

* try to refactor kv data (still fails)

* fix ti data messiness

* tidy up

* fix linting

* actually make the linter happy

* cleanup round 1

* remove SplitStrategy, SplitArguments

* appease linter

* fix typing and clean up

* fix linting

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* progress bar, fix split logic

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* catch oversights

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* swap bar orders

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* compatibility fix

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: compilade <redacted>
* Update convert-hf-to-gguf.py

Co-authored-by: compilade <redacted>
---------

Co-authored-by: Brian <redacted>
Co-authored-by: compilade <redacted>
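
Note: the split-by-size, dry-run, Shard dataclass, and base-1000 items above can be illustrated with a minimal sketch. This is not the code from gguf-py; the names `Shard`, `format_n_bytes`, `plan_shards` and `dry_run_report` are hypothetical, the greedy grouping is an assumed strategy, and the `-00001-of-00003.gguf` file-name pattern is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Hypothetical shard record: tensor names plus a running byte total."""
    tensors: list[str] = field(default_factory=list)
    n_bytes: int = 0


def format_n_bytes(num: int) -> str:
    """Format a byte count with base-1000 units (K, M, G, T)."""
    value = float(num)
    for unit in ("", "K", "M", "G"):
        if abs(value) < 1000.0:
            return f"{value:.1f}{unit}"
        value /= 1000.0
    return f"{value:.1f}T"


def plan_shards(tensor_sizes: dict[str, int], max_bytes: int) -> list[Shard]:
    """Assign tensors to shards in insertion order (assumed greedy strategy).

    A new shard is started when adding the next tensor would push the
    current shard past max_bytes. A tensor larger than max_bytes still
    gets a shard of its own, so no tensor is ever dropped.
    """
    shards = [Shard()]
    for name, size in tensor_sizes.items():
        current = shards[-1]
        if current.tensors and current.n_bytes + size > max_bytes:
            current = Shard()
            shards.append(current)
        current.tensors.append(name)
        current.n_bytes += size
    return shards


def dry_run_report(shards: list[Shard], stem: str) -> list[str]:
    """Describe the planned output files without writing anything."""
    total = len(shards)
    return [
        f"{stem}-{i:05d}-of-{total:05d}.gguf: "
        f"{len(s.tensors)} tensors, {format_n_bytes(s.n_bytes)}"
        for i, s in enumerate(shards, start=1)
    ]


if __name__ == "__main__":
    sizes = {"a": 600, "b": 500, "c": 300, "d": 2000}
    for line in dry_run_report(plan_shards(sizes, max_bytes=1000), "model"):
        print(line)
```

Forming shards while tensors are added, rather than in a separate pass, keeps the plan available for a dry run before any data is written.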