Option to split during conversion (#6942)
* support splits in convert.py
* Support split by size and dry run to write estimated shards/filesizes
* Move split functionality to new GGUFManager class
* fix improper function signature
* tentative push of convert-hf-to-gguf support
* resolve merge + SplitArguments for easier parsing
* Fix eager tensor memory leak and remove convert.py changes
Removed a memory leak caused by unexpected reference retention to eager tensors.
Also removed GGUFManager functionality in convert.py in favor of specializing for convert-hf-to-gguf.py.
* refactor SplitStrategy to be a deque
Instead of having SplitStrategy have a `data` field that is a deque, just have SplitStrategy be a subclass of deque itself.
* fix Q8 quantization
* remove unnecessary imports in gguf_manager
* fix final? merge issue
* fix gguf_writer placement and remove comments
* oops, actually fix gguf_writer placement
* reduce duplicated code from gguf_writer
* further simplify GGUFManager
* simplify even further and standardize with GGUFWriter
* reduce diffs with master
* form shards while adding tensors, SHA256 sums agree with master
* re-add type hint
Co-authored-by: compilade <redacted>
* GGUFWriter compatibility fix
Co-authored-by: compilade <redacted>
* Shard dataclass and un-negative dont_add_architecture
* type consistency in format_n_bytes_to_str
* move kv keys to constants.py
* make pathlib explicit
* base-1024 bytes to base-1000
* rename GGUFManager to GGUFWriterSplit
* Update gguf-py/gguf/constants.py
Co-authored-by: compilade <redacted>
* fix convert-hf-to-gguf.py permissions
* fix line endings
* Update gguf-py/gguf/gguf_writer_split.py
Co-authored-by: compilade <redacted>
* convert-hf : restore executable file permission
* examples/convert-legacy-llama.py: restore executable file permission
* reinstate original gguf package import and fix type annotation
* attempt to appease the linter
* attempt 2 to appease the linter
* attempt 3 to appease the linter
* comma consistency
* Update convert-hf-to-gguf.py
Co-authored-by: compilade <redacted>
* edit cmd line args
* use simplification from #7827
* kv/ti data are still wrong
* try to refactor kv data (still fails)
* fix ti data messiness
* tidy up
* fix linting
* actually make the linter happy
* cleanup round 1
* remove SplitStrategy, SplitArguments
* appease linter
* fix typing and clean up
* fix linting
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* progress bar, fix split logic
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* catch oversights
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* swap bar orders
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* compatibility fix
* Update gguf-py/gguf/gguf_writer.py
Co-authored-by: compilade <redacted>
* Update convert-hf-to-gguf.py
Co-authored-by: compilade <redacted>
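
For illustration only, the ideas named in the list above ("form shards while adding tensors", "split by size", "dry run", "base-1024 bytes to base-1000") can be sketched as below. This is not the code from this commit: `Shard` here is a hypothetical stand-in, and `plan_shards` and `format_size` are made-up names whose behavior is an assumption about how such a split could work.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    # Hypothetical container: tensor names plus a running byte total.
    tensors: list[str] = field(default_factory=list)
    n_bytes: int = 0


def plan_shards(tensor_sizes: dict[str, int], max_bytes: int) -> list[Shard]:
    """Assign tensors to shards in insertion order (assumed policy)."""
    shards = [Shard()]
    for name, size in tensor_sizes.items():
        current = shards[-1]
        # Open a new shard when this tensor would push the current one past
        # the limit. An empty shard always accepts the tensor, so a tensor
        # larger than max_bytes ends up alone in its own shard.
        if current.tensors and current.n_bytes + size > max_bytes:
            current = Shard()
            shards.append(current)
        current.tensors.append(name)
        current.n_bytes += size
    return shards


def format_size(n: int) -> str:
    """Format a byte count with base-1000 units (K, M, G, T)."""
    value = float(n)
    for unit in ("", "K", "M", "G"):
        if value < 1000:
            return f"{value:.1f}{unit}"
        value /= 1000
    return f"{value:.1f}T"


if __name__ == "__main__":
    # "Dry run": report the planned shards without writing any file.
    sizes = {"a": 600, "b": 600, "c": 300}
    for i, shard in enumerate(plan_shards(sizes, max_bytes=1000), start=1):
        print(i, shard.tensors, format_size(shard.n_bytes))
```

Planning shards from sizes alone is what makes a dry run cheap in this sketch: no tensor data has to be read or written to estimate the shard count and file sizes.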
---------
Co-authored-by: Brian <redacted>
Co-authored-by: compilade <redacted>