git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
gguf-hash: model wide and per tensor hashing using xxhash and sha1 (#8048)
author Brian <redacted>
Sun, 7 Jul 2024 12:58:43 +0000 (22:58 +1000)
committer GitHub <redacted>
Sun, 7 Jul 2024 12:58:43 +0000 (22:58 +1000)
commit f7cab35ef9e66c57b4a416d09eb6c814e0ba4e4c
tree 29ab9678eec24ea9a335bd6948de1b4398184025
parent 905942abdba5ba0b28a1b0805e51e4f818c54bc9

CLI to hash GGUF files and detect differences at the per model and per tensor level

The supported hash types are:

- `--xxh64`: use xxhash 64-bit hash mode (default)
- `--sha1`: use sha1
- `--uuid`: use uuid
- `--sha256`: use sha256

While most POSIX systems already have hash checking programs like sha256sum,
they are designed to check entire files. This is not ideal for our purpose,
since we want to check the consistency of the tensor data even if the metadata
content of the gguf KV store has been updated.
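
To illustrate the point, here is a minimal Python sketch using only the
standard `hashlib` module. The byte strings are hypothetical stand-ins for a
GGUF file's KV metadata and tensor data, not the real file layout, and the
helper names `file_hash` and `payload_hash` are made up for this example.

```python
import hashlib


def file_hash(metadata: bytes, tensor_payload: bytes) -> str:
    """Whole-file digest, as sha256sum would compute it: metadata + tensors."""
    return hashlib.sha256(metadata + tensor_payload).hexdigest()


def payload_hash(tensor_payload: bytes) -> str:
    """Digest of the tensor bytes only; the metadata is ignored."""
    return hashlib.sha256(tensor_payload).hexdigest()


# Hypothetical stand-ins: same tensor bytes, different KV metadata.
tensors = bytes(range(16))
original = (b"general.name=demo", tensors)
retagged = (b"general.name=demo-v2", tensors)

whole_file_changed = file_hash(*original) != file_hash(*retagged)
payload_changed = payload_hash(original[1]) != payload_hash(retagged[1])
print(whole_file_changed, payload_changed)  # True False
```

The whole-file digest changes as soon as the metadata is edited, while the
payload-only digest stays the same.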

This program is designed to hash a gguf tensor payload on a 'per tensor layer'
basis in addition to producing an 'entire tensor model' hash. The intent is that
the entire tensor model can be checked first, and if any inconsistency is
detected, the per tensor hashes can be used to narrow down the specific tensor
layer that is inconsistent.
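
The two-level check described above can be sketched in Python as follows. This
is an illustration under assumptions, not the code added by this commit:
tensors are modeled as a plain dict of name to raw bytes, the model wide hash
is assumed to be a running sha1 over the tensor data in iteration order, and
`hash_model` and `find_mismatches` are hypothetical helper names.

```python
import hashlib


def hash_model(tensors: dict[str, bytes]) -> tuple[str, dict[str, str]]:
    """Return (model wide digest, {tensor name: per tensor digest})."""
    model = hashlib.sha1()
    per_tensor = {}
    for name, data in tensors.items():
        per_tensor[name] = hashlib.sha1(data).hexdigest()
        model.update(data)  # assumption: model hash covers tensor bytes in order
    return model.hexdigest(), per_tensor


def find_mismatches(a: dict[str, bytes], b: dict[str, bytes]) -> list[str]:
    """Check the model wide hash first; compare per tensor only if it differs."""
    model_a, per_a = hash_model(a)
    model_b, per_b = hash_model(b)
    if model_a == model_b:
        return []
    names = per_a.keys() | per_b.keys()
    return sorted(n for n in names if per_a.get(n) != per_b.get(n))


# Hypothetical tensors; the second model has one altered layer.
reference = {"blk.0.attn_q.weight": b"\x01\x02", "blk.0.attn_k.weight": b"\x03\x04"}
damaged = {"blk.0.attn_q.weight": b"\x01\x02", "blk.0.attn_k.weight": b"\x03\xff"}

print(find_mismatches(reference, damaged))  # ['blk.0.attn_k.weight']
```

Checking the single model wide digest first keeps the common case cheap; the
per tensor digests are only compared when that first check fails.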

Co-authored-by: Georgi Gerganov <redacted>
17 files changed:
Makefile
examples/CMakeLists.txt
examples/gguf-hash/CMakeLists.txt [new file with mode: 0644]
examples/gguf-hash/README.md [new file with mode: 0644]
examples/gguf-hash/deps/rotate-bits/package.json [new file with mode: 0644]
examples/gguf-hash/deps/rotate-bits/rotate-bits.h [new file with mode: 0644]
examples/gguf-hash/deps/sha1/package.json [new file with mode: 0644]
examples/gguf-hash/deps/sha1/sha1.c [new file with mode: 0644]
examples/gguf-hash/deps/sha1/sha1.h [new file with mode: 0644]
examples/gguf-hash/deps/sha256/package.json [new file with mode: 0644]
examples/gguf-hash/deps/sha256/sha256.c [new file with mode: 0644]
examples/gguf-hash/deps/sha256/sha256.h [new file with mode: 0644]
examples/gguf-hash/deps/xxhash/clib.json [new file with mode: 0644]
examples/gguf-hash/deps/xxhash/xxhash.c [new file with mode: 0644]
examples/gguf-hash/deps/xxhash/xxhash.h [new file with mode: 0644]
examples/gguf-hash/gguf-hash.cpp [new file with mode: 0644]
gguf-py/scripts/gguf_hash.py [new file with mode: 0755]