llama : add `use_direct_io` flag for model loading (#18166)
author	Julius Tischbein <redacted>
	Thu, 8 Jan 2026 06:35:30 +0000 (07:35 +0100)
committer	GitHub <redacted>
	Thu, 8 Jan 2026 06:35:30 +0000 (08:35 +0200)
commit	2038101bd9b1dcf45b5410b969fbc5206e25d993
tree	6c4d7f1a733f56a49010a0b01cd3f1b9fc0ae9fd
parent	568371a7264c30ad4583f1859cb815dfc0bc14fa
llama : add `use_direct_io` flag for model loading (#18166)

* Add `--direct-io` flag for model loading

* Fix `read_raw()` calls

* Fix Windows `read_raw_at`

* Change type `off_t` to `size_t` for Windows and rename functions

* Disable direct I/O when mmap is explicitly enabled

* Use `read_raw_unsafe` when an upload backend is available; not functional on some devices with Vulkan and SYCL

* Fall back to `std::fread` in case `O_DIRECT` fails due to a bad address

* Windows: remove `const` keywords and unused functions

* Update src/llama-mmap.cpp

Co-authored-by: Georgi Gerganov <redacted>
---------

Co-authored-by: jtischbein <redacted>
Co-authored-by: Georgi Gerganov <redacted>
12 files changed:
common/arg.cpp
common/common.cpp
common/common.h
examples/diffusion/diffusion-cli.cpp
include/llama.h
src/llama-mmap.cpp
src/llama-mmap.h
src/llama-model-loader.cpp
src/llama-model-loader.h
src/llama-model.cpp
src/llama-quant.cpp
src/llama.cpp