git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)
author David Huang <redacted>
Sun, 11 May 2025 12:18:39 +0000 (20:18 +0800)
committer GitHub <redacted>
Sun, 11 May 2025 12:18:39 +0000 (14:18 +0200)
commit 7f323a589f8684c0eb722e7309074cb5eac0c8b5
tree c7bbfc15dd78a3c17526285d01f0c3b1143cbc00
parent 3eac209319a6726fd9687c6188fc6b916b65953d
common/arg.cpp
common/common.cpp
common/common.h
ggml/include/ggml-backend.h
ggml/src/ggml-backend.cpp
include/llama.h
src/llama-context.cpp
src/llama-cparams.h
tests/test-opt.cpp
tools/llama-bench/llama-bench.cpp
tools/mtmd/clip.cpp