git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Fix garbled output with REPACK at high thread counts (#16956)
author: Noah <redacted>
Tue, 4 Nov 2025 05:04:59 +0000 (05:04 +0000)
committer: GitHub <redacted>
Tue, 4 Nov 2025 05:04:59 +0000 (21:04 -0800)
commit: 1f5accb8d0056e6099cd5b772b1cb787dd590a13
tree: dac92a263a97b496c8b13a37822be188d00dfdf4
parent: 2759ccdb4adc8568add4316780d5e675519b0775

* Fix garbled output with REPACK at high thread counts

Fixed a race condition in the REPACK matrix multiplication code that caused garbled output when using 26 or more threads (the exact threshold is model-dependent). With high thread counts, the code forced the chunk count to equal the thread count, creating many small chunks. After these chunks were aligned to NB_COLS boundaries, adjacent chunks could overlap, causing data corruption and race conditions. The fix enforces a minimum chunk size based on NB_COLS and caps the maximum chunk count so that too many tiny chunks are never created, ensuring proper alignment without overlaps.
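
The chunk-count cap described above can be illustrated with a small standalone sketch. This is not the code from repack.cpp: the helper names (`align_up`, `plan_nchunk`, `plan_chunks`) and the parameters `nr0`, `nth`, and `nb_cols` are hypothetical, and the sketch assumes the minimum chunk size is exactly `nb_cols` rows and that both chunk boundaries are rounded up to a multiple of `nb_cols`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: round x up to the next multiple of nb_cols.
static int64_t align_up(int64_t x, int64_t nb_cols) {
    assert(nb_cols > 0);
    return (x + nb_cols - 1) / nb_cols * nb_cols;
}

// Hypothetical helper: pick a chunk count for nr0 rows and nth threads.
// Starts from one chunk per thread, then caps the count so that every
// chunk can hold at least nb_cols rows (assumed minimum chunk size).
static int64_t plan_nchunk(int64_t nr0, int nth, int64_t nb_cols) {
    assert(nr0 >= 0 && nb_cols > 0);
    const int64_t nchunk     = std::max<int64_t>(nth, 1);
    const int64_t max_nchunk = std::max<int64_t>(1, (nr0 + nb_cols - 1) / nb_cols);
    return std::min(nchunk, max_nchunk);
}

// Returns the aligned, non-empty [start, end) row ranges, one per chunk.
static std::vector<std::pair<int64_t, int64_t>> plan_chunks(int64_t nr0, int nth, int64_t nb_cols) {
    const int64_t nchunk = plan_nchunk(nr0, nth, nb_cols);
    const int64_t dr0    = (nr0 + nchunk - 1) / nchunk; // rows per chunk before alignment

    std::vector<std::pair<int64_t, int64_t>> chunks;
    for (int64_t i = 0; i < nchunk; ++i) {
        // Round both boundaries up the same way, then clamp to the row count,
        // so chunk i ends exactly where chunk i + 1 begins.
        const int64_t start = std::min(align_up(std::min(i * dr0, nr0), nb_cols), nr0);
        const int64_t end   = std::min(align_up(std::min((i + 1) * dr0, nr0), nb_cols), nr0);
        if (start < end) {
            chunks.emplace_back(start, end); // skip chunks that became empty
        }
    }
    return chunks;
}
```

Under these assumptions, `plan_chunks(100, 32, 8)` produces 13 contiguous ranges rather than 32, so no two threads are handed the same rows.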

* Update ggml/src/ggml-cpu/repack.cpp

Co-authored-by: Georgi Gerganov <redacted>
* Update ggml/src/ggml-cpu/repack.cpp

Co-authored-by: Georgi Gerganov <redacted>
---------

Co-authored-by: Georgi Gerganov <redacted>
ggml/src/ggml-cpu/repack.cpp