git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
author Max Krasnyansky <redacted>
Thu, 30 Oct 2025 16:06:13 +0000 (09:06 -0700)
committer GitHub <redacted>
Thu, 30 Oct 2025 16:06:13 +0000 (09:06 -0700)
commit 517b7170e1a4d733583c4b07c5b7a49acc05911c
tree 6d8de747282fd662e964777e27b50ce6cce93e8d
parent 835e918d8428f5119927d7150bf5a26176dedda0
cpu: introduce chunking for repack matmuls and enable matmul-id chunking on ARM64 (#16833)

Very similar implementation to the flash-attention chunking, with similar benefits.
ggml/src/ggml-cpu/ggml-cpu.c
ggml/src/ggml-cpu/repack.cpp
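
For readers unfamiliar with the term, the following is a minimal sketch of what "chunking" a matmul across threads can look like in general. It is not the code from this commit: the function name `chunked_matvec`, its parameters, and the choice of a plain float matrix-vector product are all assumptions made for illustration. The idea shown is that the output rows are split into fixed-size chunks and worker threads claim chunks from a shared atomic counter, so faster threads pick up more work instead of idling.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper, not a llama.cpp function: computes dst = A * x for a
// row-major matrix A (nrows x ncols). Rows are split into fixed-size chunks
// that worker threads claim from a shared atomic counter.
std::vector<float> chunked_matvec(const std::vector<float> & A,
                                  const std::vector<float> & x,
                                  int64_t nrows, int64_t ncols,
                                  int nthreads, int64_t chunk_rows) {
    assert(nthreads > 0 && chunk_rows > 0);
    assert((int64_t) A.size() == nrows * ncols);
    assert((int64_t) x.size() == ncols);

    std::vector<float> dst(nrows, 0.0f);
    const int64_t nchunks = (nrows + chunk_rows - 1) / chunk_rows;

    // Each thread starts on the chunk matching its index; the counter then
    // hands out the remaining chunks in order.
    std::atomic<int64_t> next_chunk(nthreads);

    auto worker = [&](int ith) {
        int64_t chunk = ith;
        while (chunk < nchunks) {
            const int64_t r0 = chunk * chunk_rows;
            const int64_t r1 = std::min(r0 + chunk_rows, nrows);
            for (int64_t r = r0; r < r1; ++r) {
                float sum = 0.0f;
                for (int64_t c = 0; c < ncols; ++c) {
                    sum += A[r * ncols + c] * x[c];
                }
                dst[r] = sum;  // each row is written by exactly one thread
            }
            // Claim the next unprocessed chunk, if any.
            chunk = next_chunk.fetch_add(1, std::memory_order_relaxed);
        }
    };

    std::vector<std::thread> pool;
    for (int ith = 0; ith < nthreads; ++ith) {
        pool.emplace_back(worker, ith);
    }
    for (auto & t : pool) {
        t.join();
    }
    return dst;
}
```

Calling `chunked_matvec(A, x, 5, 2, 3, 2)` on a 5x2 matrix splits the rows into three chunks of at most two rows each. The chunk size is the tuning knob in this sketch: smaller chunks balance load better across uneven cores, while larger chunks reduce contention on the counter.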