| 2026-03-11 |
Piotr Wilkin... | common/parser: handle reasoning budget (#20297) |
| 2026-03-11 |
uvos | ggml-cuda: gdn use shared mem for HIP (#20366) |
| 2026-03-11 |
uvos | cuda/hip: fix loop unrolling in ssm-conv (#20369) |
| 2026-03-11 |
Pascal | Fix agentic mcp image single model (#20339) |
| 2026-03-11 |
Alessandro... | vendor : update cpp-httplib to 0.37.0 (#20207) |
| 2026-03-11 |
Alessandro... | vendor : update miniaudio to 0.11.25 (#20209) |
| 2026-03-11 |
Neo Zhang | fix op rope, add rope_back (#20293) |
| 2026-03-11 |
Neo Zhang | fix for failed UT case: ACC, L2_NORM, UPSCALE, fused_gl... |
| 2026-03-10 |
Vinicios Lugli | model : qwen3vl reranker text support (#20332) |
| 2026-03-10 |
ddh0 | llama-quant : correct `n_attention_wv` usage (#20357) |
| 2026-03-10 |
Georgi Gerganov | ggml : bump RPC version (#20330) |
| 2026-03-10 |
Reese Levine | ggml webgpu: faster normal quant and some k-quant matri... |
| 2026-03-10 |
Piotr Wilkin... | Reduce level of content parser warning message to avoid... |
| 2026-03-10 |
Ray Xu | examples : fix empty items in json_schema_to_grammar... |
| 2026-03-10 |
a3894281 | docs: update CPU backend ops to mark POOL_1D as support... |
| 2026-03-10 |
Georgi Gerganov | models : fix assert in mamba2 (cont) (#20335) |
| 2026-03-10 |
Georgi Gerganov | server : make 2 checkpoints near the end of the prompt... |
| 2026-03-10 |
Sigbjørn Skjæret | common : fix incorrect uses of stoul (#20313) |
| 2026-03-10 |
Charles Xu | kleidiai : support for concurrent sme and neon kernel... |
| 2026-03-10 |
Taimur Ahmad | ggml-cpu: add RVV repack GEMM and GEMV for quantization... |
| 2026-03-10 |
Julian Pscheid | metal: handle command buffer failures gracefully in... |
| 2026-03-10 |
ddh0 | llama-quant : fail early on missing imatrix, refactor... |
| 2026-03-09 |
Aldehir Rojas | common: consolidate PEG string parsers (#20263) |
| 2026-03-09 |
Xuan-Son Nguyen | model: fix step3.5 n_rot (#20318) |
| 2026-03-09 |
Xuan-Son Nguyen | llama: dynamic head_dim and n_rot for SWA (#20301) |
| 2026-03-09 |
Evan Huus | server: Parse port numbers from MCP server URLs in... |
| 2026-03-09 |
Paul Flynn | metal : extend mul_mv_ext to BF16, Q2_K, Q3_K (#20250) |
| 2026-03-09 |
Georgi Gerganov | server : fix checkpoints n_tokens calculation (#20287) |
| 2026-03-09 |
Georgi Gerganov | metal : add upscale (#20284) |
| 2026-03-09 |
Georgi Gerganov | server : warn swa-full is not supported for non-SWA... |
| 2026-03-09 |
Georgi Gerganov | server : fix off-by-1 in server_tokens::size_up_to_pos... |
| 2026-03-09 |
Piotr Wilkin... | common: map developer role to system (#20215) |
| 2026-03-09 |
Georgi Gerganov | models : fix assert in mamba2 graph (#20270) |
| 2026-03-09 |
Georgi Gerganov | server : add kill switch when server is stuck (#20277) |
| 2026-03-09 |
Aman Gupta | ggml-cuda: disable gdn for musa (#20278) |
| 2026-03-09 |
ddh0 | llama-quant : left-align tensor names in output (#20117) |
| 2026-03-09 |
Aman Gupta | contributing: limit open PRs for new contributors to... |
| 2026-03-09 |
Bertay Eren | ggml-vulkan: add SGN operator, auto-generate Vulkan... |
| 2026-03-09 |
Ruben Ortlam | vulkan: skip zero size tensors in backend copies (... |
| 2026-03-09 |
Michael Huang | cuda : display total and free VRAM capacity during... |
| 2026-03-09 |
Aaron Teo | llama-bench: introduce `-hf` and `-hff` flags & use... |
| 2026-03-09 |
Piotr Wilkin... | PEG parser for LFM2 (#20251) |
| 2026-03-08 |
Georgi Gerganov | server : do not create checkpoints right after mtmd... |
| 2026-03-08 |
Sigbjørn Skjæret | graph : remove redundant scale_w parameter (#20235) |
| 2026-03-08 |
Aldehir Rojas | common : gracefully handle incomplete output (#20191) |
| 2026-03-08 |
Piotr Wilkin... | Fix compile bug (#20203) |
| 2026-03-08 |
Piotr Wilkin... | Fix structured outputs (#20223) |
| 2026-03-08 |
GiantPrince | ggml-vulkan: Add ELU op support (#20183) |
| 2026-03-08 |
Jeff Bolz | vulkan: Fix data races in coopmat1 mul_mat(_id) (#20084) |
| 2026-03-08 |
Johannes Gäßler | llama: end-to-end tests (#19802) |
| 2026-03-08 |
Christopher... | readme : update infra list (#20212) |
| 2026-03-08 |
Piotr Wilkin... | Revert to OAI-compatible args (#20213) |
| 2026-03-08 |
decahedron1 | server : correct index on finish in OAI completion... |
| 2026-03-08 |
Neo Zhang | [SYCL] supprt Flash Attention for fp32/fp16/Q4/Q5/Q8... |
| 2026-03-07 |
Aman Gupta | ggml: add GATED_DELTA_NET op (#19504) |
| 2026-03-07 |
lhez | opencl: add l2_norm (#20160) |
| 2026-03-07 |
Piotr Wilkin... | Autoparser: True streaming (#20177) |
| 2026-03-06 |
Piotr Wilkin... | Autoparser: add optional argument reshuffle capability... |
| 2026-03-06 |
Bartowski | quants : Add memsets and other fixes for IQ quants... |
| 2026-03-06 |
Piotr Wilkin... | Add @pwilkin to CODEOWNERS for autoparser code (#20174) |
| 2026-03-06 |
Piotr Wilkin... | Autoparser - complete refactoring of parser architectur... |
| 2026-03-06 |
Todor Boinovski | hexagon: add f32 ssm_conv op (#20122) |
| 2026-03-06 |
Tom Vaucourt | server : preserve anthropic thinking blocks in conversi... |
| 2026-03-06 |
Max Krasnyansky | cpu: skip redudant ROPE cache updates (#20149) |
| 2026-03-06 |
Aman Gupta | ggml-cuda: add mem check for fusion (#19916) |
| 2026-03-06 |
Aaron Teo | ggml: update comments for backends which have no memory... |
| 2026-03-06 |
shalinib-ibm | ggml-cpu: Fix gcc 15 ICE on ppc64le (#20083) (#20130) |
| 2026-03-06 |
Aman Gupta | CUDA: use shared mem for ssm_conv (#20128) |
| 2026-03-06 |
Tim Neumann | context: ignore zero scale LoRAs when checking sameness... |
| 2026-03-06 |
Piotr Wilkin... | Checkpoint every n tokens: squash (#20087) |
| 2026-03-06 |
Aleksander... | webui: Agentic Loop + MCP Client with support for Tools... |
| 2026-03-06 |
Johannes Gäßler | ggml-cpu: fix data race for debug asserts (#20148) |
| 2026-03-06 |
Georgi Gerganov | kv-cache : fix M-RoPE checkpoints (#20132) |
| 2026-03-06 |
Roj234 | cli : Don't clear system prompt when using '/clear... |
| 2026-03-06 |
lhez | opencl: add neg, exp and diag (#20127) |
| 2026-03-06 |
YardenTal44 | hexagon: add fp16 support for binary ops: add,sub,mul... |
| 2026-03-05 |
ymcki | models : kda chunk size = 16 (#19827) |
| 2026-03-05 |
Andreas Kieslinger | CUDA: Improve performance via less synchronizations... |
| 2026-03-05 |
Eric Zhang | model : update Qwen3.5 model type detection (#20126) |
| 2026-03-05 |
Sigbjørn Skjæret | cli : add command and file auto-completion (#19985) |
| 2026-03-05 |
Sigbjørn Skjæret | convert : register Qwen 3.5 ForCausalLM for text only... |
| 2026-03-05 |
Aleksander... | webui: Improvements for Models Selector UI (#20066) |
| 2026-03-05 |
Marcel Petrick | chore : correct typos [no ci] (#20041) |
| 2026-03-05 |
Max Krasnyansky | hexagon: Flash Attention optimizations (dma, mpyacc... |
| 2026-03-05 |
lhez | opencl: add `SET`, support i32 for `CPY`, minor refacto... |
| 2026-03-04 |
Todor Boinovski | hexagon: add llama-completion runner script (#20095) |
| 2026-03-04 |
Nikhil Jain | [WebGPU] Fix wait logic for inflight jobs (#20096) |
| 2026-03-04 |
Masashi Yoshimura | Add concat op to webgpu. (#20068) |
| 2026-03-04 |
Sigbjørn Skjæret | tools : add missing clocale include in mtmd-cli [no... |
| 2026-03-04 |
Johannes Gäßler | ggml: fix ggml_is_contiguous_n for ne == 1 (#20092) |
| 2026-03-04 |
Adrien Gallouët | ggml : use a simple std::thread in AMX without OpenMP... |
| 2026-03-04 |
ddh0 | impl : use 6 digits for tensor dims (#20094) |
| 2026-03-04 |
SamareshSingh | Fix locale-dependent float printing in GGUF metadata... |
| 2026-03-04 |
standby24x7 | completion : Fix a typo in warning message (#20082) |
| 2026-03-03 |
Mickael Desgranges | docs: Fix intel documentation link (#20040) |
| 2026-03-03 |
Charles Xu | kleidiai : add sme fp16 compute path for q4_0 gemm... |
| 2026-03-03 |
shaofeiqi | opencl: add optimized q4_1 mm kernel for adreno (#19840) |
| 2026-03-03 |
Abhijit Ramesh | ggml webgpu: fix workgroup dispatch limit for large... |
| 2026-03-02 |
Nikhil Jain | ggml webgpu: Clean up per-thread parameter buffer pool... |
| 2026-03-02 |
Masashi Yoshimura | ggml-webgpu: Support non-contiguous `src0` and overlapp... |