| 2026-03-12 |
Masato Nakasaka | ci: Setup self-hosted CI for Intel Linux Vulkan backend... |
| 2026-03-12 |
Jeff Bolz | vulkan: fix l2_norm epsilon handling (#20350) |
| 2026-03-12 |
Jeff Bolz | vulkan: fix OOB check in flash_attn_mask_opt (#20296) |
| 2026-03-12 |
Masato Nakasaka | vulkan: Fix ErrorOutOfHostMemory on Intel GPU when... |
| 2026-03-12 |
lhez | opencl: use larger workgroup size for get_rows (#20316) |
| 2026-03-12 |
shaofeiqi | opencl: add cumsum op (#18981) |
| 2026-03-12 |
uvos | hip: compile debug builds with -O2 on hip to avoid... |
| 2026-03-12 |
Mishusha | common/parser: add GigaChatV3/3.1 models support (... |
| 2026-03-11 |
DAN™ | model : add support for Phi4ForCausalLMV (#20168) |
| 2026-03-11 |
Richard Davison | graph : add optional scale parameter to build_lora_mm... |
| 2026-03-11 |
ddh0 | common : fix --n-cpu-moe, --cpu-moe for models with... |
| 2026-03-11 |
Masashi Yoshimura | ggml-webgpu: Add supports for `GGML_OP_REPEAT` (#20230) |
| 2026-03-11 |
Georgi Gerganov | llama : enable chunked fused GDN path (#20340) |
| 2026-03-11 |
Sigbjørn Skjæret | llama : whitespace cleanup (#20422) |
| 2026-03-11 |
Richard Davison | ggml : add NVFP4 quantization type support (#19769) |
| 2026-03-11 |
Georgi Gerganov | benches : add nemotron super (#20420) |
| 2026-03-11 |
Daniel Bevenius | llama : add support for Nemotron 3 Super (#20411) |
| 2026-03-11 |
Georgi Gerganov | metal : fix capture_compute counter logic (#20410) |
| 2026-03-11 |
Aman Gupta | compare-llama-bench: check remotes as well (#20406) |
| 2026-03-11 |
Georgi Gerganov | metal : fix q5_k mul_mv register spill (#20399) |
| 2026-03-11 |
Georgi Gerganov | metal : add env var to trigger graph capture (#20398) |
| 2026-03-11 |
Neo Zhang | [SYCL] Update SYCL.md for binary package for Windows... |
| 2026-03-11 |
Ruben Ortlam | ci: disable coopmat on ubuntu-24-cmake-vulkan job ... |
| 2026-03-11 |
Aldehir Rojas | common/parser: use nlohmann::ordered_json to preserve... |
| 2026-03-11 |
Piotr Wilkin... | common/parser: handle reasoning budget (#20297) |
| 2026-03-11 |
uvos | ggml-cuda: gdn use shared mem for HIP (#20366) |
| 2026-03-11 |
uvos | cuda/hip: fix loop unrolling in ssm-conv (#20369) |
| 2026-03-11 |
Pascal | Fix agentic mcp image single model (#20339) |
| 2026-03-11 |
Alessandro... | vendor : update cpp-httplib to 0.37.0 (#20207) |
| 2026-03-11 |
Alessandro... | vendor : update miniaudio to 0.11.25 (#20209) |
| 2026-03-11 |
Neo Zhang | fix op rope, add rope_back (#20293) |
| 2026-03-11 |
Neo Zhang | fix for failed UT case: ACC, L2_NORM, UPSCALE, fused_gl... |
| 2026-03-10 |
Vinicios Lugli | model : qwen3vl reranker text support (#20332) |
| 2026-03-10 |
ddh0 | llama-quant : correct `n_attention_wv` usage (#20357) |
| 2026-03-10 |
Georgi Gerganov | ggml : bump RPC version (#20330) |
| 2026-03-10 |
Reese Levine | ggml webgpu: faster normal quant and some k-quant matri... |
| 2026-03-10 |
Piotr Wilkin... | Reduce level of content parser warning message to avoid... |
| 2026-03-10 |
Ray Xu | examples : fix empty items in json_schema_to_grammar... |
| 2026-03-10 |
a3894281 | docs: update CPU backend ops to mark POOL_1D as support... |
| 2026-03-10 |
Georgi Gerganov | models : fix assert in mamba2 (cont) (#20335) |
| 2026-03-10 |
Georgi Gerganov | server : make 2 checkpoints near the end of the prompt... |
| 2026-03-10 |
Sigbjørn Skjæret | common : fix incorrect uses of stoul (#20313) |
| 2026-03-10 |
Charles Xu | kleidiai : support for concurrent sme and neon kernel... |
| 2026-03-10 |
Taimur Ahmad | ggml-cpu: add RVV repack GEMM and GEMV for quantization... |
| 2026-03-10 |
Julian Pscheid | metal: handle command buffer failures gracefully in... |
| 2026-03-10 |
ddh0 | llama-quant : fail early on missing imatrix, refactor... |
| 2026-03-09 |
Aldehir Rojas | common: consolidate PEG string parsers (#20263) |
| 2026-03-09 |
Xuan-Son Nguyen | model: fix step3.5 n_rot (#20318) |
| 2026-03-09 |
Xuan-Son Nguyen | llama: dynamic head_dim and n_rot for SWA (#20301) |
| 2026-03-09 |
Evan Huus | server: Parse port numbers from MCP server URLs in... |
| 2026-03-09 |
Paul Flynn | metal : extend mul_mv_ext to BF16, Q2_K, Q3_K (#20250) |
| 2026-03-09 |
Georgi Gerganov | server : fix checkpoints n_tokens calculation (#20287) |
| 2026-03-09 |
Georgi Gerganov | metal : add upscale (#20284) |
| 2026-03-09 |
Georgi Gerganov | server : warn swa-full is not supported for non-SWA... |
| 2026-03-09 |
Georgi Gerganov | server : fix off-by-1 in server_tokens::size_up_to_pos... |
| 2026-03-09 |
Piotr Wilkin... | common: map developer role to system (#20215) |
| 2026-03-09 |
Georgi Gerganov | models : fix assert in mamba2 graph (#20270) |
| 2026-03-09 |
Georgi Gerganov | server : add kill switch when server is stuck (#20277) |
| 2026-03-09 |
Aman Gupta | ggml-cuda: disable gdn for musa (#20278) |
| 2026-03-09 |
ddh0 | llama-quant : left-align tensor names in output (#20117) |
| 2026-03-09 |
Aman Gupta | contributing: limit open PRs for new contributors to... |
| 2026-03-09 |
Bertay Eren | ggml-vulkan: add SGN operator, auto-generate Vulkan... |
| 2026-03-09 |
Ruben Ortlam | vulkan: skip zero size tensors in backend copies (... |
| 2026-03-09 |
Michael Huang | cuda : display total and free VRAM capacity during... |
| 2026-03-09 |
Aaron Teo | llama-bench: introduce `-hf` and `-hff` flags & use... |
| 2026-03-09 |
Piotr Wilkin... | PEG parser for LFM2 (#20251) |
| 2026-03-08 |
Georgi Gerganov | server : do not create checkpoints right after mtmd... |
| 2026-03-08 |
Sigbjørn Skjæret | graph : remove redundant scale_w parameter (#20235) |
| 2026-03-08 |
Aldehir Rojas | common : gracefully handle incomplete output (#20191) |
| 2026-03-08 |
Piotr Wilkin... | Fix compile bug (#20203) |
| 2026-03-08 |
Piotr Wilkin... | Fix structured outputs (#20223) |
| 2026-03-08 |
GiantPrince | ggml-vulkan: Add ELU op support (#20183) |
| 2026-03-08 |
Jeff Bolz | vulkan: Fix data races in coopmat1 mul_mat(_id) (#20084) |
| 2026-03-08 |
Johannes Gäßler | llama: end-to-end tests (#19802) |
| 2026-03-08 |
Christopher... | readme : update infra list (#20212) |
| 2026-03-08 |
Piotr Wilkin... | Revert to OAI-compatible args (#20213) |
| 2026-03-08 |
decahedron1 | server : correct index on finish in OAI completion... |
| 2026-03-08 |
Neo Zhang | [SYCL] support Flash Attention for fp32/fp16/Q4/Q5/Q8... |
| 2026-03-07 |
Aman Gupta | ggml: add GATED_DELTA_NET op (#19504) |
| 2026-03-07 |
lhez | opencl: add l2_norm (#20160) |
| 2026-03-07 |
Piotr Wilkin... | Autoparser: True streaming (#20177) |
| 2026-03-06 |
Piotr Wilkin... | Autoparser: add optional argument reshuffle capability... |
| 2026-03-06 |
Bartowski | quants : Add memsets and other fixes for IQ quants... |
| 2026-03-06 |
Piotr Wilkin... | Add @pwilkin to CODEOWNERS for autoparser code (#20174) |
| 2026-03-06 |
Piotr Wilkin... | Autoparser - complete refactoring of parser architectur... |
| 2026-03-06 |
Todor Boinovski | hexagon: add f32 ssm_conv op (#20122) |
| 2026-03-06 |
Tom Vaucourt | server : preserve anthropic thinking blocks in conversi... |
| 2026-03-06 |
Max Krasnyansky | cpu: skip redundant ROPE cache updates (#20149) |
| 2026-03-06 |
Aman Gupta | ggml-cuda: add mem check for fusion (#19916) |
| 2026-03-06 |
Aaron Teo | ggml: update comments for backends which have no memory... |
| 2026-03-06 |
shalinib-ibm | ggml-cpu: Fix gcc 15 ICE on ppc64le (#20083) (#20130) |
| 2026-03-06 |
Aman Gupta | CUDA: use shared mem for ssm_conv (#20128) |
| 2026-03-06 |
Tim Neumann | context: ignore zero scale LoRAs when checking sameness... |
| 2026-03-06 |
Piotr Wilkin... | Checkpoint every n tokens: squash (#20087) |
| 2026-03-06 |
Aleksander... | webui: Agentic Loop + MCP Client with support for Tools... |
| 2026-03-06 |
Johannes Gäßler | ggml-cpu: fix data race for debug asserts (#20148) |
| 2026-03-06 |
Georgi Gerganov | kv-cache : fix M-RoPE checkpoints (#20132) |
| 2026-03-06 |
Roj234 | cli : Don't clear system prompt when using '/clear... |
| 2026-03-06 |
lhez | opencl: add neg, exp and diag (#20127) |
| 2026-03-06 |
YardenTal44 | hexagon: add fp16 support for binary ops: add,sub,mul... |