| 2026-01-22 |
Aleksei Nikiforov | ggml-zdnn : mark zDNN buffers as non-host (#18967) |
| 2026-01-21 |
Pádraic Slattery | ci : update GitHub Actions versions [no ci] (#18935) |
| 2026-01-21 |
Mariusz Woloszyn | convert : add Devstral-2 (Ministral3ForCausalLM) arch... |
| 2026-01-21 |
Piotr Wilkin... | jinja: support none|string (#18995) |
| 2026-01-21 |
Hendrik Erz | fix: Use `tabular-nums` for chat message statistics... |
| 2026-01-21 |
Daniel Bevenius | llama : clarify nemotron-h.cpp comment about RoPE ... |
| 2026-01-21 |
Jeff Bolz | vulkan: Remove transfer_ctx, do everything in compute_c... |
| 2026-01-21 |
Adrien Gallouët | common : improve error message when HTTPS is missing... |
| 2026-01-21 |
손희준 | server: /v1/responses (partial) (#18486) |
| 2026-01-21 |
Jeff Bolz | vulkan: support flash attention GQA/split_k with small... |
| 2026-01-21 |
Masato Nakasaka | Revert "vulkan: force full subgroups for flash attentio... |
| 2026-01-21 |
Jeff Bolz | vulkan: Use mul_mat_vec_id for small values of n (... |
| 2026-01-21 |
Tarek Dakhran | memory : add llama_memory_hybrid_iswa (#18601) |
| 2026-01-21 |
Piotr Wilkin... | Fix GLM 4.7 Lite MoE gating func (#18980) |
| 2026-01-21 |
Matthieu Coudron | gguf: display strerrno when cant load a model (#18884) |
| 2026-01-21 |
Oliver Simons | CUDA: Fix builds for older CCCL versions by ifdefing... |
| 2026-01-20 |
Adrien Gallouët | common, server : use the same User-Agent by default... |
| 2026-01-20 |
Xuan-Son Nguyen | cli : fix reasoning responses in CLI (#18961) |
| 2026-01-20 |
Oliver Simons | CUDA: Replace init_offsets kernel with iterators in... |
| 2026-01-20 |
Adrien Gallouët | ggml : cleanup path_str() (#18928) |
| 2026-01-20 |
Georgi Gerganov | metal : enable FA for MLA heads (#18950) |
| 2026-01-20 |
Daniel Bevenius | convert : use n_groups instead of hardcoded values... |
| 2026-01-19 |
Xuan-Son Nguyen | server : refactor oai_parser_opt, move it to server_cha... |
| 2026-01-19 |
ddh0 | convert : support Glm4MoeLite (#18936) |
| 2026-01-19 |
Sigbjørn Skjæret | jinja : fix undefined keys and attributes and int/float... |
| 2026-01-19 |
Sigbjørn Skjæret | ci : run test-jinja -py on high perf [no ci] (#18916) |
| 2026-01-19 |
Lennart Austenfeld | server: fix memory reservations in populate_token_probs... |
| 2026-01-19 |
Georgi Gerganov | ggml : add ggml_build_forward_select (#18550) |
| 2026-01-19 |
Daniel Bevenius | model-conversion : add BUILD_DIR variable to run-conver... |
| 2026-01-18 |
Julius Tischbein | llama : Extend fallback, fix fileno for dio file, exclu... |
| 2026-01-18 |
Francisco Herrera | docs: add linux to index (#18907) |
| 2026-01-18 |
Xuan-Son Nguyen | tests : add test-jinja -py option for cross-checking... |
| 2026-01-18 |
Sigbjørn Skjæret | jinja : fix object item order (and properly implement... |
| 2026-01-18 |
Sigbjørn Skjæret | jinja : attribute support for join, map and sort (... |
| 2026-01-18 |
Sigbjørn Skjæret | jinja : add missing tojson filter for bool (#18900) |
| 2026-01-17 |
Sigbjørn Skjæret | jinja : fix lexing of float literals with sign (#18901) |
| 2026-01-17 |
Xuan-Son Nguyen | jinja: correct member access rule (#18905) |
| 2026-01-17 |
lhez | opencl: fix q6_K mv for m=1 (#18893) |
| 2026-01-17 |
Sigbjørn Skjæret | ci : add label for jinja changes (#18903) |
| 2026-01-17 |
Georgi Gerganov | kv-cache : optimize KQ mask construction (#18842) |
| 2026-01-17 |
Reese Levine | ggml webgpu: support for backend sampling (#18880) |
| 2026-01-16 |
Thore Koritzius | ggml : extend ggml_pool_1d + metal (#16429) |
| 2026-01-16 |
hipudding | docs : update ops.md for CANN backend (#18654) |
| 2026-01-16 |
Perry Naseck | ggml-blas: hide warnings from included BLAS headers... |
| 2026-01-16 |
Tarek Dakhran | mtmd : Fix ASR for LFM2.5-Audio-1.5B (#18876) |
| 2026-01-16 |
Xuan-Son Nguyen | common : implement new jinja template engine (#18462) |
| 2026-01-16 |
Julius Tischbein | Setting mmap and direct_io to false as default in llama... |
| 2026-01-16 |
Raul Torres | CANN: Remove unused `ggml_cann_get_device` function... |
| 2026-01-16 |
Chenguang Li | CANN: fix an issue where get_env was not fully renamed... |
| 2026-01-16 |
hipudding | CANN: support gated linear attn (#18653) |
| 2026-01-15 |
shaofeiqi | OpenCL: add SOLVE_TRI op support (#18846) |
| 2026-01-15 |
Georgi Gerganov | cuda : print less debug logs when disabling cuda graphs... |
| 2026-01-15 |
Georgi Gerganov | context : do not reserve scheduler for warmups (#18867) |
| 2026-01-15 |
ddh0 | llama : add adaptive-p sampler (#17927) |
| 2026-01-15 |
Xuan-Son Nguyen | server: improve slots scheduling for n_cmpl (#18789) |
| 2026-01-15 |
Georgi Gerganov | context : reserve new scheduler when graph topology... |
| 2026-01-15 |
Johannes Gäßler | CUDA: fix allignment on register spill for FA (#18815) |
| 2026-01-15 |
shalinib-ibm | ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (#18837) |
| 2026-01-15 |
Xuan-Son Nguyen | lora: make sure model keep track of associated adapters... |
| 2026-01-15 |
Sigbjørn Skjæret | model-loader : support bool array sliding window patter... |
| 2026-01-15 |
Adrien Gallouët | tests : download models only when running ctest (#18843) |
| 2026-01-15 |
Max Krasnyansky | hexagon: support for OP_CPY, host buffers now optional... |
| 2026-01-15 |
Oliver Simons | CUDA: Factor out and re-use `block_reduce` function... |
| 2026-01-14 |
Piotr Wilkin... | Restore clip's cb() to its rightful glory - extract... |
| 2026-01-14 |
Junwon Hwang | model : clean up and fix EXAONE-MoE configuration ... |
| 2026-01-14 |
Adrien Gallouët | refactor : remove libcurl, use OpenSSL when available... |
| 2026-01-14 |
Jeff Bolz | vulkan: Check maxStorageBufferRange in supports_op... |
| 2026-01-14 |
Aman Gupta | llama-model: fix unfortunate typo (#18832) |
| 2026-01-14 |
Daniel Bevenius | CUDA : fix typo in clang pragma comment [no ci] (#18830) |
| 2026-01-14 |
Ruben Ortlam | vulkan: work around Intel fp16 bug in mmq (#18814) |
| 2026-01-14 |
Perry Naseck | ggml-metal: do not copy headers for embedded, use curre... |
| 2026-01-14 |
Daniel Benjaminsson | mmap: add Haiku support by skipping RLIMIT_MEMLOCK... |
| 2026-01-14 |
Adrien Gallouët | ci, tests : use cmake to download models and remove... |
| 2026-01-13 |
ddh0 | llama : print_info alignment fix (#18708) |
| 2026-01-13 |
Junwon Hwang | model : add EXAONE MoE (#18543) |
| 2026-01-13 |
Georgi Gerganov | vocab : fix attribute overrides for harmony (#18806) |
| 2026-01-13 |
Ruben Ortlam | llama-mmap: fix direct-io loading fallback EOF exceptio... |
| 2026-01-13 |
Daniel Bevenius | model-conversion : remove -c 0 from model card template... |
| 2026-01-13 |
yulo | HIP: add fattn-mma-f16 for RDNA4 (#18481) |
| 2026-01-13 |
Johannes Gäßler | doc: ban AI-generated PR descriptions [no ci] (#18765) |
| 2026-01-13 |
Xuan-Son Nguyen | mtmd: fix use_non_causal being reported incorrectly... upstream/0.0.7721 |
| 2026-01-13 |
Georgi Gerganov | CUDA : fix unused argument when USE_CUDA_GRAPH=OFF... |
| 2026-01-13 |
Gabe Goodhart | graph : clean up t5 input builders (#18795) |
| 2026-01-13 |
Ruben Ortlam | llama-bench: add direct_io parameter (#18778) |
| 2026-01-12 |
Adrien Gallouët | ci : remove libcurl in releases (#18775) |
| 2026-01-12 |
Radoslav Gerganov | server : add arg for disabling prompt caching (#18776) |
| 2026-01-12 |
Adrien Gallouët | ci : use openssl for openEuler-latest-cmake-cann (... |
| 2026-01-12 |
Adrien Gallouët | vendor : update cpp-httplib to 0.30.1 (#18771) |
| 2026-01-12 |
Daniel Bevenius | examples : add --kv-unified to batched example (#18774) |
| 2026-01-12 |
Jeff Bolz | vulkan: change memory_logger to be controlled by an... |
| 2026-01-12 |
Xuan-Son Nguyen | server: update docs for sleeping [no ci] (#18777) |
| 2026-01-12 |
Jeff Bolz | vulkan: Use VK_EXT_shader_64bit_indexing to handle... |
| 2026-01-12 |
Ruben Ortlam | vulkan: Disable large coopmat matmul configuration... |
| 2026-01-11 |
Xuan-Son Nguyen | model: fix qwen3next broken due to #18683 (#18762) |
| 2026-01-11 |
Ruben Ortlam | Vulkan: Optimize Matmul parameters for AMD GPUs with... |
| 2026-01-11 |
Xuan-Son Nguyen | security: make it clear about subtopics in server ... |
| 2026-01-11 |
Daniel Bevenius | debug : include LLAMA_POOLING_TYPE_UNSPECIFIED in pooli... |
| 2026-01-11 |
Georgi Gerganov | tests : refactor test-backend-sampler (#18753) |
| 2026-01-11 |
Xuan-Son Nguyen | model: try to improve Qwen3 Next (#18683) |
| 2026-01-11 |
thom-dev-fr | readme : update UIs (#18751) |