| 2026-01-25 | Jakkala Mahesh | llama: fix integer type consistency in split helpers... |
| 2026-01-25 | Daniel Bevenius | common : use two decimal places for float arg help... |
| 2026-01-25 | Bartowski | convert : fix conversion for inheriting models that... |
| 2026-01-24 | Johannes Gäßler | llama-fit-params: keep explicit --ctx-size 0 (#19070) |
| 2026-01-24 | Johannes Gäßler | GGUF: check that tensor size is representable (#19072) |
| 2026-01-24 | Xuan-Son Nguyen | chat: fix language input for translategemma (#19052) |
| 2026-01-24 | Johannes Gäßler | CUDA: re-use MLA K data for V in MMA FA (#19057) |
| 2026-01-24 | Aman Gupta | ggml-cuda: enable cuda-graphs for `n-cpu-moe` (#18934) |
| 2026-01-24 | nullname | ggml-hexagon: flash-attn opt (#19025) |
| 2026-01-23 | Georgi Gerganov | graph : utilize `ggml_build_forward_select()` to avoid... |
| 2026-01-23 | Neo Zhang | [SYCL] use malloc to support both iGPU and dGPU in... |
| 2026-01-23 | Xuan-Son Nguyen | chat : fix translategemma crash on common_chat_format_e... |
| 2026-01-23 | Daniel Bevenius | model-conversion : use BUILD_DIR variable in all script... |
| 2026-01-23 | Alberto Cabrera... | ggml-cpu: aarm64: q5_K repack gemm and gemv (and generi... |
| 2026-01-23 | Aldehir Rojas | cli : load parser definition (#19031) |
| 2026-01-22 | Xuan-Son Nguyen | server : support preserving reasoning_content in assist... |
| 2026-01-22 | Georgi Gerganov | mla : make the V tensor a view of K (#18986) |
| 2026-01-22 | Johannes Gäßler | CUDA: fix alignment check for FA (#19023) |
| 2026-01-22 | Aman Gupta | convert_hf_to_gguf.py: refactor modify_tensors to call... |
| 2026-01-22 | lhez | opencl: enable the general fp mm for non-cont input... |
| 2026-01-22 | Xuan-Son Nguyen | server: do not log certain endpoints (avoid log spam... |
| 2026-01-22 | Georgi Gerganov | quant : manual overrides of tensor types take precedenc... |
| 2026-01-22 | Aaron Teo | release: update github api (#19022) |
| 2026-01-22 | Xuan-Son Nguyen | mtmd : update docs to use llama_model_n_embd_inp (... |
| 2026-01-22 | 손희준 | server: Reorder methods in `server-task.cpp` (#19016) |
| 2026-01-22 | Aman Gupta | CUDA: add gqa_ratio 4 for GLM 4.7 flash (#18953) |
| 2026-01-22 | shaofeiqi | opencl: add TRI op support (#18979) |
| 2026-01-22 | Aleksei Nikiforov | ggml-zdnn : mark zDNN buffers as non-host (#18967) |
| 2026-01-21 | Pádraic Slattery | ci : update GitHub Actions versions [no ci] (#18935) |
| 2026-01-21 | Mariusz Woloszyn | convert : add Devstral-2 (Ministral3ForCausalLM) arch... |
| 2026-01-21 | Piotr Wilkin... | jinja: support none|string (#18995) |
| 2026-01-21 | Hendrik Erz | fix: Use `tabular-nums` for chat message statistics... |
| 2026-01-21 | Daniel Bevenius | llama : clarify nemotron-h.cpp comment about RoPE ... |
| 2026-01-21 | Jeff Bolz | vulkan: Remove transfer_ctx, do everything in compute_c... |
| 2026-01-21 | Adrien Gallouët | common : improve error message when HTTPS is missing... |
| 2026-01-21 | 손희준 | server: /v1/responses (partial) (#18486) |
| 2026-01-21 | Jeff Bolz | vulkan: support flash attention GQA/split_k with small... |
| 2026-01-21 | Masato Nakasaka | Revert "vulkan: force full subgroups for flash attentio... |
| 2026-01-21 | Jeff Bolz | vulkan: Use mul_mat_vec_id for small values of n (... |
| 2026-01-21 | Tarek Dakhran | memory : add llama_memory_hybrid_iswa (#18601) |
| 2026-01-21 | Piotr Wilkin... | Fix GLM 4.7 Lite MoE gating func (#18980) |
| 2026-01-21 | Matthieu Coudron | gguf: display strerrno when cant load a model (#18884) |
| 2026-01-21 | Oliver Simons | CUDA: Fix builds for older CCCL versions by ifdefing... |
| 2026-01-20 | Adrien Gallouët | common, server : use the same User-Agent by default... |
| 2026-01-20 | Xuan-Son Nguyen | cli : fix reasoning responses in CLI (#18961) |
| 2026-01-20 | Oliver Simons | CUDA: Replace init_offsets kernel with iterators in... |
| 2026-01-20 | Adrien Gallouët | ggml : cleanup path_str() (#18928) |
| 2026-01-20 | Georgi Gerganov | metal : enable FA for MLA heads (#18950) |
| 2026-01-20 | Daniel Bevenius | convert : use n_groups instead of hardcoded values... |
| 2026-01-19 | Xuan-Son Nguyen | server : refactor oai_parser_opt, move it to server_cha... |
| 2026-01-19 | ddh0 | convert : support Glm4MoeLite (#18936) |
| 2026-01-19 | Sigbjørn Skjæret | jinja : fix undefined keys and attributes and int/float... |
| 2026-01-19 | Sigbjørn Skjæret | ci : run test-jinja -py on high perf [no ci] (#18916) |
| 2026-01-19 | Lennart Austenfeld | server: fix memory reservations in populate_token_probs... |
| 2026-01-19 | Georgi Gerganov | ggml : add ggml_build_forward_select (#18550) |
| 2026-01-19 | Daniel Bevenius | model-conversion : add BUILD_DIR variable to run-conver... |
| 2026-01-18 | Julius Tischbein | llama : Extend fallback, fix fileno for dio file, exclu... |
| 2026-01-18 | Francisco Herrera | docs: add linux to index (#18907) |
| 2026-01-18 | Xuan-Son Nguyen | tests : add test-jinja -py option for cross-checking... |
| 2026-01-18 | Sigbjørn Skjæret | jinja : fix object item order (and properly implement... |
| 2026-01-18 | Sigbjørn Skjæret | jinja : attribute support for join, map and sort (... |
| 2026-01-18 | Sigbjørn Skjæret | jinja : add missing tojson filter for bool (#18900) |
| 2026-01-17 | Sigbjørn Skjæret | jinja : fix lexing of float literals with sign (#18901) |
| 2026-01-17 | Xuan-Son Nguyen | jinja: correct member access rule (#18905) |
| 2026-01-17 | lhez | opencl: fix q6_K mv for m=1 (#18893) |
| 2026-01-17 | Sigbjørn Skjæret | ci : add label for jinja changes (#18903) |
| 2026-01-17 | Georgi Gerganov | kv-cache : optimize KQ mask construction (#18842) |
| 2026-01-17 | Reese Levine | ggml webgpu: support for backend sampling (#18880) |
| 2026-01-16 | Thore Koritzius | ggml : extend ggml_pool_1d + metal (#16429) |
| 2026-01-16 | hipudding | docs : update ops.md for CANN backend (#18654) |
| 2026-01-16 | Perry Naseck | ggml-blas: hide warnings from included BLAS headers... |
| 2026-01-16 | Tarek Dakhran | mtmd : Fix ASR for LFM2.5-Audio-1.5B (#18876) |
| 2026-01-16 | Xuan-Son Nguyen | common : implement new jinja template engine (#18462) |
| 2026-01-16 | Julius Tischbein | Setting mmap and direct_io to false as default in llama... |
| 2026-01-16 | Raul Torres | CANN: Remove unused `ggml_cann_get_device` function... |
| 2026-01-16 | Chenguang Li | CANN: fix an issue where get_env was not fully renamed... |
| 2026-01-16 | hipudding | CANN: support gated linear attn (#18653) |
| 2026-01-15 | shaofeiqi | OpenCL: add SOLVE_TRI op support (#18846) |
| 2026-01-15 | Georgi Gerganov | cuda : print less debug logs when disabling cuda graphs... |
| 2026-01-15 | Georgi Gerganov | context : do not reserve scheduler for warmups (#18867) |
| 2026-01-15 | ddh0 | llama : add adaptive-p sampler (#17927) |
| 2026-01-15 | Xuan-Son Nguyen | server: improve slots scheduling for n_cmpl (#18789) |
| 2026-01-15 | Georgi Gerganov | context : reserve new scheduler when graph topology... |
| 2026-01-15 | Johannes Gäßler | CUDA: fix allignment on register spill for FA (#18815) |
| 2026-01-15 | shalinib-ibm | ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (#18837) |
| 2026-01-15 | Xuan-Son Nguyen | lora: make sure model keep track of associated adapters... |
| 2026-01-15 | Sigbjørn Skjæret | model-loader : support bool array sliding window patter... |
| 2026-01-15 | Adrien Gallouët | tests : download models only when running ctest (#18843) |
| 2026-01-15 | Max Krasnyansky | hexagon: support for OP_CPY, host buffers now optional... |
| 2026-01-15 | Oliver Simons | CUDA: Factor out and re-use `block_reduce` function... |
| 2026-01-14 | Piotr Wilkin... | Restore clip's cb() to its rightful glory - extract... |
| 2026-01-14 | Junwon Hwang | model : clean up and fix EXAONE-MoE configuration ... |
| 2026-01-14 | Adrien Gallouët | refactor : remove libcurl, use OpenSSL when available... |
| 2026-01-14 | Jeff Bolz | vulkan: Check maxStorageBufferRange in supports_op... |
| 2026-01-14 | Aman Gupta | llama-model: fix unfortunate typo (#18832) |
| 2026-01-14 | Daniel Bevenius | CUDA : fix typo in clang pragma comment [no ci] (#18830) |
| 2026-01-14 | Ruben Ortlam | vulkan: work around Intel fp16 bug in mmq (#18814) |
| 2026-01-14 | Perry Naseck | ggml-metal: do not copy headers for embedded, use curre... |
| 2026-01-14 | Daniel Benjaminsson | mmap: add Haiku support by skipping RLIMIT_MEMLOCK... |
| 2026-01-14 | Adrien Gallouët | ci, tests : use cmake to download models and remove... |