| 2026-02-06 |
ymcki | Kimi-Linear support (backend agnostic + MLA KV cache... |
| 2026-02-06 |
Jeff Bolz | vulkan: For coopmat2 FA, use fp16 accumulators for... |
| 2026-02-06 |
Jeff Bolz | vulkan: make FA mask/softcap enables spec constants... |
| 2026-02-06 |
Georgi Gerganov | metal : skip loading all-zero mask (#19337) |
| 2026-02-06 |
Daniel Bevenius | llama : rename llama-sampling to llama-sampler (#19363) |
| 2026-02-06 |
Georgi Gerganov | cuda : cuda graphs now compare all node params (#19383) |
| 2026-02-05 |
Georgi Gerganov | metal : adaptive CPU/GPU interleave based on number... |
| 2026-02-05 |
Jeff Bolz | vulkan: Preprocess FA mask to detect all-neg-inf and... |
| 2026-02-05 |
Georgi Gerganov | benches : update models + numbers (#19359) |
| 2026-02-05 |
Sigbjørn Skjæret | docker : fix vulkan build (#19352) |
| 2026-02-05 |
Adrien Gallouët | vendor : update BoringSSL to 0.20260204.0 (#19333) |
| 2026-02-05 |
Georgi Gerganov | metal : add diag (#19330) |
| 2026-02-05 |
Oleksandr Kuvshynov | vulkan: fix GPU deduplication logic. (#19222) |
| 2026-02-05 |
Jeff Bolz | vulkan: Set k_load_shmem to false when K is too large... |
| 2026-02-05 |
Jeff Bolz | vulkan: fix non-contig rope (#19299) |
| 2026-02-05 |
will-lms | metal : add missing includes (#19348) |
| 2026-02-05 |
Sigbjørn Skjæret | vendor : add missing llama_add_compile_flags (#19322) |
| 2026-02-04 |
Aaron Teo | vendor: update cpp-httplib version (#19313) |
| 2026-02-04 |
Daniel Bevenius | codeowners : add danbev for examples/debug (#19332) |
| 2026-02-04 |
Xuan-Son Nguyen | debug: make common_debug_print_tensor readable (#19331) |
| 2026-02-04 |
Georgi Gerganov | ci : fix sanitize workflow to enable ggml sanitizers... |
| 2026-02-04 |
Xuan-Son Nguyen | model: (qwen3next) correct vectorized key_gdiff calcula... |
| 2026-02-04 |
Georgi Gerganov | tests : add non-cont, inplace rope tests (#19296) |
| 2026-02-04 |
Daniel Bevenius | model-conversion : add tensor-info.py utility (#18954) |
| 2026-02-04 |
Georgi Gerganov | spec : fix the check-rate logic of ngram-simple (#19261) |
| 2026-02-04 |
Daniel Bevenius | completion : simplify batch (embd) processing (#19286) |
| 2026-02-04 |
Kevin Pouget | ggml-virtgpu: make the code thread safe (#19204) |
| 2026-02-04 |
Aman Gupta | ggml-cpu: use LUT for converting e8->f32 scales on... |
| 2026-02-03 |
Georgi Gerganov | metal : add solve_tri (#19302) |
| 2026-02-03 |
Georgi Gerganov | ci : add sanitizer runs for server (#19291) |
| 2026-02-03 |
Georgi Gerganov | sampling : delegate input allocation to the scheduler... |
| 2026-02-03 |
Ruben Ortlam | vulkan: disable coopmat1 fa on Nvidia Turing (#19290) |
| 2026-02-03 |
Aman Gupta | CUDA: use mmvq for mul-mat-id for small batch sizes... |
| 2026-02-03 |
Sigbjørn Skjæret | models : remove unnecessary cont in openelm (#19289) |
| 2026-02-03 |
Georgi Gerganov | metal : minor cleanup (#19251) |
| 2026-02-03 |
Oliver Simons | CUDA: Fix loop unrolling for BW in mul_mat_q_stream_k_f... |
| 2026-02-03 |
George | ggml: added cleanups in ggml_quantize_free (#19278) |
| 2026-02-03 |
Gaurav Garg | cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until... |
| 2026-02-03 |
Alexey Dubrov | vocab: add Falcon-H1-Tiny-Coder FIM tokens (#19249) |
| 2026-02-03 |
Georgi Gerganov | spec : simplify time measurement using common_time_meas... |
| 2026-02-02 |
lhez | opencl: refactor some ops, concat, repeat, tanh and... |
| 2026-02-02 |
Sid Mohan | jinja : add missing 'in' test to template engine (... |
| 2026-02-02 |
Xuan-Son Nguyen | mtmd: add min/max pixels gguf metadata (#19273) |
| 2026-02-02 |
Aman Gupta | ggml-cpu: FA split across kv for faster TG (#19209) |
| 2026-02-02 |
Matthieu Coudron | server: print actual model name in 'model not found... |
| 2026-02-02 |
Aman Gupta | ci: add test-backend-ops test for CPU (#19268) |
| 2026-02-02 |
Neo Zhang | Remove support for Nvidia & AMD GPU, because the oneAPI... |
| 2026-02-02 |
Tamar | sycl: implement GGML_OP_TOP_K (#19242) |
| 2026-02-02 |
Georgi Gerganov | metal : support virtual devices (#18919) |
| 2026-02-02 |
Daniel Bevenius | model-conversion : add debug option to conversion scrip... |
| 2026-02-02 |
Johannes Gäßler | ggml-backend: fix async set/get fallback sync (#19179) |
| 2026-02-02 |
Georgi Gerganov | authors : update (#19263) |
| 2026-02-02 |
Christian Kastner | docs : Minor cleanups (#19252) |
| 2026-02-02 |
Sascha Rogmann | spec : various improvements to ngram-map + docs (... |
| 2026-02-02 |
Nikhil Jain | Remove pipeline cache mutexes (#19195) |
| 2026-02-01 |
Max Krasnyansky | Bump cmake max version (needed for Windows on Snapdrago... |
| 2026-02-01 |
Alexis Williams | nix: fix allowUnfreePredicate for packages with multipl... |
| 2026-02-01 |
Neo Zhang | create test.sh to enhance the parameters for testing... |
| 2026-01-31 |
Matthieu Coudron | nix: fix nix develop .#python-scripts (#19218) |
| 2026-01-31 |
nullname | ggml-hexagon: flash-attention and reduce-sum optimizati... |
| 2026-01-31 |
EugeoSynthesisThirtyTwo | quantize: add option --tensor-type-file to llama-quanti... |
| 2026-01-30 |
tc-mb | mtmd: support MiniCPM-o 4.5(vision only) (#19211) |
| 2026-01-30 |
Daniele Pinna | lookup, lookahead: fix crash when n_ctx not specified... |
| 2026-01-30 |
Georgi Gerganov | ngram-mod : fix build [no ci] (#19216) |
| 2026-01-30 |
shaofeiqi | opencl: add optimized q8_0 mm kernel for adreno (#18871) |
| 2026-01-30 |
Georgi Gerganov | sync : ggml |
| 2026-01-30 |
Georgi Gerganov | cuda : fix compile warnings (whisper/0) |
| 2026-01-30 |
Georgi Gerganov | server : wrap around the "id_slot" parameter (#19207) |
| 2026-01-30 |
Simon Redman | Correctly fetch q8_1 quantize pipeline in test as neede... |
| 2026-01-30 |
Georgi Gerganov | spec : add ngram-mod (#19164) |
| 2026-01-30 |
Marcello Seri | jinja : add unordered_map include to value.h [no ci... |
| 2026-01-30 |
Daniel Bevenius | memory : clarify comments for r_l and s_l tensors ... |
| 2026-01-30 |
Georgi Gerganov | tests : add GQA=20 FA test (#19095) |
| 2026-01-30 |
Daniel Bevenius | convert : add missing return statement for GraniteMoeMo... |
| 2026-01-30 |
Daniel Bevenius | memory : remove unused tmp_buf (#19199) |
| 2026-01-30 |
Antonis Makropoulos | docs: Add LlamaLib to UI projects (#19181) |
| 2026-01-30 |
bssrdf | add tensor type checking as part of cuda graph properti... |
| 2026-01-30 |
s8322 | sycl: implement GGML_UNARY_OP_SOFTPLUS (#19114) |
| 2026-01-30 |
RachelMantel | sycl: implement GGML_OP_TRI (#19089) |
| 2026-01-30 |
DDXDB | Fix typos in SYCL documentation (#19162) |
| 2026-01-29 |
Zheyuan Chen | ggml-webgpu: improve flashAttention performance by... |
| 2026-01-29 |
Todor Boinovski | hexagon: enable offloading to Hexagon on Windows on... |
| 2026-01-29 |
Georgi Gerganov | cuda : fix nkvo, offload and cuda graph node properties... |
| 2026-01-29 |
Aldehir Rojas | chat : add parsing for solar-open-100b (#18540) |
| 2026-01-29 |
Andrew Marshall | webui: Update Svelte to fix effect_update_depth_exceede... |
| 2026-01-29 |
Sigbjørn Skjæret | jinja : do not pass empty tools and add some none filte... |
| 2026-01-29 |
yulo | HIP: add mmf for CDNA (#18896) |
| 2026-01-29 |
Georgi Gerganov | arg : add -kvu to llama-batched-bench (#19172) |
| 2026-01-29 |
Vishal Singh | ggml-zendnn : resolve ZenDNN backend cross-module symbo... |
| 2026-01-29 |
Aman Gupta | CUDA: refactor topk-moe to enable more models (GLM... |
| 2026-01-29 |
Neo Zhang | sycl: fix norm kernels: l2_norm, group_norm, rms_norm... |
| 2026-01-28 |
Sigbjørn Skjæret | ci : find latest release with asset for winget (#19161) |
| 2026-01-28 |
Ruben Ortlam | Vulkan Flash Attention Coopmat1 Refactor (#19075) |
| 2026-01-28 |
Sascha Rogmann | spec : add self‑speculative decoding (no draft model... |
| 2026-01-28 |
Daniel Bevenius | convert : yield Mamba2Model/GraniteMoeModel modify_tens... |
| 2026-01-28 |
Patryk Kaminski | ggml-sycl: remove unused syclcompat header (#19140) |
| 2026-01-28 |
Sigbjørn Skjæret | jinja : undefined should be treated as sequence/iterabl... |
| 2026-01-28 |
Oleksandr Kuvshynov | vulkan: handle device dedup on MacOS + Vega II Duo... |
| 2026-01-28 |
Ben Chen | doc: add build instruction to use Vulkan backend on... |
| 2026-01-28 |
Kevin Pouget | ggml: new backend for Virglrenderer API Remoting accele... |