| 2026-02-12 | Aleksander... | WebUI Architecture Cleanup (#19541) |
| 2026-02-12 | Georgi Gerganov | metal : update sum_rows kernel to support float4 (... |
| 2026-02-12 | Mario Limonciello | Add a workaround for compilation with ROCWMMA_FATTN... |
| 2026-02-12 | RichardScottOZ | server : fix typo in README.md for features list (... |
| 2026-02-12 | TriDefender | docs : update path in snapdragon README.md (#19533) |
| 2026-02-12 | Max Krasnyansky | hexagon: further optimization and tuning of matmul... |
| 2026-02-12 | Adrien Gallouët | common : replace deprecated codecvt using parse_utf8_co... |
| 2026-02-11 | lhez | opencl: add general Q6_K mm and Q4_K mv (#19347) |
| 2026-02-11 | Georgi Gerganov | ggml : unary ops support non-cont src0 + metal F16... |
| 2026-02-11 | Daniel Bevenius | common : remove unused token util functions (#19506) |
| 2026-02-11 | AesSedai | model: Add Kimi-K2.5 support (#19170) |
| 2026-02-11 | Daniel Bevenius | build : fix case in dSYMs path for build-macos [no... |
| 2026-02-11 | Georgi Gerganov | metal : extend l2_norm support for non-cont src0 (... |
| 2026-02-11 | Johannes Gäßler | docs: ban AI for issues and discussions [no CI] (#19512) |
| 2026-02-11 | Adrien Gallouët | common : improve download error reporting (#19491) |
| 2026-02-11 | Max Krasnyansky | hexagon: Add ARGSORT, DIV, SQR, SQRT, SUM_ROWS, GEGLU... |
| 2026-02-11 | thecaptain789 | llama : correct typos 'occured' and 'occurences' (... |
| 2026-02-11 | Georgi Gerganov | model : fix wavtokenizer embedding notions (#19479) |
| 2026-02-11 | Georgi Gerganov | ggml : extend bin bcast for permuted src1 (#19484) |
| 2026-02-11 | Georgi Gerganov | metal : consolidate unary ops (#19490) |
| 2026-02-11 | Daniel Bevenius | llama : refactor sampling_info to use buffer_view templ... |
| 2026-02-10 | Oliver Simons | CUDA : Update CCCL-tag for 3.2 to final release from... |
| 2026-02-10 | Nikhil Jain | [WebGPU] Plug memory leaks and free resources on shutdo... |
| 2026-02-10 | JJJYmmm | models : support qwen3.5 series (#19468) |
| 2026-02-10 | Xuan-Son Nguyen | test: fix IMROPE perf test case (#19465) |
| 2026-02-10 | Alberto Cabrera... | ggml-cpu: arm64: q6_K repack gemm and gemv (and generic... |
| 2026-02-10 | k4ss4n | ggml : use noexcept overload for is_regular_file in... |
| 2026-02-10 | Piotr Wilkin... | convert : move experts permutation from Qwen2MoeModel... |
| 2026-02-10 | Daniel Bevenius | tts : fix typos in README.md [no ci] (#19463) |
| 2026-02-10 | Raul Torres | CANN: Remove unnecessary wrapper for `gml_backend_buft_... |
| 2026-02-10 | hipudding | CANN: implement quantized MUL_MAT_ID for MoE models... |
| 2026-02-10 | Georgi Gerganov | cuda : extend GGML_OP_PAD to work with non-cont src0... |
| 2026-02-09 | Xuan-Son Nguyen | chat: fix case where template accepts type content... |
| 2026-02-09 | Tarek Dakhran | mtmd: Implement tiling for LFM2-VL (#19454) |
| 2026-02-09 | 손희준 | Server: log when converting requests to chat completion... |
| 2026-02-09 | Sascha Rogmann | spec : remove check rate (#19377) |
| 2026-02-09 | Georgi Gerganov | ci : add metal server workflows (#19293) |
| 2026-02-09 | Georgi Gerganov | revert : "[Model] Qwen3.5 dense and MoE support (no... |
| 2026-02-09 | Kevin Pouget | ggml-virtgpu: add backend documentation (#19354) |
| 2026-02-09 | Hugo | cmake : add variable to skip installing tests (#19370) |
| 2026-02-08 | Piotr Wilkin... | [Model] Qwen3.5 dense and MoE support (no vision) ... |
| 2026-02-08 | Oliver Simons | CUDA: Fix non-contig rope (#19338) |
| 2026-02-08 | Adrien Gallouët | rpc : update from common.cpp (#19400) |
| 2026-02-08 | Georgi Gerganov | server : improve context checkpoint logic (#19408) |
| 2026-02-08 | ddh0 | llama-quantize : cleanup `--help` output (#19317) |
| 2026-02-08 | Sigbjørn Skjæret | ci : remove server job from webui and move slow test... |
| 2026-02-07 | Georgi Gerganov | ci : use -j param correctly when building with sanitize... |
| 2026-02-07 | Georgi Gerganov | metal : consolidate bin kernels (#19390) |
| 2026-02-07 | Georgi Gerganov | metal : fix event synchronization in cpy_tensor_async... |
| 2026-02-06 | forforever73 | model : support Step3.5-Flash (#19283) |
| 2026-02-06 | Alex Trotta | gguf-py : bump sentencepiece version (#19319) |
| 2026-02-06 | Abhijit Ramesh | ggml-webgpu: JIT compile binary operators and handle... |
| 2026-02-06 | Nechama Krashinski | sycl: add F16 support for GGML_OP_CEIL (#19306) |
| 2026-02-06 | Jeff Bolz | tests: reduce number of FA test permutations (#19381) |
| 2026-02-06 | Georgi Gerganov | common : add common_speculative_is_compat() (#19270) |
| 2026-02-06 | Lasse Lauwerys | unicode : MSVC regex fix (#19340) |
| 2026-02-06 | ymcki | Kimi-Linear support (backend agnostic + MLA KV cache... |
| 2026-02-06 | Jeff Bolz | vulkan: For coopmat2 FA, use fp16 accumulators for... |
| 2026-02-06 | Jeff Bolz | vulkan: make FA mask/softcap enables spec constants... |
| 2026-02-06 | Georgi Gerganov | metal : skip loading all-zero mask (#19337) |
| 2026-02-06 | Daniel Bevenius | llama : rename llama-sampling to llama-sampler (#19363) |
| 2026-02-06 | Georgi Gerganov | cuda : cuda graphs now compare all node params (#19383) |
| 2026-02-05 | Georgi Gerganov | metal : adaptive CPU/GPU interleave based on number... |
| 2026-02-05 | Jeff Bolz | vulkan: Preprocess FA mask to detect all-neg-inf and... |
| 2026-02-05 | Georgi Gerganov | benches : update models + numbers (#19359) |
| 2026-02-05 | Sigbjørn Skjæret | docker : fix vulkan build (#19352) |
| 2026-02-05 | Adrien Gallouët | vendor : update BoringSSL to 0.20260204.0 (#19333) |
| 2026-02-05 | Georgi Gerganov | metal : add diag (#19330) |
| 2026-02-05 | Oleksandr Kuvshynov | vulkan: fix GPU deduplication logic. (#19222) |
| 2026-02-05 | Jeff Bolz | vulkan: Set k_load_shmem to false when K is too large... |
| 2026-02-05 | Jeff Bolz | vulkan: fix non-contig rope (#19299) |
| 2026-02-05 | will-lms | metal : add missing includes (#19348) |
| 2026-02-05 | Sigbjørn Skjæret | vendor : add missing llama_add_compile_flags (#19322) |
| 2026-02-04 | Aaron Teo | vendor: update cpp-httplib version (#19313) |
| 2026-02-04 | Daniel Bevenius | codeowners : add danbev for examples/debug (#19332) |
| 2026-02-04 | Xuan-Son Nguyen | debug: make common_debug_print_tensor readable (#19331) |
| 2026-02-04 | Georgi Gerganov | ci : fix sanitize workflow to enable ggml sanitizers... |
| 2026-02-04 | Xuan-Son Nguyen | model: (qwen3next) correct vectorized key_gdiff calcula... |
| 2026-02-04 | Georgi Gerganov | tests : add non-cont, inplace rope tests (#19296) |
| 2026-02-04 | Daniel Bevenius | model-conversion : add tensor-info.py utility (#18954) |
| 2026-02-04 | Georgi Gerganov | spec : fix the check-rate logic of ngram-simple (#19261) |
| 2026-02-04 | Daniel Bevenius | completion : simplify batch (embd) processing (#19286) |
| 2026-02-04 | Kevin Pouget | ggml-virtgpu: make the code thread safe (#19204) |
| 2026-02-04 | Aman Gupta | ggml-cpu: use LUT for converting e8->f32 scales on... |
| 2026-02-03 | Georgi Gerganov | metal : add solve_tri (#19302) |
| 2026-02-03 | Georgi Gerganov | ci : add sanitizer runs for server (#19291) |
| 2026-02-03 | Georgi Gerganov | sampling : delegate input allocation to the scheduler... |
| 2026-02-03 | Ruben Ortlam | vulkan: disable coopmat1 fa on Nvidia Turing (#19290) |
| 2026-02-03 | Aman Gupta | CUDA: use mmvq for mul-mat-id for small batch sizes... |
| 2026-02-03 | Sigbjørn Skjæret | models : remove unnecessary cont in openelm (#19289) |
| 2026-02-03 | Georgi Gerganov | metal : minor cleanup (#19251) |
| 2026-02-03 | Oliver Simons | CUDA: Fix loop unrolling for BW in mul_mat_q_stream_k_f... |
| 2026-02-03 | George | ggml: added cleanups in ggml_quantize_free (#19278) |
| 2026-02-03 | Gaurav Garg | cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until... |
| 2026-02-03 | Alexey Dubrov | vocab: add Falcon-H1-Tiny-Coder FIM tokens (#19249) |
| 2026-02-03 | Georgi Gerganov | spec : simplify time measurement using common_time_meas... |
| 2026-02-02 | lhez | opencl: refactor some ops, concat, repeat, tanh and... |
| 2026-02-02 | Sid Mohan | jinja : add missing 'in' test to template engine (... |
| 2026-02-02 | Xuan-Son Nguyen | mtmd: add min/max pixels gguf metadata (#19273) |
| 2026-02-02 | Aman Gupta | ggml-cpu: FA split across kv for faster TG (#19209) |