| 2025-12-10 | Xuan-Son Nguyen | mtmd: some small clean up (#17909) |
| 2025-12-10 | Xuan-Son Nguyen | cli: enable jinja by default (#17911) |
| 2025-12-10 | Pascal | server: add presets (config) when using multiple models... |
| 2025-12-10 | Max Krasnyansky | Fix race conditions in threadpool when dealing with... |
| 2025-12-10 | Georgi Gerganov | ggml : remove GGML_KQ_MASK_PAD constant (#17910) |
| 2025-12-10 | Sigbjørn Skjæret | cuda : add missing support check for xielu (#17895) |
| 2025-12-10 | Xuan-Son Nguyen | cli: new CLI experience (#17824) |
| 2025-12-10 | Eric Zhang | model : Qwen3-Next-80B-A3B has 48 layers (#17898) |
| 2025-12-10 | lhez | docs : update opencl ops (#17904) |
| 2025-12-10 | Johannes Gäßler | CUDA: fix unpadded strides in MMA FA kernel (#17891) |
| 2025-12-10 | Xuan-Son Nguyen | convert: allow using quantized Mistral weight (#17889) |
| 2025-12-10 | Neo Zhang Jianyu | fix softmax for iGPU (#17838) |
| 2025-12-09 | Aldehir Rojas | common : add parser for ministral/mistral large 3/devst... |
| 2025-12-09 | Sigbjørn Skjæret | docs : update cpu and cuda ops (#17890) |
| 2025-12-09 | Gabe Goodhart | metal: SSM kernel improvements (#17876) |
| 2025-12-09 | Piotr Wilkin... | Add DIAG for CUDA (#17873) |
| 2025-12-09 | Johannes Gäßler | docs: clarify that CPU support should be first (#17886) |
| 2025-12-09 | Gabe Goodhart | ggml : Provide macos-specific backtrace printing to... |
| 2025-12-09 | Georgi Gerganov | metal : print node names for debugging (#17882) |
| 2025-12-09 | Sigbjørn Skjæret | ggml : allow fill node alloc inplace (#17870) |
| 2025-12-09 | Rhys-T | cmake: fix Mach-O current version number (#17877) |
| 2025-12-09 | Sigbjørn Skjæret | model : nit, DeepSeek V1 MoE is 16B and GigaChat is... |
| 2025-12-09 | Xuan-Son Nguyen | console: allow using arrow left/right, home/end keys... |
| 2025-12-09 | Chenguang Li | CANN: add support for partial RoPE and Vision mode... |
| 2025-12-09 | Johannes Gäßler | CUDA: fix FP16 overflow in tile FA kernel (#17875) |
| 2025-12-09 | Aldehir Rojas | llama : add token matching support to llama-grammar... |
| 2025-12-09 | philip-essential | model : support Rnj-1 (#17811) |
| 2025-12-08 | Sigbjørn Skjæret | graph : use fill instead of scale_bias in grouped exper... |
| 2025-12-08 | Daniel Bevenius | model-conversion : add token ids to prompt token output... |
| 2025-12-08 | Xuan-Son Nguyen | server: delegate result_state creation to server_task... |
| 2025-12-08 | Neo Zhang | ci : support bfloat16 SYCL release package (#17855) |
| 2025-12-08 | Xuan-Son Nguyen | server: improve speed of speculative decoding (#17808) |
| 2025-12-08 | Piotr Wilkin... | Make graph_max_nodes vary by ubatch size (#17794) |
| 2025-12-08 | hksdpc255 | Fix Kimi-K2 tool-call parsing issues (#17376) |
| 2025-12-08 | Jay Zenith | cuda : add FILL op support (#17851) |
| 2025-12-08 | Xuan-Son Nguyen | server : add development documentation (#17760) |
| 2025-12-08 | Georgi Gerganov | server : make cache_reuse configurable per request... |
| 2025-12-08 | wsbagnsv1 | cuda: optimize SOLVE_TRI using registers and FMAF ... |
| 2025-12-08 | ixgbe | ggml-cpu: add ggml_thread_cpu_relax with Zihintpause... |
| 2025-12-07 | Xuan-Son Nguyen | model: add llama 4 scaling for mistral-large (deepseek... |
| 2025-12-07 | lovedheart | Vulkan: improve mul_mat_vec_iq1_m (#16907) |
| 2025-12-07 | Sigbjørn Skjæret | ci : add windows-cuda 13.1 release (#17839) |
| 2025-12-07 | Sigbjørn Skjæret | common : change --color to accept on/off/auto, default... |
| 2025-12-07 | Law Po Ying | sycl: add missing BF16 conversion support for Intel... |
| 2025-12-06 | Jeff Bolz | vulkan: perf_logger improvements (#17672) |
| 2025-12-06 | Vishal Singh | ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) |
| 2025-12-06 | Xuan-Son Nguyen | server: support multiple generations from one prompt... |
| 2025-12-06 | Phylliida Dev | ggml : add circular tiling support to pad, for Vulkan... |
| 2025-12-06 | Johannes Gäßler | HIP: fix RDNA3 FP16/BF16 matrix multiplication (#17817) |
| 2025-12-06 | Aleksander... | webui: Stop generation from chat sidebar (#17806) |
| 2025-12-06 | Aleksander... | webui: Fix context available value in Multi-model Route... |
| 2025-12-06 | Aleksander... | webui: Per-conversation system message with UI displayi... |
| 2025-12-06 | Sky | ggml : improve error handling for search path existence... |
| 2025-12-06 | Daniel Bevenius | llama : remove quantization sanity check (#17788) |
| 2025-12-06 | Jeff Bolz | vulkan: Use one row per workgroup for f32 mmv (#17711) |
| 2025-12-06 | Xuan-Son Nguyen | convert: support Mistral 3 Large MoE (#17730) |
| 2025-12-06 | Jeff Bolz | vulkan: support solve_tri with larger N/K values (... |
| 2025-12-06 | Georgi Gerganov | contrib : stale PRs (#17803) |
| 2025-12-06 | Georgi Gerganov | metal : fix build(#17799) |
| 2025-12-06 | Masato Nakasaka | vulkan: Replace deprecated VK_EXT_validation_features... |
| 2025-12-06 | Masato Nakasaka | vulkan: Fix mismatch in TOPK_MOE unit test (#17541) |
| 2025-12-05 | Jeff Bolz | vulkan: add more num_blocks instantiations in rms_norm... |
| 2025-12-05 | Jeff Bolz | vulkan: fix top_k bug when there are ties in the input... |
| 2025-12-05 | Acly | vulkan : support conv-2d with large output size (#17685) |
| 2025-12-05 | Reese Levine | ggml webgpu: unary op suppport, code refactoring, ops... |
| 2025-12-05 | Jeff Bolz | vulkan: enable mmvq for q2_k on NVIDIA (#17675) |
| 2025-12-05 | Jeff Bolz | vulkan: set all memory allocations to high priority... |
| 2025-12-05 | Georgi Gerganov | rpc : fix alloc size logic (#17116) |
| 2025-12-05 | Georgi Gerganov | metal : add residency sets keep-alive heartbeat (#17766) |
| 2025-12-05 | Johannes Gäßler | HIP : fix RDNA4 build (#17792) |
| 2025-12-05 | Pascal | fix: prevent segfault in tokenizer on highly repetitive... |
| 2025-12-05 | Adrien Gallouët | ci : fix winget workflow (#17790) |
| 2025-12-05 | shalinib-ibm | Q4/Q8 Tiled Gemm Optimization. (#16999) |
| 2025-12-05 | Piotr Wilkin... | Add pwilkin to CODEOWNERS for chat files (#17789) |
| 2025-12-05 | Johannes Gäßler | CUDA: fix FA VKQ accumulator overflow (#17746) |
| 2025-12-05 | Jiacheng (Jason... | HIP: enable WMMA-MMQ INT kernels for RDNA 3 (#17576) |
| 2025-12-05 | Sigbjørn Skjæret | ci : transform release binary root dir in tar to llama... |
| 2025-12-04 | Gabe Goodhart | docs : update ops.md (Metal, BLAS) (#17768) |
| 2025-12-04 | Piotr Wilkin... | Add support for CUMSUM and TRI for CUDA. (#17584) |
| 2025-12-04 | Gabe Goodhart | metal: TRI, FILL, EXPM1, SOFTPLUS (#16623) |
| 2025-12-04 | Xuan-Son Nguyen | server: strip content-length header on proxy (#17734) |
| 2025-12-04 | Xuan-Son Nguyen | server: move msg diffs tracking to HTTP thread (#17740) |
| 2025-12-04 | Daniel Bevenius | examples : add missing code block end marker [no ci... |
| 2025-12-04 | Daniel Bevenius | common : skip model validation when --help is requested... |
| 2025-12-04 | Alberto Cabrera... | ggml-cpu : remove asserts always evaluating to false... |
| 2025-12-04 | SmartestWashingMachine | convert: use existing local chat_template if mistral... |
| 2025-12-04 | Adrien Gallouët | cmake : simplify build info detection using standard... |
| 2025-12-04 | Sigbjørn Skjæret | ci : disable ggml-ci-x64-amd-* (#17753) |
| 2025-12-04 | Adrien Gallouët | common: use native MultiByteToWideChar (#17738) |
| 2025-12-04 | Georgi Gerganov | metal : use params per pipeline instance (#17739) |
| 2025-12-04 | Georgi Gerganov | llama : fix sanity checks during quantization (#17721) |
| 2025-12-04 | Adrien Gallouët | build : move _WIN32_WINNT definition to headers (#17736) |
| 2025-12-04 | Jeff Bolz | build: enable parallel builds in msbuild using MTT... |
| 2025-12-03 | Herman Semenoff | ggml-cpu: remove duplicate conditional check 'iid'... |
| 2025-12-03 | Piotr Wilkin... | Add a couple of file types to the text section (#17670) |
| 2025-12-03 | SmartestWashingMachine | convert : support latest mistral-common (fix conversion... |
| 2025-12-03 | Aleksander... | Use OpenAI-compatible `/v1/models` endpoint by default... |
| 2025-12-03 | Andika Wasisto | webui: Fix zero pasteLongTextToFileLen to disable conve... |
| 2025-12-03 | Johannes Gäßler | CUDA: generalized (mma) FA, add Volta support (#17505) |
| 2025-12-03 | Georgi Gerganov | chat : reserve memory in compute_diffs and improve... |