| 2025-10-14 |
Jeff Bolz | vulkan: Improve build time for MSVC (#16545) |
| 2025-10-14 |
Johannes Gäßler | CUDA: enable FA for FP32 KV cache (#16546) |
| 2025-10-14 |
Aman Gupta | CUDA: use fastdiv + ggml_cuda_mad for mmvf (#16557) |
| 2025-10-14 |
Aman Gupta | CUDA: add fp kernel for larger batch size MoE (#16512) |
| 2025-10-14 |
Anav Prasad | cuda : remove legacy copy-op pointer indirection code... |
| 2025-10-14 |
Georgi Gerganov | server : dynamic token limit for prompt cache (#16560) |
| 2025-10-13 |
Georgi Gerganov | metal : FA support F32 K and V and head size = 32 ... |
| 2025-10-13 |
Georgi Gerganov | graph : support cacheless embeddings with FA and iSWA... |
| 2025-10-13 |
lhez | opencl: fix build targeting CL 2 (#16554) |
| 2025-10-13 |
Johannes Gäßler | CUDA: fix numerical issues in tile FA kernel (#16540) |
| 2025-10-13 |
Jie Fu (傅杰) | ggml : fix build broken with -march=armv9-a on MacOS... |
| 2025-10-13 |
Chenguang Li | CANN: fix CPU memory leak in CANN backend (#16549) |
| 2025-10-13 |
Pascal | fix: add remark plugin to render raw HTML as literal... |
| 2025-10-13 |
Sam/Samuel | metal: add support for opt_step_sgd (#16539) |
| 2025-10-13 |
Georgi Gerganov | ggml : fix scalar path for computing norm (#16558) |
| 2025-10-13 |
hipudding | CANN: Update several operators to support FP16 data... |
| 2025-10-12 |
Sam/Samuel | metal : add opt_step_adamw and op_sum (#16529) |
| 2025-10-12 |
Pascal | webui: remove client-side context pre-check and rely... |
| 2025-10-12 |
Neo Zhang Jianyu | [SYCL] fix UT fault cases: count-equal, argsort, pad... |
| 2025-10-12 |
Mathieu Baudier | ci : add Vulkan on Ubuntu with default packages build... |
| 2025-10-12 |
Aldehir Rojas | common : handle unicode during partial json parsing... |
| 2025-10-12 |
Georgi Gerganov | common : update presets (#16504) |
| 2025-10-12 |
sirus20x6 | ggml : Fix FP16 ELU positive branch (#16519) |
| 2025-10-12 |
Daniel Bevenius | hparams : add check for layer index in is_recurrent... |
| 2025-10-12 |
sirus20x6 | ggml: Correct SVE implementation in ggml_vec_dot_f16_un... |
| 2025-10-11 |
Johannes Gäßler | CUDA: faster tile FA, add oob checks, more HSs (#16492) |
| 2025-10-11 |
Georgi Gerganov | metal : fix mul-mm condition + fix mul-mv permuted... |
| 2025-10-11 |
Pascal | feat: render user content as markdown option (#16358) |
| 2025-10-11 |
Yann Follet | server / ranking : add sorting and management of top_n... |
| 2025-10-11 |
Diego Devesa | cuda : avoid initializing unused devices (#16510) |
| 2025-10-11 |
amirai21 | convert : correctly handle LLaMA tokenizer for Jamba... |
| 2025-10-10 |
Georgi Gerganov | server : fix division by zero when reporting stats... |
| 2025-10-10 |
Georgi Gerganov | vocab : mark EOT token for Granite models (#16499) |
| 2025-10-10 |
Radoslav Gerganov | server : return HTTP 400 if prompt exceeds context... |
| 2025-10-10 |
Radoslav Gerganov | server : log requests to /v1/completions (#16495) |
| 2025-10-10 |
Prajwal B Mehendarkar | cmake : Dont define XOPENSOURCE on AIX (#16481) |
| 2025-10-09 |
Pascal | webui: updated the chat service to only include max_tok... |
| 2025-10-09 |
duduta | cpu : optimize the ggml NORM operation (#15953) |
| 2025-10-09 |
Georgi Gerganov | server : host-memory prompt caching (#16391) |
| 2025-10-09 |
Pascal | No markdown in cot (#16483) |
| 2025-10-09 |
Daniel Bevenius | model-conversion : add support for SentenceTransformers... |
| 2025-10-09 |
sudhiarm | ci: add ARM64 Kleidiai build and test support (#16462) |
| 2025-10-09 |
Chenguang Li | CANN: Improve ACL graph matching (#16166) |
| 2025-10-09 |
Charles Xu | kleidiai: kernel interface refactoring (#16460) |
| 2025-10-09 |
Neo Zhang Jianyu | [SYCL] refactor soft_max, add soft_max_back (#16472) |
| 2025-10-09 |
Saba Fallah | model: EmbeddingGemma Adding Support for SentenceTransf... |
| 2025-10-08 |
Pascal | refactor: centralize CoT parsing in backend for streami... |
| 2025-10-08 |
ai-fonsi | Disable CUDA host buffers on integrated GPUs (#16308) |
| 2025-10-08 |
issixx | server : fix cancel pending task (#16467) |
| 2025-10-08 |
Georgi Gerganov | metal : mark FA blocks (#16372) |
| 2025-10-08 |
Georgi Gerganov | server : improve context checkpoint logic (#16440) |
| 2025-10-07 |
Reese Levine | ggml webgpu: profiling, CI updates, reworking of comman... |
| 2025-10-07 |
Tarek Dakhran | llama : support LiquidAI LFM2-MoE hybrid model (#16464) |
| 2025-10-07 |
Georgi Gerganov | server : add `/v1/health` endpoint (#16461) |
| 2025-10-07 |
Sascha Rogmann | webui : added download action (#13552) (#16282) |
| 2025-10-07 |
Georgi Gerganov | presets : fix pooling param for embedding models (... |
| 2025-10-07 |
Radoslav Gerganov | rpc : update documentation (#16441) |
| 2025-10-07 |
Georgi Gerganov | memory : use sequential equal splits for recurrent... |
| 2025-10-07 |
Georgi Gerganov | metal : add support for non-padded FA KV (#16148) |
| 2025-10-07 |
Georgi Gerganov | tests : add -INF blocks to the KQ mask in the FA tests... |
| 2025-10-07 |
Georgi Gerganov | metal : various optimizations + refactoring (#16446) |
| 2025-10-06 |
Gadflyii | llama : add --no-host to disable host buffers (#16310) |
| 2025-10-06 |
Gabe Goodhart | chat : Granite Docling stopping (#16438) |
| 2025-10-06 |
Sigbjørn Skjæret | ci : refactor sdk caching to minimize storage (#16414) |
| 2025-10-06 |
Georgi Gerganov | ggml : fix unaligned access in AMX code (#16315) |
| 2025-10-06 |
Daniel Bevenius | ci : remove missing reranker model files (#16444) |
| 2025-10-06 |
Daniel Bevenius | ggml-cpu : fix leftover handling in ggml_vec_scale_f32... |
| 2025-10-06 |
Yuannan | nix : removed metal for nix (#16118) |
| 2025-10-06 |
Oleksandr Kuvshynov | server: update readme to mention n_past_max metric... |
| 2025-10-05 |
Gabe Goodhart | model : Granite docling + Idefics3 preprocessing (SmolV... |
| 2025-10-05 |
Reese Levine | ggml webgpu: actually add softmax, fix rms_norm offset... |
| 2025-10-04 |
Eve | vulkan: use a more appropriate amount of threads when... |
| 2025-10-04 |
Radoslav Gerganov | rpc : check src buffer when copying tensor (#16421) |
| 2025-10-04 |
Radoslav Gerganov | rpc : add support for multiple devices (#16276) |
| 2025-10-04 |
Acly | vulkan : incremental shader builds (#16341) |
| 2025-10-03 |
Pascal | chat : support Magistral thinking (#16413) |
| 2025-10-03 |
ddh0 | server : context checkpointing for hybrid and recurrent... |
| 2025-10-03 |
Georgi Gerganov | metal : fix loop bound in ggml_mem_ranges (#16412) |
| 2025-10-03 |
Sigbjørn Skjæret | llama : fix shapes for bert/mpt q/k norm (#16409) |
| 2025-10-03 |
Acly | ggml : fix graph reallocation with multiple chunks... |
| 2025-10-03 |
Aleksander... | Fix missing messages on sibling navigation (#16408) |
| 2025-10-03 |
Jeff Bolz | vulkan: Replace uses of maxMemoryAllocationSize and... |
| 2025-10-03 |
Jeff Bolz | vulkan: Fix FA coopmat1 invalid array indexing (#16365) |
| 2025-10-03 |
Daniel Bevenius | ci : change macos-13 to macos-15-intel (#16401) |
| 2025-10-03 |
Aleksander... | Capture model name only after first token (streaming... |
| 2025-10-03 |
Jeff Bolz | vulkan: in flash attention, bounds check against nem1... |
| 2025-10-03 |
Aleksander... | webui : Fix messages payload sent to chat completions... |
| 2025-10-03 |
Pascal | fix: track viewportHeight via window.innerHeight to... |
| 2025-10-02 |
Sigbjørn Skjæret | test-barrier : do not use more threads than physically... |
| 2025-10-02 |
Reese Levine | ggml webgpu: add support for soft_max, optimize rms_nor... |
| 2025-10-02 |
Piotr Wilkin... | model : Apertus model implementation (#15852) |
| 2025-10-02 |
R0CKSTAR | musa: update compile flags (#16265) |
| 2025-10-02 |
Sigbjørn Skjæret | ci : fix ubuntu-latest-cmake-rpc (disable ccache) ... |
| 2025-10-02 |
Eve | ci: update vulkan ci (#16294) |
| 2025-10-02 |
Georgi Gerganov | ci : fix clean-up of old logs (#16381) |
| 2025-10-02 |
Neo Zhang Jianyu | SYCL: Update to oneAPI 2025.2 (#16371) |
| 2025-10-02 |
uvos | HIP: add IMbackK to codeowner (#16375) |
| 2025-10-01 |
uvos | CI: reenable cdna in rocm docker builds (#16376) |
| 2025-10-01 |
uvos | HIP: Disable ROCWMMA fattn on CDNA when compiled agains... |
| 2025-10-01 |
Shunta Saito | llama : parameter conversion and loading fixes for... |