| 2025-10-16 |
takasurazeem | common : Update the docs on -t --threads (#16236) |
| 2025-10-16 |
takuya kodama | ggml-cpu: replace putenv with setenv for const-correctn... |
| 2025-10-16 |
yael-works | SYCL: Add GGML_OP_MEAN operator support (#16009) |
| 2025-10-15 |
Aleksei Nikiforov | gguf-py : add support for endian conversion of BF16... |
| 2025-10-15 |
safranowith | cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators... |
| 2025-10-15 |
lhez | opencl: add q8_0 mm support (#16469) |
| 2025-10-15 |
lhez | opencl: fix FA for f32 (#16584) |
| 2025-10-15 |
Aleksander... | Add server-driven parameter defaults and syncing (... |
| 2025-10-15 |
Sam/Samuel | metal: optimise `GGML_OP_SUM` (#16559) |
| 2025-10-15 |
Georgi Gerganov | server : fix img token logs (#16595) |
| 2025-10-15 |
Xuan-Son Nguyen | llama-quant: add support for mmproj (#16592) |
| 2025-10-15 |
Julius Tischbein | CUDA: Changing the CUDA scheduling strategy to spin... |
| 2025-10-15 |
Georgi Gerganov | server : fix mtmd checkpoints (#16591) |
| 2025-10-14 |
Georgi Gerganov | metal : avoid using Metal's gpuAddress property (#16576) |
| 2025-10-14 |
SavicStefan | vulkan: Add ACC_TYPE_VEC2 implementation (#16203) upstream/0.0.6764 |
| 2025-10-14 |
Aman Gupta | CUDA + openCL: fix bug in accessing rms_norm->src while... |
| 2025-10-14 |
Jeff Bolz | vulkan: Support FA with K/V in F32 (#16543) |
| 2025-10-14 |
Jeff Bolz | vulkan: Improve build time for MSVC (#16545) |
| 2025-10-14 |
Johannes Gäßler | CUDA: enable FA for FP32 KV cache (#16546) |
| 2025-10-14 |
Aman Gupta | CUDA: use fastdiv + ggml_cuda_mad for mmvf (#16557) |
| 2025-10-14 |
Aman Gupta | CUDA: add fp kernel for larger batch size MoE (#16512) |
| 2025-10-14 |
Anav Prasad | cuda : remove legacy copy-op pointer indirection code... |
| 2025-10-14 |
Georgi Gerganov | server : dynamic token limit for prompt cache (#16560) |
| 2025-10-13 |
Georgi Gerganov | metal : FA support F32 K and V and head size = 32 ... |
| 2025-10-13 |
Georgi Gerganov | graph : support cacheless embeddings with FA and iSWA... |
| 2025-10-13 |
lhez | opencl: fix build targeting CL 2 (#16554) |
| 2025-10-13 |
Johannes Gäßler | CUDA: fix numerical issues in tile FA kernel (#16540) |
| 2025-10-13 |
Jie Fu (傅杰) | ggml : fix build broken with -march=armv9-a on MacOS... |
| 2025-10-13 |
Chenguang Li | CANN: fix CPU memory leak in CANN backend (#16549) |
| 2025-10-13 |
Pascal | fix: add remark plugin to render raw HTML as literal... |
| 2025-10-13 |
Sam/Samuel | metal: add support for opt_step_sgd (#16539) |
| 2025-10-13 |
Georgi Gerganov | ggml : fix scalar path for computing norm (#16558) |
| 2025-10-13 |
hipudding | CANN: Update several operators to support FP16 data... |
| 2025-10-12 |
Sam/Samuel | metal : add opt_step_adamw and op_sum (#16529) |
| 2025-10-12 |
Pascal | webui: remove client-side context pre-check and rely... |
| 2025-10-12 |
Neo Zhang Jianyu | [SYCL] fix UT fault cases: count-equal, argsort, pad... |
| 2025-10-12 |
Mathieu Baudier | ci : add Vulkan on Ubuntu with default packages build... |
| 2025-10-12 |
Aldehir Rojas | common : handle unicode during partial json parsing... |
| 2025-10-12 |
Georgi Gerganov | common : update presets (#16504) |
| 2025-10-12 |
sirus20x6 | ggml : Fix FP16 ELU positive branch (#16519) |
| 2025-10-12 |
Daniel Bevenius | hparams : add check for layer index in is_recurrent... |
| 2025-10-12 |
sirus20x6 | ggml: Correct SVE implementation in ggml_vec_dot_f16_un... |
| 2025-10-11 |
Johannes Gäßler | CUDA: faster tile FA, add oob checks, more HSs (#16492) |
| 2025-10-11 |
Georgi Gerganov | metal : fix mul-mm condition + fix mul-mv permuted... |
| 2025-10-11 |
Pascal | feat: render user content as markdown option (#16358) |
| 2025-10-11 |
Yann Follet | server / ranking : add sorting and management of top_n... |
| 2025-10-11 |
Diego Devesa | cuda : avoid initializing unused devices (#16510) |
| 2025-10-11 |
amirai21 | convert : correctly handle LLaMA tokenizer for Jamba... |
| 2025-10-10 |
Georgi Gerganov | server : fix division by zero when reporting stats... |
| 2025-10-10 |
Georgi Gerganov | vocab : mark EOT token for Granite models (#16499) |
| 2025-10-10 |
Radoslav Gerganov | server : return HTTP 400 if prompt exceeds context... |
| 2025-10-10 |
Radoslav Gerganov | server : log requests to /v1/completions (#16495) |
| 2025-10-10 |
Prajwal B Mehendarkar | cmake : Dont define XOPENSOURCE on AIX (#16481) |
| 2025-10-09 |
Pascal | webui: updated the chat service to only include max_tok... |
| 2025-10-09 |
duduta | cpu : optimize the ggml NORM operation (#15953) |
| 2025-10-09 |
Georgi Gerganov | server : host-memory prompt caching (#16391) |
| 2025-10-09 |
Pascal | No markdown in cot (#16483) |
| 2025-10-09 |
Daniel Bevenius | model-conversion : add support for SentenceTransformers... |
| 2025-10-09 |
sudhiarm | ci: add ARM64 Kleidiai build and test support (#16462) |
| 2025-10-09 |
Chenguang Li | CANN: Improve ACL graph matching (#16166) |
| 2025-10-09 |
Charles Xu | kleidiai: kernel interface refactoring (#16460) |
| 2025-10-09 |
Neo Zhang Jianyu | [SYCL] refactor soft_max, add soft_max_back (#16472) |
| 2025-10-09 |
Saba Fallah | model: EmbeddingGemma Adding Support for SentenceTransf... |
| 2025-10-08 |
Pascal | refactor: centralize CoT parsing in backend for streami... |
| 2025-10-08 |
ai-fonsi | Disable CUDA host buffers on integrated GPUs (#16308) |
| 2025-10-08 |
issixx | server : fix cancel pending task (#16467) |
| 2025-10-08 |
Georgi Gerganov | metal : mark FA blocks (#16372) |
| 2025-10-08 |
Georgi Gerganov | server : improve context checkpoint logic (#16440) |
| 2025-10-07 |
Reese Levine | ggml webgpu: profiling, CI updates, reworking of comman... |
| 2025-10-07 |
Tarek Dakhran | llama : support LiquidAI LFM2-MoE hybrid model (#16464) |
| 2025-10-07 |
Georgi Gerganov | server : add `/v1/health` endpoint (#16461) |
| 2025-10-07 |
Sascha Rogmann | webui : added download action (#13552) (#16282) |
| 2025-10-07 |
Georgi Gerganov | presets : fix pooling param for embedding models (... |
| 2025-10-07 |
Radoslav Gerganov | rpc : update documentation (#16441) |
| 2025-10-07 |
Georgi Gerganov | memory : use sequential equal splits for recurrent... |
| 2025-10-07 |
Georgi Gerganov | metal : add support for non-padded FA KV (#16148) |
| 2025-10-07 |
Georgi Gerganov | tests : add -INF blocks to the KQ mask in the FA tests... |
| 2025-10-07 |
Georgi Gerganov | metal : various optimizations + refactoring (#16446) |
| 2025-10-06 |
Gadflyii | llama : add --no-host to disable host buffers (#16310) |
| 2025-10-06 |
Gabe Goodhart | chat : Granite Docling stopping (#16438) |
| 2025-10-06 |
Sigbjørn Skjæret | ci : refactor sdk caching to minimize storage (#16414) |
| 2025-10-06 |
Georgi Gerganov | ggml : fix unaligned access in AMX code (#16315) |
| 2025-10-06 |
Daniel Bevenius | ci : remove missing reranker model files (#16444) |
| 2025-10-06 |
Daniel Bevenius | ggml-cpu : fix leftover handling in ggml_vec_scale_f32... |
| 2025-10-06 |
Yuannan | nix : removed metal for nix (#16118) |
| 2025-10-06 |
Oleksandr Kuvshynov | server: update readme to mention n_past_max metric... |
| 2025-10-05 |
Gabe Goodhart | model : Granite docling + Idefics3 preprocessing (SmolV... |
| 2025-10-05 |
Reese Levine | ggml webgpu: actually add softmax, fix rms_norm offset... |
| 2025-10-04 |
Eve | vulkan: use a more appropriate amount of threads when... |
| 2025-10-04 |
Radoslav Gerganov | rpc : check src buffer when copying tensor (#16421) |
| 2025-10-04 |
Radoslav Gerganov | rpc : add support for multiple devices (#16276) |
| 2025-10-04 |
Acly | vulkan : incremental shader builds (#16341) |
| 2025-10-03 |
Pascal | chat : support Magistral thinking (#16413) |
| 2025-10-03 |
ddh0 | server : context checkpointing for hybrid and recurrent... |
| 2025-10-03 |
Georgi Gerganov | metal : fix loop bound in ggml_mem_ranges (#16412) |
| 2025-10-03 |
Sigbjørn Skjæret | llama : fix shapes for bert/mpt q/k norm (#16409) |
| 2025-10-03 |
Acly | ggml : fix graph reallocation with multiple chunks... |
| 2025-10-03 |
Aleksander... | Fix missing messages on sibling navigation (#16408) |
| 2025-10-03 |
Jeff Bolz | vulkan: Replace uses of maxMemoryAllocationSize and... |
| 2025-10-03 |
Jeff Bolz | vulkan: Fix FA coopmat1 invalid array indexing (#16365) |