2025-09-15 |
Diego Devesa | releases : switch to rocWMMA develop branch, add gfx115... |
2025-09-15 |
yael-works | SYCL: Add COUNT_EQUAL operator support (#15991) |
2025-09-15 |
Nikolay Popov | llama-run: Fix model download on Windows (#15988) |
2025-09-15 |
Aman Gupta | CUDA: some micro-optimizations in mmf.cuh for mul_mat_i... |
2025-09-15 |
ddh0 | fix KLD percentile output (#15999) |
2025-09-14 |
Sigbjørn Skjæret | model : add grok-2 support (#15539) |
2025-09-14 |
Sigbjørn Skjæret | server : only attempt to enable thinking if using jinja... |
2025-09-14 |
Georgi Gerganov | metal : remove memory pools (#15966) |
2025-09-14 |
Adam | rocm.Dockerfile: added gfx1200,gfx1201 architectures... |
2025-09-14 |
Ruben Ortlam | Vulkan: Clean up mul_mm shader (#15987) |
2025-09-14 |
lcy | build: fix the build failures of Windows HIP release... |
2025-09-14 |
Georgi Gerganov | metal : fix kernel requirements (#15983) |
2025-09-14 |
Radoslav Gerganov | rpc : fix regression when --device is used (#15981) |
2025-09-14 |
Diego Devesa | releases : update ROCM, add gfx1200, gfx1201, gfx1151... |
2025-09-14 |
Radoslav Gerganov | doc : update documentation for --tensor-split (#15980) |
2025-09-14 |
Aaron Teo | ggml-zdnn: rm user mapped buffers (#15965) |
2025-09-13 |
Jeff Bolz | vulkan: fix failing dequant shaders (#15862) |
2025-09-13 |
Jeff Bolz | vulkan: initialize vulkan-hpp to allow using extension... |
2025-09-13 |
Diego Devesa | llama : allow using iGPUs with --device (#15951) |
2025-09-13 |
Georgi Gerganov | metal : refactor kernel loading (#15964) |
2025-09-13 |
Georgi Gerganov | metal : allow ops to run concurrently (#15929) |
2025-09-13 |
Georgi Gerganov | metal : fix memory leaks (#15962) |
2025-09-12 |
Aaron Teo | ggml-zdnn: fix #15414, activate FP16 and BF16 accelerat... |
2025-09-12 |
Eric Curtin | Add docker protocol support for llama-server model... |
2025-09-12 |
Haiyue Wang | context : remove redundant explicit casting to the... |
2025-09-12 |
Georgi Gerganov | server : adjust prompt similarity thold + add logs... |
2025-09-12 |
Ruben Ortlam | Vulkan iGPU device selection overhaul and PCI ID API... |
2025-09-12 |
Mathieu Baudier | vulkan: Make device memory check more portable (#15939) |
2025-09-12 |
Neo Zhang Jianyu | Revert "sycl: add usage of enqueue_functions extension... |
2025-09-11 |
Diego Devesa | ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device... |
2025-09-11 |
Johannes Gäßler | CUDA: larger SRAM reads for tile FA, AMD FP16 dot ... |
2025-09-11 |
ddh0 | nitpick : correct MB to MiB (#15934) |
2025-09-11 |
Daniel Bevenius | ggml-cpu : add check for ARM MATMUL_INT8/i8mm support... |
2025-09-11 |
Charles Xu | kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed... |
2025-09-11 |
hipudding | CANN: Disable acl_graph for prefill stage (#15933) |
2025-09-10 |
Oliver Simons | CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3%... |
2025-09-10 |
Jie Fu (傅杰) | llama : support T5 models with unequal number of encode... |
2025-09-10 |
Sigbjørn Skjæret | graph : support non-contiguous Q in build_attn_mha... |
2025-09-10 |
Daniel Bevenius | ggml-cpu : fix padding in ggml_timestep_embedding ... |
2025-09-10 |
Georgi Gerganov | metal : make the backend async (#15906) |
2025-09-10 |
Daniel Bevenius | ci : add caching for ROCm installation in release workf... |
2025-09-10 |
Daniel Bevenius | tests : filter out no-ops from coverage report (#15900) |
2025-09-10 |
j-k | media : add transparent icon svg and png [no ci] (... |
2025-09-10 |
Jesse | gitignore : Ignore vim swap files in tests (#15901) |
2025-09-10 |
Chenguang Li | CANN: Add ROPE sin/cos cache for reuse (#15912) |
2025-09-10 |
Chenguang Li | CANN: implement LRU cache for ACL graphs (#15814) |
2025-09-10 |
Daniel Bevenius | llama : check returned fn ptrs from ggml_backend_reg_ge... |
2025-09-10 |
Daniel Bevenius | ci : cache ROCm installation in windows-latest-cmake... |
2025-09-09 |
Ruben Ortlam | vulkan: throw the oom error instead of no memory type... |
2025-09-09 |
Jeff Bolz | vulkan: Fix OOB accesses in soft_max_back (#15861) |
2025-09-09 |
Johannes Gäßler | HIP: use v_dot2_f32_f16 instruction for FA (#15884) |
2025-09-09 |
lksj92hs | Workaround for subgroup arithmetic failing on MoltenVK... |
2025-09-09 |
Aman Gupta | CUDA: Add mul_mat_id support for the mmf kernel (#15767) |
2025-09-09 |
Johannes Gäßler | CUDA: fix GET_ROWS for large tensors (#15882) |
2025-09-09 |
Georgi Gerganov | contrib : add notes about merging PRs (#15881) |
2025-09-09 |
Daniel Bevenius | requirements : update transformers/torch for Embedding... |
2025-09-09 |
Piotr Wilkin... | model-conversion : add extra debugging support for... |
2025-09-08 |
Aldehir Rojas | json : support `enum` values within `allOf` (#15830) |
2025-09-08 |
j-k | media : add llama1 icon (#15878) |
2025-09-08 |
Jeff Bolz | vulkan: sort graph to allow more parallel execution... |
2025-09-08 |
Aman Gupta | CUDA: generate_cu_files.py - add missing mxfp4 (#15880) |
2025-09-08 |
Jesse | chat : Deepseek V3.1 reasoning and tool calling support... |
2025-09-08 |
Xuan-Son Nguyen | server : bring back timings_per_token (#15879) |
2025-09-08 |
Georgi Gerganov | cuda : fix supports_op condition for get_rows when... |
2025-09-08 |
Georgi Gerganov | metal : refactor + optimize (#15857) |
2025-09-08 |
Xuan-Son Nguyen | ggml: allow casting between f32 and i32 (#15783) |
2025-09-08 |
Sigbjørn Skjæret | CUDA: non-contiguous src0 not supported for PAD (#15869) |
2025-09-08 |
Daniel Bevenius | convert : force setting sliding_window from original... |
2025-09-08 |
Georgi Gerganov | batched-bench : fix llama_synchronize usage during... |
2025-09-08 |
Georgi Gerganov | context : fix n_outputs during reserve (#15858) |
2025-09-08 |
Georgi Gerganov | model : avoid ggml_cont_3d for fused QKV weights (... |
2025-09-08 |
Jeff Bolz | tests: large sizes for get_rows (#15687) |
2025-09-08 |
Chenguang Li | CANN: Stream sync between devices for acl_graph (#15809) |
2025-09-07 |
Jeff Bolz | vulkan: support im2col_3d (#15795) |
2025-09-07 |
Aaron Teo | ggml-cpu: clean up s390x SIMD (#15855) |
2025-09-07 |
Jeff Bolz | vulkan: Support pad_ext (#15794) |
2025-09-07 |
Jeff Bolz | vulkan: Use larger loads in scalar/coopmat1 matmul... |
2025-09-07 |
Daniel Bevenius | ggml WebGPU: remove userdata from request adapter callb... |
2025-09-06 |
Johannes Gäßler | CUDA: faster tile FA (Pascal/AMD), headsize 256 (#15769) |
2025-09-06 |
Charles Xu | kleidiai: generalize compute_forward_kv_cache to comput... |
2025-09-06 |
Xuan-Son Nguyen | server : speed up tests (#15836) |
2025-09-06 |
Xuan-Son Nguyen | server : implement prompt processing progress report... |
2025-09-06 |
Johannes Gäßler | ggml-cpu: document use of "free" memory [no ci] (#15834) |
2025-09-06 |
Aaron Teo | ggml-cpu: drop support for nnpa intrinsics (#15821) |
2025-09-05 |
Gabe Goodhart | aLoRA Support (#15327) |
2025-09-05 |
Sigbjørn Skjæret | ci : exempt correct research label (#15825) |
2025-09-05 |
Gabe Goodhart | Thinking model disabled assistant prefill (#15404) |
2025-09-05 |
Eric Curtin | Implement --log-colors with always/never/auto (#15792) |
2025-09-05 |
Johannes Gäßler | CUDA: fastdiv, launch bounds for mmvq + q8_1 quant... |
2025-09-05 |
Daniel Bevenius | tests : add --list-ops and --show-coverage options... |
2025-09-05 |
Erik Scholz | gguf: gguf_writer refactor (#15691) |
2025-09-05 |
Georgi Gerganov | kv-cache : fix SWA checks + disable cacheless iSWA... |
2025-09-05 |
Daniel Bevenius | model-conversion : add --embeddings flag to modelcard... |
2025-09-04 |
ExtReMLapin | chat : fixed crash when Hermes 2 <tool_call> had a... |
2025-09-04 |
Piotr Wilkin... | chat : nemotron thinking & toolcalling support (#15676) |
2025-09-04 |
Piotr Wilkin... | scripts : add Jinja tester PySide6 simple app (#15756) |
2025-09-04 |
Daniel Bevenius | llama : add support for EmbeddingGemma 300m (#15798) |
2025-09-04 |
Gabe Goodhart | metal : Add template specialization for mul_mm_id w... |
2025-09-04 |
Daniel Bevenius | llama : set n_outputs to 1 to avoid 0 outputs mean... |
2025-09-04 |
Chenguang Li | CANN: Refactor ND to NZ workspace to be per-device... |