2025-09-17 |
Aleksander... | SvelteKit-based WebUI (#14839) |
2025-09-17 |
Xuan-Son Nguyen | convert : add Llama4ForCausalLM (#16042) |
2025-09-17 |
Johannes Gäßler | CUDA: fix FA occupancy, optimize tile kernel (#15982) |
2025-09-17 |
David Ribeiro... | common : Fix corrupted memory error on json grammar... |
2025-09-17 |
Eve | vulkan: automatically remove unsupported devices (... |
2025-09-17 |
Daniel Bevenius | ci : revert back to macos-13 for macOS-latest-cmake... |
2025-09-17 |
Jie Fu (傅杰) | llama-quant : fix the verification of attention layers... |
2025-09-17 |
Jie Fu (傅杰) | examples : support encoder-decoder models in the simple... |
2025-09-17 |
Shane A | model : add OLMo3 support (#16015) |
2025-09-17 |
Chenguang Li | CANN: Optimize ggml_cann_set_device (#15935) |
2025-09-16 |
jacekpoplawski | llama-bench: add --n-cpu-moe support (#15952) |
2025-09-16 |
Daniel Bevenius | ci : use macos-latest for arm64 webgpu build (#16029) |
2025-09-16 |
Daniel Bevenius | ggml : fix padding in timestep embedding kernels (... |
2025-09-16 |
Daniel Bevenius | ci : upload xcframework artifact from ios-xcode-build... |
2025-09-16 |
Bowen Han | fix: apply clang-format to CUDA macros (#16017) |
2025-09-16 |
Daniel Bevenius | ci : update macos-latest* jobs to use macos-latest... |
2025-09-16 |
Yuri Khrustalev | cmake : Do not install tools on iOS targets (#15903) |
2025-09-16 |
Aman Gupta | Add LLaDA-7b-MoE diffusion model (#16003) |
2025-09-15 |
Jake Karnes | CUDA: fix im2col_3d to respect non-contiguous inputs... |
2025-09-15 |
Diego Devesa | docker : enable rocWMMA in ROCm images, add gfx1151... |
2025-09-15 |
Diego Devesa | releases : switch to rocWMMA develop branch, add gfx115... |
2025-09-15 |
yael-works | SYCL: Add COUNT_EQUAL operator support (#15991) |
2025-09-15 |
Nikolay Popov | llama-run: Fix model download on Windows (#15988) |
2025-09-15 |
Aman Gupta | CUDA: some micro-optimizations in mmf.cuh for mul_mat_i... |
2025-09-15 |
ddh0 | fix KLD percentile output (#15999) |
2025-09-14 |
Sigbjørn Skjæret | model : add grok-2 support (#15539) |
2025-09-14 |
Sigbjørn Skjæret | server : only attempt to enable thinking if using jinja... |
2025-09-14 |
Georgi Gerganov | metal : remove memory pools (#15966) |
2025-09-14 |
Adam | rocm.Dockerfile: added gfx1200,gfx1201 architectures... |
2025-09-14 |
Ruben Ortlam | Vulkan: Clean up mul_mm shader (#15987) |
2025-09-14 |
lcy | build: fix the build failures of Windows HIP release... |
2025-09-14 |
Georgi Gerganov | metal : fix kernel requirements (#15983) |
2025-09-14 |
Radoslav Gerganov | rpc : fix regression when --device is used (#15981) |
2025-09-14 |
Diego Devesa | releases : update ROCM, add gfx1200, gfx1201, gfx1151... |
2025-09-14 |
Radoslav Gerganov | doc : update documentation for --tensor-split (#15980) |
2025-09-14 |
Aaron Teo | ggml-zdnn: rm user mapped buffers (#15965) |
2025-09-13 |
Jeff Bolz | vulkan: fix failing dequant shaders (#15862) |
2025-09-13 |
Jeff Bolz | vulkan: initialize vulkan-hpp to allow using extension... |
2025-09-13 |
Diego Devesa | llama : allow using iGPUs with --device (#15951) |
2025-09-13 |
Georgi Gerganov | metal : refactor kernel loading (#15964) |
2025-09-13 |
Georgi Gerganov | metal : allow ops to run concurrently (#15929) |
2025-09-13 |
Georgi Gerganov | metal : fix memory leaks (#15962) |
2025-09-12 |
Aaron Teo | ggml-zdnn: fix #15414, activate FP16 and BF16 accelerat... |
2025-09-12 |
Eric Curtin | Add docker protocol support for llama-server model... |
2025-09-12 |
Haiyue Wang | context : remove redundant explicit casting to the... |
2025-09-12 |
Georgi Gerganov | server : adjust prompt similarity thold + add logs... |
2025-09-12 |
Ruben Ortlam | Vulkan iGPU device selection overhaul and PCI ID API... |
2025-09-12 |
Mathieu Baudier | vulkan: Make device memory check more portable (#15939) |
2025-09-12 |
Neo Zhang Jianyu | Revert "sycl: add usage of enqueue_functions extension... |
2025-09-11 |
Diego Devesa | ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device... |
2025-09-11 |
Johannes Gäßler | CUDA: larger SRAM reads for tile FA, AMD FP16 dot ... |
2025-09-11 |
ddh0 | nitpick : correct MB to MiB (#15934) |
2025-09-11 |
Daniel Bevenius | ggml-cpu : add check for ARM MATMUL_INT8/i8mm support... |
2025-09-11 |
Charles Xu | kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed... |
2025-09-11 |
hipudding | CANN: Disable acl_graph for prefill stage (#15933) |
2025-09-10 |
Oliver Simons | CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3%... |
2025-09-10 |
Jie Fu (傅杰) | llama : support T5 models with unequal number of encode... |
2025-09-10 |
Sigbjørn Skjæret | graph : support non-contiguous Q in build_attn_mha... |
2025-09-10 |
Daniel Bevenius | ggml-cpu : fix padding in ggml_timestep_embedding ... |
2025-09-10 |
Georgi Gerganov | metal : make the backend async (#15906) |
2025-09-10 |
Daniel Bevenius | ci : add caching for ROCm installation in release workf... |
2025-09-10 |
Daniel Bevenius | tests : filter out no-ops from coverage report (#15900) |
2025-09-10 |
j-k | media : add transparent icon svg and png [no ci] (... |
2025-09-10 |
Jesse | gitignore : Ignore vim swap files in tests (#15901) |
2025-09-10 |
Chenguang Li | CANN: Add ROPE sin/cos cache for reuse (#15912) |
2025-09-10 |
Chenguang Li | CANN: implement LRU cache for ACL graphs (#15814) |
2025-09-10 |
Daniel Bevenius | llama : check returned fn ptrs from ggml_backend_reg_ge... |
2025-09-10 |
Daniel Bevenius | ci : cache ROCm installation in windows-latest-cmake... |
2025-09-09 |
Ruben Ortlam | vulkan: throw the oom error instead of no memory type... |
2025-09-09 |
Jeff Bolz | vulkan: Fix OOB accesses in soft_max_back (#15861) |
2025-09-09 |
Johannes Gäßler | HIP: use v_dot2_f32_f16 instruction for FA (#15884) |
2025-09-09 |
lksj92hs | Workaround for subgroup arithmetic failing on MoltenVK... |
2025-09-09 |
Aman Gupta | CUDA: Add mul_mat_id support for the mmf kernel (#15767) |
2025-09-09 |
Johannes Gäßler | CUDA: fix GET_ROWS for large tensors (#15882) |
2025-09-09 |
Georgi Gerganov | contrib : add notes about merging PRs (#15881) |
2025-09-09 |
Daniel Bevenius | requirements : update transformers/torch for Embedding... |
2025-09-09 |
Piotr Wilkin... | model-conversion : add extra debugging support for... |
2025-09-08 |
Aldehir Rojas | json : support `enum` values within `allOf` (#15830) |
2025-09-08 |
j-k | media : add llama1 icon (#15878) |
2025-09-08 |
Jeff Bolz | vulkan: sort graph to allow more parallel execution... |
2025-09-08 |
Aman Gupta | CUDA: generate_cu_files.py - add missing mxfp4 (#15880) |
2025-09-08 |
Jesse | chat : Deepseek V3.1 reasoning and tool calling support... |
2025-09-08 |
Xuan-Son Nguyen | server : bring back timings_per_token (#15879) |
2025-09-08 |
Georgi Gerganov | cuda : fix supports_op condition for get_rows when... |
2025-09-08 |
Georgi Gerganov | metal : refactor + optimize (#15857) |
2025-09-08 |
Xuan-Son Nguyen | ggml: allow casting between f32 and i32 (#15783) |
2025-09-08 |
Sigbjørn Skjæret | CUDA: non-contiguous src0 not supported for PAD (#15869) |
2025-09-08 |
Daniel Bevenius | convert : force setting sliding_window from original... |
2025-09-08 |
Georgi Gerganov | batched-bench : fix llama_synchronize usage during... |
2025-09-08 |
Georgi Gerganov | context : fix n_outputs during reserve (#15858) |
2025-09-08 |
Georgi Gerganov | model : avoid ggml_cont_3d for fused QKV weights (... |
2025-09-08 |
Jeff Bolz | tests: large sizes for get_rows (#15687) |
2025-09-08 |
Chenguang Li | CANN: Stream sync between devices for acl_graph (#15809) |
2025-09-07 |
Jeff Bolz | vulkan: support im2col_3d (#15795) |
2025-09-07 |
Aaron Teo | ggml-cpu: clean up s390x SIMD (#15855) |
2025-09-07 |
Jeff Bolz | vulkan: Support pad_ext (#15794) |
2025-09-07 |
Jeff Bolz | vulkan: Use larger loads in scalar/coopmat1 matmul... |
2025-09-07 |
Daniel Bevenius | ggml WebGPU: remove userdata from request adapter callb... |
2025-09-06 |
Johannes Gäßler | CUDA: faster tile FA (Pascal/AMD), headsize 256 (#15769) |
2025-09-06 |
Charles Xu | kleidiai: generalize compute_forward_kv_cache to comput... |