| 2026-03-16 | MoonShadow | ggml/hip: fix APU compatibility - soft error handling... |
| 2026-03-16 | Bartowski | ggml : guard against sumq2 being 0 in IQ4_NL (llama... |
| 2026-03-16 | PikaPikachu | cuda : add RDNA4-specific MMVQ parameter table for... |
| 2026-03-16 | Ruben Ortlam | vulkan: use graphics queue on AMD (llama/20551) |
| 2026-03-16 | Georgi Gerganov | metal : add FA specialization for HSK = 320, HSV =... |
| 2026-03-16 | Max Krasnyansky | hexagon: Q4_0 and MXFP4 repack fixes (llama/20527) |
| 2026-03-16 | Neo Zhang | add op gated_delta_net (llama/20455) |
| 2026-03-16 | Adrien Gallouët | ggml : add native AVX512-FP16 support for F16 operation... |
| 2026-03-16 | Wallentri | Use fp32 in cuBLAS V100 to avoid overflows, env variabl... |
| 2026-03-16 | Zijun Yu | ggml : add OpenVINO backend (llama/15307) |
| 2026-03-16 | Rail Chabdarov | Fix data race in CUDA's "cpy" kernel (influences GGML... |
| 2026-03-16 | lhez | opencl: fix l2_norm (llama/20480) |
| 2026-03-16 | Georgi Gerganov | graph : remove redundant GDN state transposes (llama... |
| 2026-03-16 | rehan-10xengineer | ggml-cpu: add RVV vec dot kernels for quantization... |
| 2026-03-16 | Adrien Gallouët | ggml : fix typo gmml (llama/20512) |
| 2026-03-16 | Georgi Gerganov | metal : fix l2 norm scale (llama/20493) |
| 2026-03-16 | Georgi Gerganov | llama : disable graph reuse with pipeline parallelism... |
| 2026-03-16 | ProgenyAlpha | vulkan: add GATED_DELTA_NET op support (llama/20334) |
| 2026-03-16 | ProgenyAlpha | vulkan: fix SSM_CONV PP scaling with large ubatch sizes... |
| 2026-03-16 | Georgi Gerganov | sync : ggml |
| 2026-03-16 | Georgi Gerganov | metal : avoid divisions in bin kernel (llama/20426) |
| 2026-03-16 | Georgi Gerganov | sync : ggml |
| 2026-03-16 | Jeff Bolz | vulkan: fix l2_norm epsilon handling (llama/20350) |
| 2026-03-16 | Jeff Bolz | vulkan: fix OOB check in flash_attn_mask_opt (llama... |
| 2026-03-16 | Masato Nakasaka | vulkan: Fix ErrorOutOfHostMemory on Intel GPU when... |
| 2026-03-16 | lhez | opencl: use larger workgroup size for get_rows (llama... |
| 2026-03-16 | shaofeiqi | opencl: add cumsum op (llama/18981) |
| 2026-03-16 | uvos | hip: compile debug builds with -O2 on hip to avoid... |
| 2026-03-16 | Masashi Yoshimura | ggml-webgpu: Add supports for `GGML_OP_REPEAT` (llama... |
| 2026-03-16 | Georgi Gerganov | llama : enable chunked fused GDN path (llama/20340) |
| 2026-03-16 | Richard Davison | ggml : add NVFP4 quantization type support (llama/19769) |
| 2026-03-16 | Daniel Bevenius | llama : add support for Nemotron 3 Super (llama/20411) |
| 2026-03-16 | Georgi Gerganov | metal : fix capture_compute counter logic (llama/20410) |
| 2026-03-16 | Georgi Gerganov | metal : fix q5_k mul_mv register spill (llama/20399) |
| 2026-03-16 | Georgi Gerganov | metal : add env var to trigger graph capture (llama... |
| 2026-03-16 | uvos | ggml-cuda: gdn use shared mem for HIP (llama/20366) |
| 2026-03-16 | uvos | cuda/hip: fix loop unrolling in ssm-conv (llama/20369) |
| 2026-03-16 | Neo Zhang | fix op rope, add rope_back (llama/20293) |
| 2026-03-16 | Neo Zhang | fix for failed UT case: ACC, L2_NORM, UPSCALE, fused_gl... |
| 2026-03-16 | Georgi Gerganov | ggml : bump RPC version (llama/20330) |
| 2026-03-16 | Reese Levine | ggml webgpu: faster normal quant and some k-quant matri... |
| 2026-03-16 | Charles Xu | kleidiai : support for concurrent sme and neon kernel... |
| 2026-03-16 | Taimur Ahmad | ggml-cpu: add RVV repack GEMM and GEMV for quantization... |
| 2026-03-16 | Julian Pscheid | metal: handle command buffer failures gracefully in... |
| 2026-03-16 | Paul Flynn | metal : extend mul_mv_ext to BF16, Q2_K, Q3_K (llama... |
| 2026-03-16 | Georgi Gerganov | metal : add upscale (llama/20284) |
| 2026-03-16 | Aman Gupta | ggml-cuda: disable gdn for musa (llama/20278) |
| 2026-03-16 | Bertay Eren | ggml-vulkan: add SGN operator, auto-generate Vulkan... |
| 2026-03-16 | Ruben Ortlam | vulkan: skip zero size tensors in backend copies (llama... |
| 2026-03-16 | Michael Huang | cuda : display total and free VRAM capacity during... |
| 2026-03-16 | GiantPrince | ggml-vulkan: Add ELU op support (llama/20183) |
| 2026-03-16 | Jeff Bolz | vulkan: Fix data races in coopmat1 mul_mat(_id) (llama... |
| 2026-03-16 | Neo Zhang | supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama... |
| 2026-03-16 | Aman Gupta | ggml: add GATED_DELTA_NET op (llama/19504) |
| 2026-03-16 | lhez | opencl: add l2_norm (llama/20160) |
| 2026-03-16 | Bartowski | quants : Add memsets and other fixes for IQ quants... |
| 2026-03-16 | Todor Boinovski | hexagon: add f32 ssm_conv op (llama/20122) |
| 2026-03-16 | Max Krasnyansky | cpu: skip redudant ROPE cache updates (llama/20149) |
| 2026-03-16 | Aman Gupta | ggml-cuda: add mem check for fusion (llama/19916) |
| 2026-03-16 | Aaron Teo | ggml: update comments for backends which have no memory... |
| 2026-03-16 | shalinib-ibm | ggml-cpu: Fix gcc 15 ICE on ppc64le (ggml/20083) (llama... |
| 2026-03-16 | Aman Gupta | CUDA: use shared mem for ssm_conv (llama/20128) |
| 2026-03-16 | Johannes Gäßler | ggml-cpu: fix data race for debug asserts (llama/20148) |
| 2026-03-16 | lhez | opencl: add neg, exp and diag (llama/20127) |
| 2026-03-16 | YardenTal44 | hexagon: add fp16 support for binary ops: add,sub,mul... |
| 2026-03-16 | Andreas Kieslinger | CUDA: Improve performance via less synchronizations... |
| 2026-03-16 | Marcel Petrick | chore : correct typos [no ci] (llama/20041) |
| 2026-03-16 | Max Krasnyansky | hexagon: Flash Attention optimizations (dma, mpyacc... |
| 2026-03-16 | lhez | opencl: add `SET`, support i32 for `CPY`, minor refacto... |
| 2026-03-16 | Nikhil Jain | Fix wait logic for inflight jobs (llama/20096) |
| 2026-03-16 | Masashi Yoshimura | Add concat op to webgpu. (llama/20068) |
| 2026-03-16 | Johannes Gäßler | ggml: fix ggml_is_contiguous_n for ne == 1 (llama/20092) |
| 2026-03-16 | Adrien Gallouët | ggml : use a simple std::thread in AMX without OpenMP... |
| 2026-03-16 | Charles Xu | kleidiai : add sme fp16 compute path for q4_0 gemm... |
| 2026-03-16 | shaofeiqi | opencl: add optimized q4_1 mm kernel for adreno (llama... |
| 2026-03-16 | Abhijit Ramesh | ggml webgpu: fix workgroup dispatch limit for large... |
| 2026-03-16 | Nikhil Jain | ggml webgpu: Clean up per-thread parameter buffer pool... |
| 2026-03-16 | Masashi Yoshimura | ggml-webgpu: Support non-contiguous `src0` and overlapp... |
| 2026-03-16 | Ruben Ortlam | vulkan: tune MMVQ for Intel Windows (llama/19988) |
| 2026-03-16 | Aaron Teo | ggml-cpu: optimise s390x multiply extend instructions... |
| 2026-03-16 | Ruben Ortlam | vulkan: improve partial offloading performance on AMD... |
| 2026-03-16 | oobabooga | cuda: cap grid.y at 65535 in non-contiguous dequantize... |
| 2026-03-16 | Jayant Lohia | CUDA: add CDNA3 MFMA support for flash attention MMA... |
| 2026-03-16 | Aman Gupta | ggml-cpu: add repack for mxfp4 (llama/19738) |
| 2026-03-05 | KITAITI Makoto | ruby : null-check (#3689) |
| 2026-02-27 | Georgi Gerganov | gguf : sync (ggml/0) |
| 2026-02-27 | Georgi Gerganov | scripts : sync gguf |
| 2026-02-27 | Georgi Gerganov | talk-llama : sync llama.cpp |
| 2026-02-27 | Georgi Gerganov | sync : ggml |
| 2026-02-27 | Neo Zhang | replace the magic nunber 768 by max work group size... |
| 2026-02-27 | Vishal Singh | ggml-zendnn: update code for latest ZenDNN API (llama... |
| 2026-02-27 | Adrien Gallouët | ggml : fix AMX and add batched support (llama/19925) |
| 2026-02-27 | Ruben Ortlam | vulkan: fix fp16 Flash Attention on Windows AMD RDNA2... |
| 2026-02-27 | Kevin Pouget | ggml-virtgpu: improve the reliability of the code ... |
| 2026-02-27 | Neo Zhang | support permuted, remove check s0/s10 (llama/19889) |
| 2026-02-27 | Jeff Bolz | vulkan: check for memory overlap before doing fusion... |
| 2026-02-27 | Georgi Gerganov | ggml/gguf : prevent integer overflows (llama/19856) |
| 2026-02-27 | Ruben Ortlam | Vulkan Scalar Flash Attention Refactor (llama/19625) |
| 2026-02-27 | Jeff Bolz | vulkan: fix coopmat1 without bf16 support (llama/19793) |
| 2026-02-27 | Jeff Bolz | vulkan: fix data race in mul_mat_id shader (llama/19790) |