| 2026-03-15 | Bertay Eren | ggml-vulkan: add SGN operator, auto-generate Vulkan... |
| 2026-03-15 | Ruben Ortlam | vulkan: skip zero size tensors in backend copies (llama... |
| 2026-03-15 | Michael Huang | cuda : display total and free VRAM capacity during... |
| 2026-03-15 | GiantPrince | ggml-vulkan: Add ELU op support (llama/20183) |
| 2026-03-15 | Jeff Bolz | vulkan: Fix data races in coopmat1 mul_mat(_id) (llama... |
| 2026-03-15 | Neo Zhang | supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama... |
| 2026-03-15 | Aman Gupta | ggml: add GATED_DELTA_NET op (llama/19504) |
| 2026-03-15 | lhez | opencl: add l2_norm (llama/20160) |
| 2026-03-15 | Bartowski | quants : Add memsets and other fixes for IQ quants... |
| 2026-03-15 | Piotr Wilkin... | Autoparser - complete refactoring of parser architectur... |
| 2026-03-15 | Todor Boinovski | hexagon: add f32 ssm_conv op (llama/20122) |
| 2026-03-15 | Max Krasnyansky | cpu: skip redudant ROPE cache updates (llama/20149) |
| 2026-03-15 | Aman Gupta | ggml-cuda: add mem check for fusion (llama/19916) |
| 2026-03-15 | Aaron Teo | ggml: update comments for backends which have no memory... |
| 2026-03-15 | shalinib-ibm | ggml-cpu: Fix gcc 15 ICE on ppc64le (#20083) (llama... |
| 2026-03-15 | Aman Gupta | CUDA: use shared mem for ssm_conv (llama/20128) |
| 2026-03-15 | Johannes Gäßler | ggml-cpu: fix data race for debug asserts (llama/20148) |
| 2026-03-15 | lhez | opencl: add neg, exp and diag (llama/20127) |
| 2026-03-15 | YardenTal44 | hexagon: add fp16 support for binary ops: add,sub,mul... |
| 2026-03-15 | Andreas Kieslinger | CUDA: Improve performance via less synchronizations... |
| 2026-03-15 | Marcel Petrick | chore : correct typos [no ci] (llama/20041) |
| 2026-03-15 | Max Krasnyansky | hexagon: Flash Attention optimizations (dma, mpyacc... |
| 2026-03-15 | lhez | opencl: add `SET`, support i32 for `CPY`, minor refacto... |
| 2026-03-15 | Nikhil Jain | Fix wait logic for inflight jobs (llama/20096) |
| 2026-03-15 | Masashi Yoshimura | Add concat op to webgpu. (llama/20068) |
| 2026-03-15 | Johannes Gäßler | ggml: fix ggml_is_contiguous_n for ne == 1 (llama/20092) |
| 2026-03-15 | Adrien Gallouët | ggml : use a simple std::thread in AMX without OpenMP... |
| 2026-03-15 | Charles Xu | kleidiai : add sme fp16 compute path for q4_0 gemm... |
| 2026-03-15 | shaofeiqi | opencl: add optimized q4_1 mm kernel for adreno (llama... |
| 2026-03-15 | Abhijit Ramesh | ggml webgpu: fix workgroup dispatch limit for large... |
| 2026-03-15 | Nikhil Jain | ggml webgpu: Clean up per-thread parameter buffer pool... |
| 2026-03-15 | Masashi Yoshimura | ggml-webgpu: Support non-contiguous `src0` and overlapp... |
| 2026-03-15 | Ruben Ortlam | vulkan: tune MMVQ for Intel Windows (llama/19988) |
| 2026-03-15 | Aaron Teo | ggml-cpu: optimise s390x multiply extend instructions... |
| 2026-03-15 | Ruben Ortlam | vulkan: improve partial offloading performance on AMD... |
| 2026-03-15 | oobabooga | cuda: cap grid.y at 65535 in non-contiguous dequantize... |
| 2026-03-15 | Jayant Lohia | CUDA: add CDNA3 MFMA support for flash attention MMA... |
| 2026-03-15 | Aman Gupta | ggml-cpu: add repack for mxfp4 (llama/19738) |
| 2026-03-15 | David366AI | examples/yolo: fix load_model memory leak (#1432) |
| 2026-02-27 | Georgi Gerganov | gguf : sync (llama/0) |
| 2026-02-27 | Georgi Gerganov | scripts : sync gguf code |
| 2026-02-27 | Georgi Gerganov | sync : llama.cpp |
| 2026-02-27 | Neo Zhang | replace the magic nunber 768 by max work group size... |
| 2026-02-27 | Vishal Singh | ggml-zendnn: update code for latest ZenDNN API (llama... |
| 2026-02-27 | Adrien Gallouët | ggml : fix AMX and add batched support (llama/19925) |
| 2026-02-27 | Ruben Ortlam | vulkan: fix fp16 Flash Attention on Windows AMD RDNA2... |
| 2026-02-27 | Kevin Pouget | ggml-virtgpu: improve the reliability of the code ... |
| 2026-02-27 | Neo Zhang | support permuted, remove check s0/s10 (llama/19889) |
| 2026-02-27 | Jeff Bolz | vulkan: check for memory overlap before doing fusion... |
| 2026-02-25 | Georgi Gerganov | sync : llama.cpp |
| 2026-02-25 | Georgi Gerganov | ggml/gguf : prevent integer overflows (llama/19856) |
| 2026-02-25 | Ruben Ortlam | Vulkan Scalar Flash Attention Refactor (llama/19625) |
| 2026-02-25 | Jeff Bolz | vulkan: fix coopmat1 without bf16 support (llama/19793) |
| 2026-02-25 | Jeff Bolz | vulkan: fix data race in mul_mat_id shader (llama/19790) |
| 2026-02-25 | Max Krasnyansky | hexagon refactor all Ops to use local context struct... |
| 2026-02-25 | Alberto Cabrera... | ggml-cpu: arm64: q5_K repack gemm and gemv (and generic... |
| 2026-02-25 | Gaurav Garg | Improve CUDA graph capture (llama/19754) |
| 2026-02-25 | Taimur Ahmad | ggml-cpu: add RVV vec dot kernels for quantization... |
| 2026-02-25 | Jeff Bolz | test: mul_mat tests with huge batch size (llama/19519) |
| 2026-02-25 | Masashi Yoshimura | ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support... |
| 2026-02-25 | Ruben Ortlam | vulkan: fix MMQ shader push constants and multi-dispatc... |
| 2026-02-25 | Johannes Gäßler | CUDA: fix kernel selection logic for tile FA (llama... |
| 2026-02-25 | shalinib-ibm | llamafile: powerpc: add FP16 MMA path for Q4/Q8 matmul... |
| 2026-02-25 | Reese Levine | ggml webgpu: Fix bug in dispatching large matrix-vector... |
| 2026-02-25 | Georgi Gerganov | sync : llama.cpp |
| 2026-02-25 | Reese Levine | ggml webgpu: shader library organization (llama/19530) |
| 2026-02-25 | Georgi Gerganov | sync : llama.cpp |
| 2026-02-25 | Jeff Bolz | vulkan: split mul_mat into multiple dispatches to avoid... |
| 2026-02-25 | shaofeiqi | opencl: refactor expm1 and softplus (llama/19404) |
| 2026-02-25 | shaofeiqi | opencl: optimize mean and sum_row kernels (llama/19614) |
| 2026-02-25 | Talha Can Havadar | ggml: ggml-cpu: force-no-lto-for-cpu-feats (llama/19609) |
| 2026-02-25 | Georgi Gerganov | cuda : enable CUDA graphs for MMID 1 <= BS <= 4 (llama... |
| 2026-02-25 | Judd | ggml : make `ggml_is_view` as API (llama/19539) |
| 2026-02-25 | Mario Limonciello | Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer... |
| 2026-02-25 | abhijain1204fujitsu | ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k... |
| 2026-02-25 | David Friehs | cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization ... |
| 2026-02-25 | Daniel Bevenius | cmake : check if KleidiAI API has been fetched (llama... |
| 2026-02-25 | Georgi Gerganov | ggml : avoid UB in gemm ukernel (llama/19642) |
| 2026-02-25 | Aaron Teo | ggml-cpu: optimize ggml_vec_dot_bf16 for s390x (llama... |
| 2026-02-25 | Aman Gupta | ggml-cpu: FA add GEMM microkernel (llama/19422) |
| 2026-02-25 | SamareshSingh | cmake : fix KleidiAI install target failure with EXCLUD... |
| 2026-02-25 | Salman Chishti | ci : Upgrade GitHub Actions for Node 24 compatibility... |
| 2026-02-15 | Georgi Gerganov | ggml : bump version to 0.9.7 (#1425) upstream/0.9.7 v0.9.7 |
| 2026-02-15 | Georgi Gerganov | sync : whisper.cpp |
| 2026-02-14 | Georgi Gerganov | sync : llama.cpp |
| 2026-02-14 | Georgi Gerganov | models : optimize qwen3next graph (llama/19375) |
| 2026-02-14 | Adrien Gallouët | ggml : fix GGML_DEBUG with OpenMP (llama/19599) |
| 2026-02-14 | Georgi Gerganov | metal : fix ACC op (llama/19427) |
| 2026-02-14 | Jeff Bolz | vulkan: support L2_NORM with contiguous rows (llama... |
| 2026-02-14 | Jeff Bolz | vulkan: support GGML_OP_SET (llama/19584) |
| 2026-02-14 | Sophon | vulkan: Add vendor id for Qualcomm drivers (llama/19569) |
| 2026-02-14 | Max Krasnyansky | hexagon: further optimizations and refactoring for... |
| 2026-02-14 | Jeff Bolz | vulkan: restore -inf check in FA shaders (llama/19582) |
| 2026-02-14 | Alberto Cabrera... | Fix wrong memcpy length for block_interleave == 4 ... |
| 2026-02-14 | ymcki | fix vulkan ggml_acc only works in 3d but not 4d (llama... |
| 2026-02-14 | Aman Gupta | CUDA: loop over ne2*ne3 in case it overflows (llama... |
| 2026-02-14 | Oliver Simons | CUDA: Do not mutate cgraph for fused ADDs (llama/19566) |
| 2026-02-14 | Georgi Gerganov | metal : improve concurrency (llama/19555) |
| 2026-02-14 | Georgi Gerganov | metal : support GGML_OP_SET (llama/19548) |
| 2026-02-14 | Shupei Fan | hexagon: fix typo in vtcm_needs_release (llama/19545) |