2025-08-14 |
Jason Ni | ggml: fix ggml_conv_1d_dw bug (#1323) upstream/0.0.2446 |
2025-08-14 |
Georgi Gerganov | mnist : adapt to opt changes |
2025-08-14 |
Georgi Gerganov | tests : remove unused includes (#0) |
2025-08-14 |
Georgi Gerganov | sync : llama.cpp |
2025-08-14 |
Sigbjørn Skjæret | cuda : fix GGML_CUDA_GRAPHS=OFF (llama/15300) |
2025-08-14 |
Jonathan Graehl | finetune: SGD optimizer, more CLI args (llama/13873) |
2025-08-14 |
uvos | HIP: bump requirement to rocm 6.1 (llama/15296) |
2025-08-14 |
Georgi Gerganov | sync : llama.cpp |
2025-08-14 |
Judd | ggml : update `ggml_rope_multi` (llama/12665) |
2025-08-14 |
Georgi Gerganov | ggml : repack block_iq4_nlx8 (llama/14904) |
2025-08-14 |
Oliver Simons | CUDA: Optimize `reduce_rows_f32` kernel, leading up... |
2025-08-14 |
Tak-RS | ggml-rpc: chunk send()/recv() to avoid EINVAL for very... |
2025-08-14 |
uvos | HIP: disable sync warp shuffel operators from clr amd_w... |
2025-08-14 |
Romain Biessy | sycl: Fix and disable more configurations of mul_mat... |
2025-08-14 |
rmatif | opencl: allow mixed f16/f32 `add` (llama/15140) |
2025-08-14 |
Aman Gupta | CUDA cmake: add `-lineinfo` for easier debug (llama... |
2025-08-14 |
Chenguang Li | CANN: GGML_OP_CPY optimization (llama/15070) |
2025-08-14 |
R0CKSTAR | musa: fix failures in test-backend-ops for mul_mat_id... |
2025-08-14 |
hipudding | CANN: Add broadcast for softmax and FA (llama/15208) |
2025-08-14 |
Charles Xu | kleidiai: fix unsigned overflow bug (llama/15150) |
2025-08-14 |
David Zhao | cuda: refactored ssm_scan and use CUB (llama/13291) |
2025-08-14 |
Aman Gupta | CUDA: add attention sinks for tile and wmma (llama... |
2025-08-14 |
compilade | gguf-py : add Numpy MXFP4 de/quantization support ... |
2025-08-14 |
AN Long | ggml : fix field name when new ggml_backend (llama... |
2025-08-14 |
Johannes Gäßler | CUDA: attention sinks for mma FlashAttention (llama... |
2025-08-14 |
lhez | opencl: support sink in `soft_max` (attn sinks) (llama... |
2025-08-14 |
Jeff Bolz | vulkan: support fattn sinks (llama/15126) |
2025-08-14 |
Jeff Bolz | vulkan: Add env var to disable host visible vidmem... |
2025-08-14 |
uvos | HIP: add cmake option to enable compiler output of... |
2025-08-14 |
Christian Kastner | ggml: Skip backend library linking code when GGML_BACKE... |
2025-08-14 |
Johannes Gäßler | CUDA: GEMM for FP32/FP16/BF16 and ne11 <= 16 (llama... |
2025-08-14 |
rmatif | fix profiling crash (llama/15072) |
2025-08-14 |
lhez | opencl: add `swiglu_oai` and `add_id` (llama/15121) |
2025-08-14 |
Diego Devesa | ggml : fix fallback to CPU for ununsupported ops (llama... |
2025-08-14 |
Chenguang Li | CANN: add support for ACL Graph (llama/15065) |
2025-08-14 |
Georgi Gerganov | llama : add gpt-oss (llama/15091) |
2025-08-14 |
Romain Biessy | sycl: fix mul_mat selection (llama/15092) |
2025-08-14 |
Christian Kastner | cmake: Add GGML_BACKEND_DIR option (llama/15074) |
2025-08-14 |
Jeff Bolz | vulkan: fix build when using glslang that does not... |
2025-08-14 |
Jeff Bolz | vulkan: Use coopmat2 for conv2d (llama/14982) |
2025-08-14 |
lhez | opencl: fix adreno compiler detection logic (llama... |
2025-08-14 |
Johannes Gäßler | CUDA: use mma FA kernel for gqa > 4 on RTX 4000 (llama... |
2025-08-02 |
Georgi Gerganov | sync : llama.cpp upstream/0.0.2404 |
2025-08-02 |
leejet | cuda: make im2col a little faster (llama/15025) |
2025-08-02 |
Georgi Gerganov | cuda, sycl : fix batched gemm when ne02 == 1 && ne03... |
2025-08-02 |
Jeff Bolz | vulkan: coopmat2 mul_mat optimizations (llama/14934) |
2025-08-02 |
Jeff Bolz | vulkan: Support ne[3]>1 in noncontig matrix-vector... |
2025-08-02 |
Jeff Bolz | vulkan: optimizations for direct convolution (llama... |
2025-08-02 |
Johannes Gäßler | CUDA: fix MMQ nwarps for AMD with warp_size==32 (llama... |
2025-08-02 |
lhez | opencl: add f16 for `add`, `sub`, `mul`, `div` (llama... |
2025-08-02 |
Srihari-mcw | ggml : Q2k interleaving implementation - x86/x64 SIMD... |
2025-08-02 |
diannao | docker : add cann build pipline (llama/14591) |
2025-08-02 |
Ruben Ortlam | Vulkan: Fix minor debug mode issues (llama/14899) |
2025-08-02 |
hipudding | CANN: Improve loading efficiency after converting weigh... |
2025-08-02 |
lhez | opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f3... |
2025-08-02 |
uvos | HIP: enable mfma mmq on gfx908 and gfx90a for select... |
2025-08-02 |
Johannes Gäßler | CUDA: skip masked KV slices for all FA kernels (llama... |
2025-08-02 |
uvos | HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly... |
2025-08-02 |
uvos | HIP: add GGML_HIP_MMQ_MFMA option to allow disableing... |
2025-08-02 |
uvos | HIP: Ignore unsupported unroll transformation in fattn... |
2025-08-02 |
hipudding | CANN: Add ggml_set_rows (llama/14943) |
2025-08-02 |
Sigbjørn Skjæret | cuda : add softcap fusion (llama/14907) |
2025-08-02 |
Aman Gupta | CUDA: add roll (llama/14919) |
2025-08-02 |
Leonard Mosescu | test-backend-ops : extend test case filtering (llama... |
2025-08-02 |
xctan | ggml-cpu : deduplicate scalar implementations (llama... |
2025-08-02 |
Akarshan Biswas | SYCL: Add set_rows support for quantized types (llama... |
2025-08-02 |
Johannes Gäßler | CUDA: fix pointer incrementation in FA (llama/14916) |
2025-08-02 |
Alberto Cabrera... | sycl: refactor quantization to q8_1 (llama/14815) |
2025-08-02 |
Kai Pastor | ci : Move msvc to matrix (#1318) |
2025-08-02 |
AN Long | simple : fix typo (#1319) |
2025-07-30 |
Georgi Gerganov | sync : whisper.cpp |
2025-07-30 |
Kai Pastor | cmake : Fix BLAS link interface (#1316) |
2025-07-30 |
Kai Pastor | vulkan : fix 32-bit builds (#1313) |
2025-07-28 |
Georgi Gerganov | sync : llama.cpp |
2025-07-28 |
Erik Scholz | vulkan : add fp16 support for the conv_2d kernel (llama... |
2025-07-28 |
Jeff Bolz | vulkan: skip empty set_rows to avoid invalid API usage... |
2025-07-28 |
Aman Gupta | Docs: add instructions for adding backends (llama/14889) |
2025-07-28 |
deepsek | HIP: Enable Matrix cores for MMQ Kernels, Enable stream... |
2025-07-28 |
hipudding | CANN: Implement GLU ops (llama/14884) |
2025-07-28 |
R0CKSTAR | musa: fix build warnings (unused variable) (llama/14869) |
2025-07-28 |
Aaron Teo | ggml-cpu : disable GGML_NNPA by default due to instabil... |
2025-07-28 |
Gabe Goodhart | metal: SSM_SCAN performance (llama/14743) |
2025-07-28 |
lhez | opencl: add fused `rms_norm_mul` (llama/14841) |
2025-07-28 |
Oliver Simons | ggml : remove invalid portPos specifiers from dot files... |
2025-07-28 |
Chris Rohlf | rpc : check for null buffers in get/set/copy tensor... |
2025-07-28 |
Diego Devesa | sched : fix multiple evaluations of the same graph... |
2025-07-28 |
R0CKSTAR | musa: upgrade musa sdk to rc4.2.0 (llama/14498) |
2025-07-25 |
Georgi Gerganov | contrib : recommend PRs to llama.cpp (#1312) |
2025-07-24 |
Kai Pastor | cmake : Indent ggml-config.cmake (#1310) |
2025-07-24 |
Georgi Gerganov | sync : llama.cpp |
2025-07-24 |
Alberto Cabrera... | sycl: fixed semantics of block offset calculation ... |
2025-07-24 |
Georgi Gerganov | metal : fix fusion across different encoders (llama... |
2025-07-24 |
Donghyeon Jeong | sycl: fix undefined variable in work group size check... |
2025-07-24 |
Johannes Gäßler | CUDA: fix overflow in FA, tune performance (llama/14840) |
2025-07-24 |
Johannes Gäßler | CUDA: fix compilation with GGML_CUDA_F16 (llama/14837) |
2025-07-24 |
Johannes Gäßler | CUDA: fix quantized KV cache + multiple sequences ... |
2025-07-24 |
Georgi Gerganov | tests : add non-cont K,V FA tests |
2025-07-24 |
lixing-star | ggml: fix loongarch quantize_row_q8_1 error (llama... |
2025-07-24 |
chen fan | CANN: weight format to NZ for Ascend310P3 (llama/14407) |
2025-07-24 |
Aman Gupta | CUDA: add fused rms norm (llama/14800) |