| 2025-12-11 |
Jeff Bolz | vulkan: enable mmvq for q2_k on NVIDIA (llama/17675) |
| 2025-12-11 |
Jeff Bolz | vulkan: set all memory allocations to high priority... |
| 2025-12-11 |
Georgi Gerganov | rpc : fix alloc size logic (llama/17116) |
| 2025-12-11 |
Georgi Gerganov | metal : add residency sets keep-alive heartbeat (llama... |
| 2025-12-11 |
Johannes Gäßler | HIP : fix RDNA4 build (llama/17792) |
| 2025-12-11 |
shalinib-ibm | Q4/Q8 Tiled Gemm Optimization. (llama/16999) |
| 2025-12-11 |
Johannes Gäßler | CUDA: fix FA VKQ accumulator overflow (llama/17746) |
| 2025-12-11 |
Jiacheng (Jason... | HIP: enable WMMA-MMQ INT kernels for RDNA 3 (llama... |
| 2025-12-11 |
Piotr Wilkin... | Add support for CUMSUM and TRI for CUDA. (llama/17584) |
| 2025-12-11 |
Gabe Goodhart | metal: TRI, FILL, EXPM1, SOFTPLUS (llama/16623) |
| 2025-12-11 |
Alberto Cabrera... | ggml-cpu : remove asserts always evaluating to false... |
| 2025-12-11 |
Georgi Gerganov | metal : use params per pipeline instance (llama/17739) |
| 2025-12-11 |
Adrien Gallouët | build : move _WIN32_WINNT definition to headers (llama... |
| 2025-12-11 |
Herman Semenoff | ggml-cpu: remove duplicate conditional check 'iid'... |
| 2025-12-11 |
Johannes Gäßler | CUDA: generalized (mma) FA, add Volta support (llama... |
| 2025-12-11 |
Georgi Gerganov | metal : fix data race in pipeline library (llama/17731) |
| 2025-12-11 |
Reese Levine | ggml webgpu: add support for emscripten builds (llama... |
| 2025-12-11 |
Jeff Bolz | vulkan: Reduce temporary memory usage for TOP_K (llama... |
| 2025-12-11 |
xiaobing318 | cmake : add utf8 compilation options for msvc (llama... |
| 2025-12-11 |
Adrien Gallouët | ggml : use svcntb() for SVE vector length detection... |
| 2025-12-11 |
TianHao324 | CANN: Disable Ger operator of OUT_PROD on 310p device... |
| 2025-12-11 |
Daniel Bevenius | ggml : remove redundant n_copies check when setting... |
| 2025-12-11 |
Adrien Gallouët | ggml : add fallback definition for HWCAP2_SVE2 (llama... |
| 2025-12-11 |
Aman Gupta | ggml-cuda: reorder only relevant nodes (llama/17639) |
| 2025-12-11 |
Neo Zhang Jianyu | enhance argsort for UT (llama/17573) |
| 2025-12-11 |
Georgi Gerganov | metal : add FA head size 48 (llama/17619) |
| 2025-12-11 |
Georgi Gerganov | ggml : extend the GGML_SCHED_NO_REALLOC debug logic... |
| 2025-12-11 |
Aman Gupta | llama-graph: avoid expand_forward for fusion (llama... |
| 2025-12-11 |
Tarek Dakhran | model: LFM2-VL fixes (llama/17577) |
| 2025-12-11 |
Gilad S. | ggml: fix: macOS build with `-DGGML_BACKEND_DL=ON`... |
| 2025-12-11 |
Aman Gupta | CUDA: add stream-based concurrency (llama/16991) |
| 2025-12-11 |
Mahekk Shaikh | cuda : add error checking for cudaMemcpyAsync in argsor... |
| 2025-12-11 |
Acly | vulkan : fix FA mask load with bounds check (coopmat2... |
| 2025-12-11 |
Neo Zhang | sycl : support to malloc memory on device more than... |
| 2025-12-11 |
ixgbe | ggml: replace hwcap with riscv_hwprobe for RVV detectio... |
| 2025-12-11 |
Ruben Ortlam | Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support... |
| 2025-12-11 |
Jeff Bolz | vulkan: improve topk perf for large k, fix overflow... |
| 2025-12-11 |
Diego Devesa | ggml : add GGML_SCHED_NO_REALLOC option to disable... |
| 2025-12-11 |
R0CKSTAR | enable fp16/fast_fp16/bf16_mma on PH1 (llama/17551) |
| 2025-12-11 |
Aman Gupta | ggml-cuda: add stricter checking for fusion (llama... |
| 2025-12-11 |
Piotr Wilkin... | model : Qwen3 Next (llama/16095) |
| 2025-12-11 |
Johannes Gäßler | CUDA: no FP16 arithmetic for vector FA kernel (llama... |
| 2025-12-11 |
Jeff Bolz | vulkan: Implement GGML_OP_TRI (llama/17503) |
| 2025-12-11 |
Radoslav Gerganov | rpc : cache and reuse compute graphs (llama/15405) |
| 2025-12-11 |
yulo | HIP: enable mul_mat_f for RDNA4 (llama/17437) |
| 2025-12-11 |
Piotr Wilkin... | SOLVE_TRI CUDA kernel for small matrices (llama/17457) |
| 2025-12-11 |
Neo Zhang Jianyu | refactor pad_reflect_1d to make the UT case pass (llama... |
| 2025-12-11 |
Jeff Bolz | vulkan: Implement SOLVE_TRI (llama/17486) |
| 2025-12-11 |
matt23654 | cuda : fix UMA detection on discrete GPUs. (llama/17537) |
| 2025-12-11 |
Alberto Cabrera... | ggml-cpu: aarm64: q4_K repack gemm and gemv implementat... |
| 2025-12-11 |
Acly | vulkan : move contiguous checks to device_supports_op... |
| 2025-12-11 |
Jeff Bolz | vulkan: use a fixed 1KB buffer for the add_rms_fusion... |
| 2025-12-11 |
lhez | opencl: add sqr, sqrt, mean and ssm_conv (llama/17476) |
| 2025-12-11 |
Alberto Cabrera... | Fix chunks being too small with small matrix sizes... |
| 2025-12-11 |
Jeff Bolz | vulkan: allow graph_optimize for prompt processing... |
| 2025-12-11 |
Jeff Bolz | vulkan: Implement top-k (llama/17418) |
| 2025-12-11 |
xctan | ggml-cpu : add RISC-V Zvfh impl for ggml_vec_mad_f16... |
| 2025-12-11 |
Adrien Gallouët | ggml : fix ARM feature verification (llama/17519) |
| 2025-12-11 |
Jiacheng (Jason... | HIP: Patch failed testcase in WMMA-MMQ kernels for... |
| 2025-12-11 |
hipudding | CANN: Add MROPE and IMROPE support (llama/17401) |
| 2025-12-11 |
Jeff Bolz | vulkan: Implement GGML_OP_CUMSUM (llama/17479) |
| 2025-12-11 |
Georgi Gerganov | ggml : add ggml_top_k (llama/17365) |
| 2025-12-11 |
TianHao324 | CANN: supports out_prod operator for F32 and F16 (llama... |
| 2025-12-11 |
Jeff Bolz | vulkan: Use fewer rows for scalar FA when HS is not... |
| 2025-12-11 |
Jeff Bolz | vulkan: more FA details in vk_perf_logger (llama/17443) |
| 2025-12-11 |
Jiacheng (Jason... | HIP: WMMA-MMQ kernels for RDNA 4 (llama/17156) |
| 2025-12-11 |
Alberto Cabrera... | ggml-cpu: arm64: q4_K repack gemm and gemv implementati... |
| 2025-12-11 |
ixgbe | ggml: add RISC-V cpu-feats (llama/17461) |
| 2025-12-11 |
Max Krasnyansky | hexagon: add support for ROPE_NEOX (llama/17458) |
| 2025-12-11 |
Raul Torres | CANN: Define `cann_graph_update_required` before macro... |
| 2025-12-11 |
M. Mediouni | ggml-hexagon: Initial Hexagon v68/v69 support (llama... |
| 2025-12-11 |
nullname | ggml-hexagon: add `hex_supported_buffer` for better... |
| 2025-12-11 |
Sigbjørn Skjæret | cuda : support non-contiguous i32 to i32 copy (llama... |
| 2025-12-11 |
Jeff Bolz | vulkan: remove a couple unnecessary switches (llama... |
| 2025-12-11 |
Masato Nakasaka | Revive MUL_MAT_ID to perf testing (llama/17397) |
| 2025-12-11 |
yulo | HIP: RDNA4 tensor core support for MMF (llama/17077) |
| 2025-12-11 |
lhez | opencl: refine condition for kqv mm (llama/17392) |
| 2025-12-11 |
Jeff Bolz | vulkan: disable async for older Intel devices (llama... |
| 2025-12-11 |
Raul Torres | CANN: Refactor `evaluate_and_capture_cann_graph` (llama... |
| 2025-12-11 |
nullname | ggml-hexagon: fix swiglu failure at `test-backend-ops... |
| 2025-12-11 |
Piotr Wilkin... | ggml : Fix transposed SOLVE_TRI result (llama/17323) |
| 2025-12-11 |
Scott Fudally | DGX Spark: UMA support (llama/17368) |
| 2025-12-11 |
Adrien Gallouët | ggml : remove useless and error-prone variadic macros... |
| 2025-12-11 |
sudhiarm | kleidiai: fix zero-size array declaration (llama/17240) |
| 2025-12-11 |
ixgbe | ggml-cpu:add RISC-V RVV (Zvfh) optimization for FP16... |
| 2025-12-11 |
Giuseppe Scrivano | vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP... |
| 2025-12-11 |
Jeff Bolz | vulkan: support larger argsort (llama/17313) |
| 2025-12-11 |
Jeff Bolz | vulkan: Add copy_transpose shader (llama/17371) |
| 2025-12-11 |
Aman Gupta | cuda: fix rope fusion for gemma3 (llama/17378) |
| 2025-12-11 |
Piotr Wilkin... | Fix too relaxed check on CUDA "fast copy" (can_be_trans... |
| 2025-12-11 |
Ruben Ortlam | vulkan: force full subgroups for flash attention to... |
| 2025-12-11 |
Jeremy Rand | ggml-cpu: Don't pass -mpowerpc64 when -mcpu already... |
| 2025-12-11 |
Chenguang Li | CANN: fix acl_tensor_ptr usage in ASCEND_310P ROPE... |
| 2025-12-11 |
Jeff Bolz | vulkan: support noncontig i32 copy (llama/17328) |
| 2025-12-11 |
Ruben Ortlam | vulkan: add log RTE support to fix Nvidia CI (llama... |
| 2025-12-11 |
Adrien Gallouët | cmake : fix ARM feature verification (llama/17170) |
| 2025-12-11 |
Adrien Gallouët | ggml : add missing AVX512 feature checks (llama/17270) |
| 2025-11-24 |
Daniel Bevenius | ggml : remove dirty flag from version string (#1391) |
| 2025-11-20 |
Georgi Gerganov | sync : whisper.cpp |
| 2025-11-20 |
YangLe | metal : fix compile on macos 11 (whisper/3533) |