| 2025-12-11 |
Acly | vulkan : move contiguous checks to device_supports_op... |
| 2025-12-11 |
Jeff Bolz | vulkan: use a fixed 1KB buffer for the add_rms_fusion... |
| 2025-12-11 |
lhez | opencl: add sqr, sqrt, mean and ssm_conv (llama/17476) |
| 2025-12-11 |
Alberto Cabrera... | Fix chunks being too small with small matrix sizes... |
| 2025-12-11 |
Jeff Bolz | vulkan: allow graph_optimize for prompt processing... |
| 2025-12-11 |
Jeff Bolz | vulkan: Implement top-k (llama/17418) |
| 2025-12-11 |
xctan | ggml-cpu : add RISC-V Zvfh impl for ggml_vec_mad_f16... |
| 2025-12-11 |
Adrien Gallouët | ggml : fix ARM feature verification (llama/17519) |
| 2025-12-11 |
Jiacheng (Jason... | HIP: Patch failed testcase in WMMA-MMQ kernels for... |
| 2025-12-11 |
hipudding | CANN: Add MROPE and IMROPE support (llama/17401) |
| 2025-12-11 |
Jeff Bolz | vulkan: Implement GGML_OP_CUMSUM (llama/17479) |
| 2025-12-11 |
Georgi Gerganov | ggml : add ggml_top_k (llama/17365) |
| 2025-12-11 |
TianHao324 | CANN: supports out_prod operator for F32 and F16 (llama... |
| 2025-12-11 |
Jeff Bolz | vulkan: Use fewer rows for scalar FA when HS is not... |
| 2025-12-11 |
Jeff Bolz | vulkan: more FA details in vk_perf_logger (llama/17443) |
| 2025-12-11 |
Jiacheng (Jason... | HIP: WMMA-MMQ kernels for RDNA 4 (llama/17156) |
| 2025-12-11 |
Alberto Cabrera... | ggml-cpu: arm64: q4_K repack gemm and gemv implementati... |
| 2025-12-11 |
ixgbe | ggml: add RISC-V cpu-feats (llama/17461) |
| 2025-12-11 |
Max Krasnyansky | hexagon: add support for ROPE_NEOX (llama/17458) |
| 2025-12-11 |
Raul Torres | CANN: Define `cann_graph_update_required` before macro... |
| 2025-12-11 |
M. Mediouni | ggml-hexagon: Initial Hexagon v68/v69 support (llama... |
| 2025-12-11 |
nullname | ggml-hexagon: add `hex_supported_buffer` for better... |
| 2025-12-11 |
Sigbjørn Skjæret | cuda : support non-contiguous i32 to i32 copy (llama... |
| 2025-12-11 |
Jeff Bolz | vulkan: remove a couple unnecessary switches (llama... |
| 2025-12-11 |
Masato Nakasaka | Revive MUL_MAT_ID to perf testing (llama/17397) |
| 2025-12-11 |
yulo | HIP: RDNA4 tensor core support for MMF (llama/17077) |
| 2025-12-11 |
lhez | opencl: refine condition for kqv mm (llama/17392) |
| 2025-12-11 |
Jeff Bolz | vulkan: disable async for older Intel devices (llama... |
| 2025-12-11 |
Raul Torres | CANN: Refactor `evaluate_and_capture_cann_graph` (llama... |
| 2025-12-11 |
nullname | ggml-hexagon: fix swiglu failure at `test-backend-ops... |
| 2025-12-11 |
Piotr Wilkin... | ggml : Fix transposed SOLVE_TRI result (llama/17323) |
| 2025-12-11 |
Scott Fudally | DGX Spark: UMA support (llama/17368) |
| 2025-12-11 |
Adrien Gallouët | ggml : remove useless and error-prone variadic macros... |
| 2025-12-11 |
sudhiarm | kleidiai: fix zero-size array declaration (llama/17240) |
| 2025-12-11 |
ixgbe | ggml-cpu:add RISC-V RVV (Zvfh) optimization for FP16... |
| 2025-12-11 |
Giuseppe Scrivano | vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP... |
| 2025-12-11 |
Jeff Bolz | vulkan: support larger argsort (llama/17313) |
| 2025-12-11 |
Jeff Bolz | vulkan: Add copy_transpose shader (llama/17371) |
| 2025-12-11 |
Aman Gupta | cuda: fix rope fusion for gemma3 (llama/17378) |
| 2025-12-11 |
Piotr Wilkin... | Fix too relaxed check on CUDA "fast copy" (can_be_trans... |
| 2025-12-11 |
Ruben Ortlam | vulkan: force full subgroups for flash attention to... |
| 2025-12-11 |
Jeremy Rand | ggml-cpu: Don't pass -mpowerpc64 when -mcpu already... |
| 2025-12-11 |
Chenguang Li | CANN: fix acl_tensor_ptr usage in ASCEND_310P ROPE... |
| 2025-12-11 |
Jeff Bolz | vulkan: support noncontig i32 copy (llama/17328) |
| 2025-12-11 |
Ruben Ortlam | vulkan: add log RTE support to fix Nvidia CI (llama... |
| 2025-12-11 |
Adrien Gallouët | cmake : fix ARM feature verification (llama/17170) |
| 2025-12-11 |
Adrien Gallouët | ggml : add missing AVX512 feature checks (llama/17270) |
| 2025-11-24 |
Daniel Bevenius | ggml : remove dirty flag from version string (#1391) |
| 2025-11-20 |
Georgi Gerganov | sync : whisper.cpp |
| 2025-11-20 |
YangLe | metal : fix compile on macos 11 (whisper/3533) |
| 2025-11-17 |
Georgi Gerganov | sync : llama.cpp |
| 2025-11-17 |
Georgi Gerganov | metal : support I32 -> I32 copy (llama/17317) |
| 2025-11-17 |
Georgi Gerganov | metal : faster argsort (llama/17315) |
| 2025-11-17 |
Georgi Gerganov | metal : add cumsum (llama/17305) |
| 2025-11-17 |
hipudding | CANN: Use smart pointers to manage ACL objects (llama... |
| 2025-11-17 |
Pavels Zaicenkovs | vulkan: add LOG operation support for F32 and F16 ... |
| 2025-11-17 |
Ruben Ortlam | vulkan: fix MMQ quantize_y condition (llama/17301) |
| 2025-11-17 |
Georgi Gerganov | metal : remove obosolete asserts (llama/17295) |
| 2025-11-17 |
lhez | opencl: fix rms_norm_mul (llama/17250) |
| 2025-11-17 |
shaofeiqi | opencl: add kernel to handle mat mul in attention to... |
| 2025-11-17 |
shani-f | sycl : unify unary kernels with a generic implementatio... |
| 2025-11-17 |
Jeff Bolz | vulkan: Fuse mul_mat_id+add_id+mul and mul_mat+add... |
| 2025-11-17 |
Ruben Ortlam | vulkan: Replace 16-bit unpack8 calls to work around... |
| 2025-11-17 |
Giuseppe Scrivano | vulkan: implement ABS and NEG (llama/17245) |
| 2025-11-17 |
Jeff Bolz | vulkan: Use ggml_vk_tensor_subbuffer in mul_mat_vec... |
| 2025-11-17 |
Jeff Bolz | vulkan: skip all-negative-inf blocks in FA (llama/17186) |
| 2025-11-17 |
Jeff Bolz | vulkan: change graph_compute to be async and enable... |
| 2025-11-17 |
Georgi Gerganov | metal : support argsort for ne00 > 1024 (llama/17247) |
| 2025-11-17 |
Georgi Gerganov | metal : make the FA extra sizes consistent (llama/17143) |
| 2025-11-17 |
Alberto Cabrera... | ggml-cpu: handle 3d tensors in repack mat_mul (llama... |
| 2025-11-17 |
Piotr Wilkin... | ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM... |
| 2025-11-17 |
Ruben Ortlam | vulkan: remove shell call from vulkan-shaders-gen tool... |
| 2025-11-17 |
Diego Devesa | sched : fix reserve ignoring user tensor assignments... |
| 2025-11-17 |
ixgbe | ggml-cpu : add RISC-V vector intrinsic support for... |
| 2025-11-17 |
bagheera | metal: accelerated conv2d (llama/17175) |
| 2025-11-17 |
Georgi Gerganov | Revert "ggml-cpu: handle 3d tensors in repack mat_mul... |
| 2025-11-17 |
Diego Devesa | ggml-cpu : use template for argsort (llama/17222) |
| 2025-11-17 |
TecJesh | CANN: Add cross_entropy_loss op support (llama/16886) |
| 2025-11-17 |
Aman Gupta | CUDA: fuse rope + set_rows (llama/16884) |
| 2025-11-17 |
Johannes Gäßler | CUDA: static assert to prevent misuse of memcpy_1 ... |
| 2025-11-17 |
Georgi Gerganov | ggml : use std::sort in ggml_argsort CPU implementation... |
| 2025-11-17 |
Alberto Cabrera... | ggml-cpu: handle 3d tensors in repack mat_mul (llama... |
| 2025-11-17 |
TecJesh | CANN: Add L2_NORM op support (llama/16856) |
| 2025-11-17 |
Neo Zhang Jianyu | fix ci crash about SSM_CONV (llama/17169) |
| 2025-11-17 |
Max Krasnyansky | hexagon: various Op fixes (llama/17135) |
| 2025-11-17 |
Eve | disable rms norm mul rope for chips with no fp16 rte... |
| 2025-11-17 |
ixgbe | ggml-cpu : add RISC-V RVV (Zvfh) optimization for FP16... |
| 2025-11-17 |
duduta | ggml-cpu: templateify ggml_compute_forward_rope_f32... |
| 2025-11-17 |
Charles Xu | kleidiai: add optimized per-channel kernels for Q8_0... |
| 2025-11-17 |
Mike Abbott | cmake : add version to all shared object files (llama... |
| 2025-11-17 |
lhez | opencl: add fastdiv and use it in set_rows, ported... |
| 2025-11-17 |
Max Krasnyansky | cpu: skip NOPs to avoid barriers (llama/17133) |
| 2025-11-17 |
Georgi Gerganov | metal : cap threadgroups size of set_rows (llama/17146) |
| 2025-11-17 |
Adrien Gallouët | ggml-cpu : inspect -march and -mcpu to found the CPU... |
| 2025-11-17 |
Ruben Ortlam | vulkan: check glslc executable string (llama/17144) |
| 2025-11-17 |
Ruben Ortlam | vulkan: fix validation issue introduced by #16868 ... |
| 2025-11-17 |
Georgi Gerganov | metal : enable tensor API for A19 (llama/17087) |
| 2025-11-17 |
fj-y-saito | arm64: add i8mm route with SVE ggml_vec_dot_q4_K_q8_K... |
| 2025-11-17 |
Acly | cuda/vulkan : bicubic interpolation (llama/17022) |
| 2025-11-17 |
Ruben Ortlam | vulkan: fix memory allocations (llama/17122) |