| 2025-12-18 | Jeff Bolz | vulkan: Multi-pass softmax for large number of cols... |
| 2025-12-18 | Jeff Bolz | vulkan: Allow non-pow2 n_experts in topk_moe (llama... |
| 2025-12-18 | Johannes Gäßler | CUDA: fix overflow in MMA kernel without stream-k ... |
| 2025-12-18 | Sigbjørn Skjæret | cann : fix ops broken by circular padding guard (llama... |
| 2025-12-18 | ixgbe | ggml-cpu : fix RISC-V Q4_0 repack select and RVV featur... |
| 2025-12-18 | yulo | HIP: enable mmf for RDNA3 (llama/17879) |
| 2025-12-18 | Piotr Wilkin... | SOLVE_TRI extension to more dimensions (llama/17793) |
| 2025-12-17 | Russ | build: link whisper target against Threads::Threads... |
| 2025-12-13 | Marcos Del... | server: allow custom temp directory for ffmpeg (#3564) |
| 2025-12-13 | Georgi Gerganov | ggml : arm repack fix build (#0) |
| 2025-12-12 | Georgi Gerganov | talk-llama : sync llama.cpp |
| 2025-12-12 | Georgi Gerganov | sync : ggml |
| 2025-12-12 | Georgi Gerganov | whisper : adjust to ggml changes (#0) |
| 2025-12-12 | Congcong Cai | cmake : set `CMAKE_RUNTIME_OUTPUT_DIRECTORY` for non... |
| 2025-12-12 | Georgi Gerganov | ggml-alloc : fix reuse-parent logic for misaligned... |
| 2025-12-12 | nullname | ggml-hexagon: fix `rope` failure at `test-backend-ops... |
| 2025-12-12 | Max Krasnyansky | Fix race conditions in threadpool when dealing with... |
| 2025-12-12 | Georgi Gerganov | ggml : remove GGML_KQ_MASK_PAD constant (llama/17910) |
| 2025-12-12 | Sigbjørn Skjæret | cuda : add missing support check for xielu (llama/17895) |
| 2025-12-12 | Johannes Gäßler | CUDA: fix unpadded strides in MMA FA kernel (llama... |
| 2025-12-12 | Neo Zhang Jianyu | fix softmax for iGPU (llama/17838) |
| 2025-12-12 | Gabe Goodhart | metal: SSM kernel improvements (llama/17876) |
| 2025-12-12 | Piotr Wilkin... | Add DIAG for CUDA (llama/17873) |
| 2025-12-12 | Gabe Goodhart | ggml : Provide macos-specific backtrace printing to... |
| 2025-12-12 | Georgi Gerganov | metal : print node names for debugging (llama/17882) |
| 2025-12-12 | Sigbjørn Skjæret | ggml : allow fill node alloc inplace (llama/17870) |
| 2025-12-12 | Chenguang Li | CANN: add support for partial RoPE and Vision mode... |
| 2025-12-12 | Johannes Gäßler | CUDA: fix FP16 overflow in tile FA kernel (llama/17875) |
| 2025-12-12 | Jay Zenith | cuda : add FILL op support (llama/17851) |
| 2025-12-12 | wsbagnsv1 | cuda: optimize SOLVE_TRI using registers and FMAF ... |
| 2025-12-12 | ixgbe | ggml-cpu: add ggml_thread_cpu_relax with Zihintpause... |
| 2025-12-12 | lovedheart | Vulkan: improve mul_mat_vec_iq1_m (llama/16907) |
| 2025-12-12 | Law Po Ying | sycl: add missing BF16 conversion support for Intel... |
| 2025-12-12 | Jeff Bolz | vulkan: perf_logger improvements (llama/17672) |
| 2025-12-12 | Vishal Singh | ggml-zendnn : add ZenDNN backend for AMD CPUs (llama... |
| 2025-12-12 | Phylliida Dev | ggml : add circular tiling support to pad, for Vulkan... |
| 2025-12-12 | Johannes Gäßler | HIP: fix RDNA3 FP16/BF16 matrix multiplication (llama... |
| 2025-12-12 | Sky | ggml : improve error handling for search path existence... |
| 2025-12-12 | Jeff Bolz | vulkan: Use one row per workgroup for f32 mmv (llama... |
| 2025-12-12 | Jeff Bolz | vulkan: support solve_tri with larger N/K values (llama... |
| 2025-12-12 | Georgi Gerganov | metal : fix build(#17799) |
| 2025-12-12 | Masato Nakasaka | vulkan: Replace deprecated VK_EXT_validation_features... |
| 2025-12-12 | Masato Nakasaka | vulkan: Fix mismatch in TOPK_MOE unit test (llama/17541) |
| 2025-12-12 | Jeff Bolz | vulkan: add more num_blocks instantiations in rms_norm... |
| 2025-12-12 | Jeff Bolz | vulkan: fix top_k bug when there are ties in the input... |
| 2025-12-12 | Acly | vulkan : support conv-2d with large output size (llama... |
| 2025-12-12 | Reese Levine | ggml webgpu: unary op suppport, code refactoring, ops... |
| 2025-12-12 | Jeff Bolz | vulkan: enable mmvq for q2_k on NVIDIA (llama/17675) |
| 2025-12-12 | Jeff Bolz | vulkan: set all memory allocations to high priority... |
| 2025-12-12 | Georgi Gerganov | rpc : fix alloc size logic (llama/17116) |
| 2025-12-12 | Georgi Gerganov | metal : add residency sets keep-alive heartbeat (llama... |
| 2025-12-12 | Johannes Gäßler | HIP : fix RDNA4 build (llama/17792) |
| 2025-12-12 | shalinib-ibm | Q4/Q8 Tiled Gemm Optimization. (llama/16999) |
| 2025-12-12 | Johannes Gäßler | CUDA: fix FA VKQ accumulator overflow (llama/17746) |
| 2025-12-12 | Jiacheng (Jason... | HIP: enable WMMA-MMQ INT kernels for RDNA 3 (llama... |
| 2025-12-12 | Piotr Wilkin... | Add support for CUMSUM and TRI for CUDA. (llama/17584) |
| 2025-12-12 | Gabe Goodhart | metal: TRI, FILL, EXPM1, SOFTPLUS (llama/16623) |
| 2025-12-12 | Alberto Cabrera... | ggml-cpu : remove asserts always evaluating to false... |
| 2025-12-12 | Georgi Gerganov | metal : use params per pipeline instance (llama/17739) |
| 2025-12-12 | Adrien Gallouët | build : move _WIN32_WINNT definition to headers (llama... |
| 2025-12-12 | Herman Semenoff | ggml-cpu: remove duplicate conditional check 'iid'... |
| 2025-12-12 | Johannes Gäßler | CUDA: generalized (mma) FA, add Volta support (llama... |
| 2025-12-12 | Georgi Gerganov | metal : fix data race in pipeline library (llama/17731) |
| 2025-12-12 | Reese Levine | ggml webgpu: add support for emscripten builds (llama... |
| 2025-12-12 | Jeff Bolz | vulkan: Reduce temporary memory usage for TOP_K (llama... |
| 2025-12-12 | xiaobing318 | cmake : add utf8 compilation options for msvc (llama... |
| 2025-12-12 | Adrien Gallouët | ggml : use svcntb() for SVE vector length detection... |
| 2025-12-12 | TianHao324 | CANN: Disable Ger operator of OUT_PROD on 310p device... |
| 2025-12-12 | Daniel Bevenius | ggml : remove redundant n_copies check when setting... |
| 2025-12-12 | Adrien Gallouët | ggml : add fallback definition for HWCAP2_SVE2 (llama... |
| 2025-12-12 | Aman Gupta | ggml-cuda: reorder only relevant nodes (llama/17639) |
| 2025-12-12 | Neo Zhang Jianyu | enhance argsort for UT (llama/17573) |
| 2025-12-12 | Georgi Gerganov | metal : add FA head size 48 (llama/17619) |
| 2025-12-12 | Georgi Gerganov | ggml : extend the GGML_SCHED_NO_REALLOC debug logic... |
| 2025-12-12 | Aman Gupta | llama-graph: avoid expand_forward for fusion (llama... |
| 2025-12-12 | Tarek Dakhran | model: LFM2-VL fixes (llama/17577) |
| 2025-12-12 | Gilad S. | ggml: fix: macOS build with `-DGGML_BACKEND_DL=ON`... |
| 2025-12-12 | Aman Gupta | CUDA: add stream-based concurrency (llama/16991) |
| 2025-12-12 | Mahekk Shaikh | cuda : add error checking for cudaMemcpyAsync in argsor... |
| 2025-12-12 | Acly | vulkan : fix FA mask load with bounds check (coopmat2... |
| 2025-12-12 | Neo Zhang | sycl : support to malloc memory on device more than... |
| 2025-12-12 | ixgbe | ggml: replace hwcap with riscv_hwprobe for RVV detectio... |
| 2025-12-12 | Ruben Ortlam | Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support... |
| 2025-12-12 | Jeff Bolz | vulkan: improve topk perf for large k, fix overflow... |
| 2025-12-12 | Diego Devesa | ggml : add GGML_SCHED_NO_REALLOC option to disable... |
| 2025-12-12 | R0CKSTAR | enable fp16/fast_fp16/bf16_mma on PH1 (llama/17551) |
| 2025-12-12 | Aman Gupta | ggml-cuda: add stricter checking for fusion (llama... |
| 2025-12-12 | Piotr Wilkin... | model : Qwen3 Next (llama/16095) |
| 2025-12-12 | Johannes Gäßler | CUDA: no FP16 arithmetic for vector FA kernel (llama... |
| 2025-12-12 | Jeff Bolz | vulkan: Implement GGML_OP_TRI (llama/17503) |
| 2025-12-12 | Radoslav Gerganov | rpc : cache and reuse compute graphs (llama/15405) |
| 2025-12-12 | yulo | HIP: enable mul_mat_f for RDNA4 (llama/17437) |
| 2025-12-12 | Piotr Wilkin... | SOLVE_TRI CUDA kernel for small matrices (llama/17457) |
| 2025-12-12 | Neo Zhang Jianyu | refactor pad_reflect_1d to make the UT case pass (llama... |
| 2025-12-12 | Jeff Bolz | vulkan: Implement SOLVE_TRI (llama/17486) |
| 2025-12-12 | matt23654 | cuda : fix UMA detection on discrete GPUs. (llama/17537) |
| 2025-12-12 | Alberto Cabrera... | ggml-cpu: aarm64: q4_K repack gemm and gemv implementat... |
| 2025-12-12 | Acly | vulkan : move contiguous checks to device_supports_op... |
| 2025-12-12 | Jeff Bolz | vulkan: use a fixed 1KB buffer for the add_rms_fusion... |
| 2025-12-12 | lhez | opencl: add sqr, sqrt, mean and ssm_conv (llama/17476) |