| 2025-10-15 | Johannes Gäßler | CUDA: faster tile FA, add oob checks, more HSs (llama... |
| 2025-10-12 | Georgi Gerganov | release : v1.8.1 |
| 2025-10-12 | Georgi Gerganov | bench : update [no ci] |
| 2025-10-12 | Georgi Gerganov | talk-llama : sync llama.cpp |
| 2025-10-12 | Georgi Gerganov | sync : ggml |
| 2025-10-12 | Georgi Gerganov | metal : fix mul-mm condition + fix mul-mv permuted... |
| 2025-10-12 | Diego Devesa | cuda : avoid initializing unused devices (llama/16510) |
| 2025-10-12 | Prajwal B Mehendarkar | cmake : Dont define XOPENSOURCE on AIX (llama/16481) |
| 2025-10-12 | duduta | cpu : optimize the ggml NORM operation (llama/15953) |
| 2025-10-12 | Chenguang Li | CANN: Improve ACL graph matching (llama/16166) |
| 2025-10-12 | Charles Xu | kleidiai: kernel interface refactoring (llama/16460) |
| 2025-10-12 | Neo Zhang Jianyu | refactor soft_max, add soft_max_back (llama/16472) |
| 2025-10-12 | ai-fonsi | Disable CUDA host buffers on integrated GPUs (llama... |
| 2025-10-12 | Georgi Gerganov | metal : mark FA blocks (llama/16372) |
| 2025-10-12 | Reese Levine | ggml webgpu: profiling, CI updates, reworking of comman... |
| 2025-10-12 | Georgi Gerganov | metal : add support for non-padded FA KV (llama/16148) |
| 2025-10-12 | Georgi Gerganov | tests : add -INF blocks to the KQ mask in the FA tests... |
| 2025-10-12 | Georgi Gerganov | metal : various optimizations + refactoring (llama... |
| 2025-10-12 | Georgi Gerganov | ggml : fix unaligned access in AMX code (llama/16315) |
| 2025-10-12 | Daniel Bevenius | ggml-cpu : fix leftover handling in ggml_vec_scale_f32... |
| 2025-10-12 | Reese Levine | ggml webgpu: actually add softmax, fix rms_norm offset... |
| 2025-10-12 | Eve | vulkan: use a more appropriate amount of threads when... |
| 2025-10-12 | Radoslav Gerganov | rpc : check src buffer when copying tensor (llama/16421) |
| 2025-10-12 | Radoslav Gerganov | rpc : add support for multiple devices (llama/16276) |
| 2025-10-12 | Acly | vulkan : incremental shader builds (llama/16341) |
| 2025-10-12 | Georgi Gerganov | metal : fix loop bound in ggml_mem_ranges (llama/16412) |
| 2025-10-12 | Acly | ggml : fix graph reallocation with multiple chunks... |
| 2025-10-12 | Jeff Bolz | vulkan: Replace uses of maxMemoryAllocationSize and... |
| 2025-10-12 | Jeff Bolz | vulkan: Fix FA coopmat1 invalid array indexing (llama... |
| 2025-10-12 | Jeff Bolz | vulkan: in flash attention, bounds check against nem1... |
| 2025-10-12 | Reese Levine | ggml webgpu: add support for soft_max, optimize rms_nor... |
| 2025-10-12 | Piotr Wilkin... | model : Apertus model implementation (llama/15852) |
| 2025-10-12 | R0CKSTAR | musa: update compile flags (llama/16265) |
| 2025-10-12 | uvos | HIP: Disable ROCWMMA fattn on CDNA when compiled agains... |
| 2025-10-12 | Eve | vulkan: make ggml_vk_default_dispatcher support older... |
| 2025-10-12 | lhez | opencl: support pad_ext (llama/15888) |
| 2025-10-12 | Reese Levine | ggml webgpu: support for rope,div,sub,glu,scale,cont... |
| 2025-10-12 | lhez | opencl: support ne3 in get_rows (llama/15866) |
| 2025-10-11 | Ruben Ortlam | whisper : Support using devices of type iGPU (#3469) |
| 2025-10-10 | Andreas Lubbe | whisper : add support for --carry-initial-prompt (... |
| 2025-10-10 | Andreas Lubbe | cli: Fix assignment for vad_min_silence_duration_ms... |
| 2025-10-10 | Georgi Gerganov | minor : fix code style (#3463) |
| 2025-10-10 | Silviu Caragea | vad : free vad_segments in whisper_vad (#3463) |
| 2025-10-09 | Georgi Gerganov | whisper : clean-up headers |
| 2025-10-08 | KITAITI Makoto | [skip ci]Bump Ruby bindings' version to 1.3.4 (#3461) |
| 2025-10-06 | Daniel Bevenius | vad : fix memory leaks in VAD implementation (#3453) |
| 2025-10-01 | KITAITI Makoto | ruby : Loose RegExp for test (#3448) |
| 2025-10-01 | Daniel Bevenius | bindings-java : disable flash attention by default... |
| 2025-09-30 | Georgi Gerganov | bench : update [no ci] |
| 2025-09-30 | Georgi Gerganov | scripts : add -nfa option [no ci] |
| 2025-09-30 | Georgi Gerganov | wchess : fix link [no ci] |
| 2025-09-30 | Georgi Gerganov | release : v1.8.0 upstream/1.8.0 |
| 2025-09-30 | Daniel Bevenius | examples : add wchess.wasm to wasm examples build ... |
| 2025-09-30 | Georgi Gerganov | whisper : enable flash attention by default (#3441) |
| 2025-09-30 | Georgi Gerganov | bench : add rtx 5090 [no ci] |
| 2025-09-30 | Georgi Gerganov | ggml : bump version to 0.9.4 (ggml/1363) |
| 2025-09-30 | Georgi Gerganov | bench : update [no ci] |
| 2025-09-30 | Georgi Gerganov | sync : ggml |
| 2025-09-30 | anavp-nvidia | cuda : Enable CUDA Graph usage for Nemotron Nano v2... |
| 2025-09-30 | Georgi Gerganov | metal : dynamic simdgroups for MV kernels (llama/16340) |
| 2025-09-30 | Charles Xu | kleidiai : fix work size and threads sync for fp16... |
| 2025-09-30 | alex-spacemit | ggml: riscv: add riscv spacemit backend (llama/15288) |
| 2025-09-30 | Rafal Lewczuk | ggml-backend : add root cause in error message if loadi... |
| 2025-09-29 | Georgi Gerganov | bench : update [no ci] (#3439) |
| 2025-09-29 | Georgi Gerganov | bench : warm-up all kernels (#3438) |
| 2025-09-29 | Georgi Gerganov | ggml : remove oboslete files (#0) |
| 2025-09-29 | Georgi Gerganov | ci : add self-hosted workflows (#3437) |
| 2025-09-29 | Georgi Gerganov | whisper : remove ggml_mul_mat padding (#3436) |
| 2025-09-29 | Georgi Gerganov | talk-llama : sync llama.cpp |
| 2025-09-29 | Georgi Gerganov | sync : ggml |
| 2025-09-29 | Georgi Gerganov | cmake : remove metal flag (llama/0) |
| 2025-09-29 | Sigbjørn Skjæret | ggml : check cuda and metal argsort limits and add... |
| 2025-09-29 | Georgi Gerganov | ggml : fix dependencies for ggml_set_rows (llama/16318) |
| 2025-09-29 | Jeff Bolz | vulkan: Fix validation failure in quantized flash atten... |
| 2025-09-29 | Sigbjørn Skjæret | ggml : fix GGML_F32_VEC_FMA argument order in ggml_vec_... |
| 2025-09-29 | Jeff Bolz | vulkan: 64-bit im2col (llama/16135) |
| 2025-09-29 | Georgi Gerganov | metal : extend mat-mat multiplication support (llama... |
| 2025-09-29 | Georgi Gerganov | metal : fuse non-sequential nodes (llama/16102) |
| 2025-09-29 | Jeff Bolz | vulkan: handle mat_mul with A matrix > 4GB (llama/16176) |
| 2025-09-29 | Jeff Bolz | vulkan: support arbitrary KV dimension in flash attenti... |
| 2025-09-29 | Acly | vulkan : make the vulkan.hpp dynamic dispatcher instanc... |
| 2025-09-29 | Aman Gupta | CUDA: mul_mat_id for mmf for bs <= 64 for f16 and bs... |
| 2025-09-29 | Johannes Gäßler | CUDA: refactor and deduplicate vector FA kernels (llama... |
| 2025-09-29 | Dmytro Minochkin | vulkan: throw system error instead of SIGABRT during... |
| 2025-09-29 | Jeff Bolz | vulkan: support GET_ROWS for k-quants (llama/16235) |
| 2025-09-29 | Aaron Teo | devops: add s390x & ppc64le CI (llama/15925) |
| 2025-09-29 | Georgi Gerganov | metal : report OOM errors (llama/16274) |
| 2025-09-29 | Adrien Gallouët | common : use cpp-httplib as a cURL alternative for... |
| 2025-09-29 | Aaron Teo | ggml-cpu: implement MXFP4 SIMD for s390x (llama/16193) |
| 2025-09-29 | R0CKSTAR | musa: fix build warnings (llama/15611) |
| 2025-09-29 | Aman Gupta | CUDA: add a fused top-K MoE kernel (llama/16130) |
| 2025-09-29 | junchao-zhao | ggml : fix loongarch lsx compilation error (llama/15864) |
| 2025-09-29 | Daniel Bevenius | ggml : remove -dev suffix from release version (ggml... |
| 2025-09-29 | Daniel Bevenius | ggml : bump version to 0.9.3 (ggml/1353) |
| 2025-09-29 | Georgi Gerganov | metal : fuse NORM + MUL + ADD, support non-multiples... |
| 2025-09-29 | Georgi Gerganov | metal : relax reorder conditions (llama/16216) |
| 2025-09-29 | Georgi Gerganov | metal : restore im2col perf (llama/16219) |
| 2025-09-29 | Radoslav Gerganov | rpc : use ggml logging facilities |
| 2025-09-29 | Johannes Gäßler | llama: print memory breakdown on exit (llama/15860) |
| 2025-09-29 | Acly | ggml : split graph allocations according to backend... |