git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
pkg/ggml/sources/ggml
2026-02-07  Nikhil Jain  Remove pipeline cache mutexes (llama/19195)
2026-02-07  Max Krasnyansky  Bump cmake max version (needed for Windows on Snapdrago...
2026-02-07  nullname  ggml-hexagon: flash-attention and reduce-sum optimizati...
2026-02-07  shaofeiqi  opencl: add optimized q8_0 mm kernel for adreno (llama...
2026-02-07  Simon Redman  Correctly fetch q8_1 quantize pipeline in test as neede...
2026-02-07  Georgi Gerganov  tests : add GQA=20 FA test (llama/19095)
2026-02-07  Georgi Gerganov  ci : remove "Release" word from the title of the release
2026-02-07  Georgi Gerganov  ggml : bump version to 0.9.6 (#1423) v0.9.6
2026-01-30  Georgi Gerganov  cmake : remove unused file (#1419)
2026-01-30  Georgi Gerganov  sync : whisper.cpp
2026-01-30  Georgi Gerganov  cuda : fix compile warnings (whisper/0)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  bssrdf  add tensor type checking as part of cuda graph properti...
2026-01-30  s8322  sycl: implement GGML_UNARY_OP_SOFTPLUS (llama/19114)
2026-01-30  RachelMantel  sycl: implement GGML_OP_TRI (llama/19089)
2026-01-30  Zheyuan Chen  ggml-webgpu: improve flastAttention performance by...
2026-01-30  Todor Boinovski  hexagon: enable offloading to Hexagon on Windows on...
2026-01-30  Georgi Gerganov  cuda : fix nkvo, offload and cuda graph node properties...
2026-01-30  yulo  HIP: add mmf for CDNA (llama/18896)
2026-01-30  Vishal Singh  ggml-zendnn : resolve ZenDNN backend cross-module symbo...
2026-01-30  Aman Gupta  CUDA: refactor topk-moe to enable more models (GLM...
2026-01-30  Neo Zhang  sycl: fix norm kernels: l2_norm, group_norm, rms_norm...
2026-01-30  Ruben Ortlam  Vulkan Flash Attention Coopmat1 Refactor (llama/19075)
2026-01-30  Patryk Kaminski  ggml-sycl: remove unused syclcompat header (llama/19140)
2026-01-30  Oleksandr Kuvshynov  vulkan: handle device dedup on MacOS + Vega II Duo...
2026-01-30  Kevin Pouget  ggml: new backend for Virglrenderer API Remoting accele...
2026-01-30  Alberto Cabrera...  ggml-cpu: arm64: Q4_K scale unroll and vectorization...
2026-01-30  Georgi Gerganov  cuda : fix "V is K view" check for non-unified KV cache...
2026-01-30  Georgi Gerganov  CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-30  Nikhil Jain  ggml webgpu: Split shared state (webgpu_context) into...
2026-01-30  Vishal Singh  ggml-zendnn : update ZenDNN git tag to main branch...
2026-01-30  Johannes Gäßler  CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-30  Alberto Cabrera...  ggml-cpu: aarm64: q6_K repack gemm and gemv (and generi...
2026-01-30  Gaurav Garg  Reduce CPU-side stalls due to the CUDA command buffer...
2026-01-30  shalinib-ibm  ggml-cpu: Enable FP16 MMA kernels on PPC (llama/19060)
2026-01-30  lhez  opencl: add flattened q6_K mv (llama/19054)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Johannes Gäßler  CUDA: fix padding of GQA to power of 2 in FA (llama...
2026-01-30  Johannes Gäßler  CUDA: faster FA for GQA > 1 but not power of 2 (llama...
2026-01-30  ccbinn  metal : fix recommendedMaxWorkingSetSize availability...
2026-01-30  Aman Gupta  ggml-cpu: Use tiled FA for prompt-processing (llama...
2026-01-30  Georgi Gerganov  kv-cache : support V-less cache (llama/19067)
2026-01-30  Johannes Gäßler  CUDA: re-use MLA K data for V in MMA FA (llama/19057)
2026-01-30  Aman Gupta  ggml-cuda: enable cuda-graphs for `n-cpu-moe` (llama...
2026-01-30  nullname  ggml-hexagon: flash-attn opt (llama/19025)
2026-01-30  Neo Zhang  use malloc to support both iGPU and dGPU in same time...
2026-01-30  Alberto Cabrera...  ggml-cpu: aarm64: q5_K repack gemm and gemv (and generi...
2026-01-30  Georgi Gerganov  mla : make the V tensor a view of K (llama/18986)
2026-01-30  Johannes Gäßler  CUDA: fix alignment check for FA (llama/19023)
2026-01-30  lhez  opencl: enable the general fp mm for non-cont input...
2026-01-30  Aman Gupta  CUDA: add gqa_ratio 4 for GLM 4.7 flash (llama/18953)
2026-01-30  shaofeiqi  opencl: add TRI op support (llama/18979)
2026-01-30  Aleksei Nikiforov  ggml-zdnn : mark zDNN buffers as non-host (llama/18967)
2026-01-30  Jeff Bolz  vulkan: Remove transfer_ctx, do everything in compute_c...
2026-01-30  Jeff Bolz  vulkan: support flash attention GQA/split_k with small...
2026-01-30  Masato Nakasaka  Revert "vulkan: force full subgroups for flash attentio...
2026-01-30  Jeff Bolz  vulkan: Use mul_mat_vec_id for small values of n (llama...
2026-01-30  Oliver Simons  CUDA: Fix builds for older CCCL versions by ifdefing...
2026-01-30  Oliver Simons  CUDA: Replace init_offsets kernel with iterators in...
2026-01-30  Adrien Gallouët  ggml : cleanup path_str() (llama/18928)
2026-01-30  Georgi Gerganov  metal : enable FA for MLA heads (llama/18950)
2026-01-30  Georgi Gerganov  ggml : add ggml_build_forward_select (llama/18550)
2026-01-30  lhez  opencl: fix q6_K mv for m=1 (llama/18893)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Reese Levine  ggml webgpu: support for backend sampling (llama/18880)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Thore Koritzius  ggml : extend ggml_pool_1d + metal (llama/16429)
2026-01-30  Perry Naseck  ggml-blas: hide warnings from included BLAS headers...
2026-01-30  Raul Torres  CANN: Remove unused `ggml_cann_get_device` function...
2026-01-30  Chenguang Li  CANN: fix an issue where get_env was not fully renamed...
2026-01-30  hipudding  CANN: support gated linear attn (llama/18653)
2026-01-30  shaofeiqi  OpenCL: add SOLVE_TRI op support (llama/18846)
2026-01-30  Georgi Gerganov  cuda : print less debug logs when disabling cuda graphs...
2026-01-30  Johannes Gäßler  CUDA: fix allignment on register spill for FA (llama...
2026-01-30  shalinib-ibm  ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (llama...
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Max Krasnyansky  hexagon: support for OP_CPY, host buffers now optional...
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Oliver Simons  CUDA: Factor out and re-use `block_reduce` function...
2026-01-30  Jeff Bolz  vulkan: Check maxStorageBufferRange in supports_op...
2026-01-30  Daniel Bevenius  CUDA : fix typo in clang pragma comment [no ci] (llama...
2026-01-30  Ruben Ortlam  vulkan: work around Intel fp16 bug in mmq (llama/18814)
2026-01-30  Perry Naseck  ggml-metal: do not copy headers for embedded, use curre...
2026-01-30  yulo  HIP: add fattn-mma-f16 for RDNA4 (llama/18481)
2026-01-13  Georgi Gerganov  sync : llama.cpp
2026-01-13  Georgi Gerganov  CUDA : fix unused argument when USE_CUDA_GRAPH=OFF...
2026-01-13  Jeff Bolz  vulkan: change memory_logger to be controlled by an...
2026-01-13  Jeff Bolz  vulkan: Use VK_EXT_shader_64bit_indexing to handle...
2026-01-13  Ruben Ortlam  vulkan: Disable large coopmat matmul configuration...
2026-01-13  Ruben Ortlam  Vulkan: Optimize Matmul parameters for AMD GPUs with...
2026-01-11  Georgi Gerganov  sync : llma.cpp
2026-01-11  shaofeiqi  opencl: add SOFTPLUS op support (llama/18726)
2026-01-11  Aman Gupta  test-backend-ops: fix mxfp4 tests on blackwell (llama...
2026-01-11  Johannes Gäßler  HIP: adjust RDNA3.5 MMQ kernel selction logic (llama...
2026-01-11  Perry Naseck  cmake : update blas logic (llama/18205)
2026-01-11  Michael Wand  Corrected: changed s13 = src1->nb[3] instead of nb...
2026-01-11  shaofeiqi  opencl: add EXPM1 op (llama/18704)
2026-01-11  Reese Levine  Updates to webgpu get_memory (llama/18707)
2026-01-11  Georgi Gerganov  sync : llama.cpp
2026-01-11  Aaron Teo  llama: use host memory if device reports 0 memory ...