git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2026-02-07  Oleksandr Kuvshynov  vulkan: fix GPU deduplication logic. (llama/19222)
2026-02-07  Jeff Bolz  vulkan: Set k_load_shmem to false when K is too large...
2026-02-07  Jeff Bolz  vulkan: fix non-contig rope (llama/19299)
2026-02-07  will-lms  metal : add missing includes (llama/19348)
2026-02-07  Georgi Gerganov  tests : add non-cont, inplace rope tests (llama/19296)
2026-02-07  Kevin Pouget  ggml-virtgpu: make the code thread safe (llama/19204)
2026-02-07  Aman Gupta  ggml-cpu: use LUT for converting e8->f32 scales on...
2026-02-07  Georgi Gerganov  metal : add solve_tri (llama/19302)
2026-02-07  Ruben Ortlam  vulkan: disable coopmat1 fa on Nvidia Turing (llama...
2026-02-07  Aman Gupta  CUDA: use mmvq for mul-mat-id for small batch sizes...
2026-02-07  Georgi Gerganov  metal : minor cleanup (llama/19251)
2026-02-07  Oliver Simons  CUDA: Fix loop unrolling for BW in mul_mat_q_stream_k_f...
2026-02-07  George  ggml: added cleanups in ggml_quantize_free (llama/19278)
2026-02-07  Gaurav Garg  cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until...
2026-02-07  lhez  opencl: refactor some ops, concat, repeat, tanh and...
2026-02-07  Aman Gupta  ggml-cpu: FA split across kv for faster TG (llama/19209)
2026-02-07  Neo Zhang  Remove support for Nvidia & AMD GPU, because the oneAPI...
2026-02-07  Tamar  sycl: implement GGML_OP_TOP_K (llama/19242)
2026-02-07  Georgi Gerganov  metal : support virtual devices (llama/18919)
2026-02-07  Johannes Gäßler  ggml-backend: fix async set/get fallback sync (llama...
2026-02-07  Christian Kastner  docs : Minor cleanups (llama/19252)
2026-02-07  Nikhil Jain  Remove pipeline cache mutexes (llama/19195)
2026-02-07  Max Krasnyansky  Bump cmake max version (needed for Windows on Snapdrago...
2026-02-07  nullname  ggml-hexagon: flash-attention and reduce-sum optimizati...
2026-02-07  shaofeiqi  opencl: add optimized q8_0 mm kernel for adreno (llama...
2026-02-07  Simon Redman  Correctly fetch q8_1 quantize pipeline in test as neede...
2026-02-07  Georgi Gerganov  tests : add GQA=20 FA test (llama/19095)
2026-02-07  Georgi Gerganov  ci : remove "Release" word from the title of the release
2026-02-07  Georgi Gerganov  ggml : bump version to 0.9.6 (#1423)  v0.9.6
2026-01-30  Georgi Gerganov  cmake : remove unused file (#1419)
2026-01-30  Georgi Gerganov  sync : whisper.cpp
2026-01-30  Georgi Gerganov  cuda : fix compile warnings (whisper/0)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  bssrdf  add tensor type checking as part of cuda graph properti...
2026-01-30  s8322  sycl: implement GGML_UNARY_OP_SOFTPLUS (llama/19114)
2026-01-30  RachelMantel  sycl: implement GGML_OP_TRI (llama/19089)
2026-01-30  Zheyuan Chen  ggml-webgpu: improve flashAttention performance by...
2026-01-30  Todor Boinovski  hexagon: enable offloading to Hexagon on Windows on...
2026-01-30  Georgi Gerganov  cuda : fix nkvo, offload and cuda graph node properties...
2026-01-30  yulo  HIP: add mmf for CDNA (llama/18896)
2026-01-30  Vishal Singh  ggml-zendnn : resolve ZenDNN backend cross-module symbo...
2026-01-30  Aman Gupta  CUDA: refactor topk-moe to enable more models (GLM...
2026-01-30  Neo Zhang  sycl: fix norm kernels: l2_norm, group_norm, rms_norm...
2026-01-30  Ruben Ortlam  Vulkan Flash Attention Coopmat1 Refactor (llama/19075)
2026-01-30  Patryk Kaminski  ggml-sycl: remove unused syclcompat header (llama/19140)
2026-01-30  Oleksandr Kuvshynov  vulkan: handle device dedup on MacOS + Vega II Duo...
2026-01-30  Kevin Pouget  ggml: new backend for Virglrenderer API Remoting accele...
2026-01-30  Alberto Cabrera...  ggml-cpu: arm64: Q4_K scale unroll and vectorization...
2026-01-30  Georgi Gerganov  cuda : fix "V is K view" check for non-unified KV cache...
2026-01-30  Georgi Gerganov  CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-30  Nikhil Jain  ggml webgpu: Split shared state (webgpu_context) into...
2026-01-30  Vishal Singh  ggml-zendnn : update ZenDNN git tag to main branch...
2026-01-30  Johannes Gäßler  CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-30  Alberto Cabrera...  ggml-cpu: aarm64: q6_K repack gemm and gemv (and generi...
2026-01-30  Gaurav Garg  Reduce CPU-side stalls due to the CUDA command buffer...
2026-01-30  shalinib-ibm  ggml-cpu: Enable FP16 MMA kernels on PPC (llama/19060)
2026-01-30  lhez  opencl: add flattened q6_K mv (llama/19054)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Johannes Gäßler  CUDA: fix padding of GQA to power of 2 in FA (llama...
2026-01-30  Johannes Gäßler  CUDA: faster FA for GQA > 1 but not power of 2 (llama...
2026-01-30  ccbinn  metal : fix recommendedMaxWorkingSetSize availability...
2026-01-30  Aman Gupta  ggml-cpu: Use tiled FA for prompt-processing (llama...
2026-01-30  Georgi Gerganov  kv-cache : support V-less cache (llama/19067)
2026-01-30  Johannes Gäßler  CUDA: re-use MLA K data for V in MMA FA (llama/19057)
2026-01-30  Aman Gupta  ggml-cuda: enable cuda-graphs for `n-cpu-moe` (llama...
2026-01-30  nullname  ggml-hexagon: flash-attn opt (llama/19025)
2026-01-30  Neo Zhang  use malloc to support both iGPU and dGPU in same time...
2026-01-30  Alberto Cabrera...  ggml-cpu: aarm64: q5_K repack gemm and gemv (and generi...
2026-01-30  Georgi Gerganov  mla : make the V tensor a view of K (llama/18986)
2026-01-30  Johannes Gäßler  CUDA: fix alignment check for FA (llama/19023)
2026-01-30  lhez  opencl: enable the general fp mm for non-cont input...
2026-01-30  Aman Gupta  CUDA: add gqa_ratio 4 for GLM 4.7 flash (llama/18953)
2026-01-30  shaofeiqi  opencl: add TRI op support (llama/18979)
2026-01-30  Aleksei Nikiforov  ggml-zdnn : mark zDNN buffers as non-host (llama/18967)
2026-01-30  Jeff Bolz  vulkan: Remove transfer_ctx, do everything in compute_c...
2026-01-30  Jeff Bolz  vulkan: support flash attention GQA/split_k with small...
2026-01-30  Masato Nakasaka  Revert "vulkan: force full subgroups for flash attentio...
2026-01-30  Jeff Bolz  vulkan: Use mul_mat_vec_id for small values of n (llama...
2026-01-30  Oliver Simons  CUDA: Fix builds for older CCCL versions by ifdefing...
2026-01-30  Oliver Simons  CUDA: Replace init_offsets kernel with iterators in...
2026-01-30  Adrien Gallouët  ggml : cleanup path_str() (llama/18928)
2026-01-30  Georgi Gerganov  metal : enable FA for MLA heads (llama/18950)
2026-01-30  Georgi Gerganov  ggml : add ggml_build_forward_select (llama/18550)
2026-01-30  lhez  opencl: fix q6_K mv for m=1 (llama/18893)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Reese Levine  ggml webgpu: support for backend sampling (llama/18880)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Thore Koritzius  ggml : extend ggml_pool_1d + metal (llama/16429)
2026-01-30  Perry Naseck  ggml-blas: hide warnings from included BLAS headers...
2026-01-30  Raul Torres  CANN: Remove unused `ggml_cann_get_device` function...
2026-01-30  Chenguang Li  CANN: fix an issue where get_env was not fully renamed...
2026-01-30  hipudding  CANN: support gated linear attn (llama/18653)
2026-01-30  shaofeiqi  OpenCL: add SOLVE_TRI op support (llama/18846)
2026-01-30  Georgi Gerganov  cuda : print less debug logs when disabling cuda graphs...
2026-01-30  Johannes Gäßler  CUDA: fix alignment on register spill for FA (llama...
2026-01-30  shalinib-ibm  ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (llama...
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Max Krasnyansky  hexagon: support for OP_CPY, host buffers now optional...
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Oliver Simons  CUDA: Factor out and re-use `block_reduce` function...