git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2026-02-14  Max Krasnyansky  hexagon: Add ARGSORT, DIV, SQR, SQRT, SUM_ROWS, GEGLU...
2026-02-14  Georgi Gerganov  ggml : extend bin bcast for permuted src1 (llama/19484)
2026-02-14  Georgi Gerganov  metal : consolidate unary ops (llama/19490)
2026-02-14  Oliver Simons  CUDA : Update CCCL-tag for 3.2 to final release from...
2026-02-14  Nikhil Jain  Plug memory leaks and free resources on shutdown (llama...
2026-02-14  Xuan-Son Nguyen  test: fix IMROPE perf test case (llama/19465)
2026-02-14  Alberto Cabrera...  ggml-cpu: arm64: q6_K repack gemm and gemv (and generic...
2026-02-14  k4ss4n  ggml : use noexcept overload for is_regular_file in...
2026-02-14  Raul Torres  CANN: Remove unnecessary wrapper for `gml_backend_buft_...
2026-02-14  hipudding  CANN: implement quantized MUL_MAT_ID for MoE models...
2026-02-14  Georgi Gerganov  cuda : extend GGML_OP_PAD to work with non-cont src0...
2026-02-14  Oliver Simons  CUDA: Fix non-contig rope (llama/19338)
2026-02-07  Georgi Gerganov  sync : llama.cpp
2026-02-07  Georgi Gerganov  metal : consolidate bin kernels (llama/19390)
2026-02-07  Georgi Gerganov  metal : fix event synchronization in cpy_tensor_async...
2026-02-07  Abhijit Ramesh  ggml-webgpu: JIT compile binary operators and handle...
2026-02-07  Georgi Gerganov  sync : llama.cpp
2026-02-07  Nechama Krashinski  sycl: add F16 support for GGML_OP_CEIL (llama/19306)
2026-02-07  Jeff Bolz  tests: reduce number of FA test permutations (llama...
2026-02-07  Jeff Bolz  vulkan: For coopmat2 FA, use fp16 accumulators for...
2026-02-07  Jeff Bolz  vulkan: make FA mask/softcap enables spec constants...
2026-02-07  Georgi Gerganov  metal : skip loading all-zero mask (llama/19337)
2026-02-07  Georgi Gerganov  cuda : cuda graphs now compare all node params (llama...
2026-02-07  Georgi Gerganov  metal : adaptive CPU/GPU interleave based on number...
2026-02-07  Jeff Bolz  vulkan: Preprocess FA mask to detect all-neg-inf and...
2026-02-07  Georgi Gerganov  metal : add diag (llama/19330)
2026-02-07  Oleksandr Kuvshynov  vulkan: fix GPU deduplication logic. (llama/19222)
2026-02-07  Jeff Bolz  vulkan: Set k_load_shmem to false when K is too large...
2026-02-07  Jeff Bolz  vulkan: fix non-contig rope (llama/19299)
2026-02-07  will-lms  metal : add missing includes (llama/19348)
2026-02-07  Georgi Gerganov  tests : add non-cont, inplace rope tests (llama/19296)
2026-02-07  Kevin Pouget  ggml-virtgpu: make the code thread safe (llama/19204)
2026-02-07  Aman Gupta  ggml-cpu: use LUT for converting e8->f32 scales on...
2026-02-07  Georgi Gerganov  metal : add solve_tri (llama/19302)
2026-02-07  Ruben Ortlam  vulkan: disable coopmat1 fa on Nvidia Turing (llama...
2026-02-07  Aman Gupta  CUDA: use mmvq for mul-mat-id for small batch sizes...
2026-02-07  Georgi Gerganov  metal : minor cleanup (llama/19251)
2026-02-07  Oliver Simons  CUDA: Fix loop unrolling for BW in mul_mat_q_stream_k_f...
2026-02-07  George  ggml: added cleanups in ggml_quantize_free (llama/19278)
2026-02-07  Gaurav Garg  cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until...
2026-02-07  lhez  opencl: refactor some ops, concat, repeat, tanh and...
2026-02-07  Aman Gupta  ggml-cpu: FA split across kv for faster TG (llama/19209)
2026-02-07  Neo Zhang  Remove support for Nvidia & AMD GPU, because the oneAPI...
2026-02-07  Tamar  sycl: implement GGML_OP_TOP_K (llama/19242)
2026-02-07  Georgi Gerganov  metal : support virtual devices (llama/18919)
2026-02-07  Johannes Gäßler  ggml-backend: fix async set/get fallback sync (llama...
2026-02-07  Christian Kastner  docs : Minor cleanups (llama/19252)
2026-02-07  Nikhil Jain  Remove pipeline cache mutexes (llama/19195)
2026-02-07  Max Krasnyansky  Bump cmake max version (needed for Windows on Snapdrago...
2026-02-07  nullname  ggml-hexagon: flash-attention and reduce-sum optimizati...
2026-02-07  shaofeiqi  opencl: add optimized q8_0 mm kernel for adreno (llama...
2026-02-07  Simon Redman  Correctly fetch q8_1 quantize pipeline in test as neede...
2026-02-07  Georgi Gerganov  tests : add GQA=20 FA test (llama/19095)
2026-02-07  Georgi Gerganov  ci : remove "Release" word from the title of the release
2026-02-07  Georgi Gerganov  ggml : bump version to 0.9.6 (#1423) v0.9.6
2026-01-30  Georgi Gerganov  cmake : remove unused file (#1419)
2026-01-30  Georgi Gerganov  sync : whisper.cpp
2026-01-30  Georgi Gerganov  cuda : fix compile warnings (whisper/0)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  bssrdf  add tensor type checking as part of cuda graph properti...
2026-01-30  s8322  sycl: implement GGML_UNARY_OP_SOFTPLUS (llama/19114)
2026-01-30  RachelMantel  sycl: implement GGML_OP_TRI (llama/19089)
2026-01-30  Zheyuan Chen  ggml-webgpu: improve flastAttention performance by...
2026-01-30  Todor Boinovski  hexagon: enable offloading to Hexagon on Windows on...
2026-01-30  Georgi Gerganov  cuda : fix nkvo, offload and cuda graph node properties...
2026-01-30  yulo  HIP: add mmf for CDNA (llama/18896)
2026-01-30  Vishal Singh  ggml-zendnn : resolve ZenDNN backend cross-module symbo...
2026-01-30  Aman Gupta  CUDA: refactor topk-moe to enable more models (GLM...
2026-01-30  Neo Zhang  sycl: fix norm kernels: l2_norm, group_norm, rms_norm...
2026-01-30  Ruben Ortlam  Vulkan Flash Attention Coopmat1 Refactor (llama/19075)
2026-01-30  Patryk Kaminski  ggml-sycl: remove unused syclcompat header (llama/19140)
2026-01-30  Oleksandr Kuvshynov  vulkan: handle device dedup on MacOS + Vega II Duo...
2026-01-30  Kevin Pouget  ggml: new backend for Virglrenderer API Remoting accele...
2026-01-30  Alberto Cabrera...  ggml-cpu: arm64: Q4_K scale unroll and vectorization...
2026-01-30  Georgi Gerganov  cuda : fix "V is K view" check for non-unified KV cache...
2026-01-30  Georgi Gerganov  CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-30  Nikhil Jain  ggml webgpu: Split shared state (webgpu_context) into...
2026-01-30  Vishal Singh  ggml-zendnn : update ZenDNN git tag to main branch...
2026-01-30  Johannes Gäßler  CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-30  Alberto Cabrera...  ggml-cpu: aarm64: q6_K repack gemm and gemv (and generi...
2026-01-30  Gaurav Garg  Reduce CPU-side stalls due to the CUDA command buffer...
2026-01-30  shalinib-ibm  ggml-cpu: Enable FP16 MMA kernels on PPC (llama/19060)
2026-01-30  lhez  opencl: add flattened q6_K mv (llama/19054)
2026-01-30  Georgi Gerganov  sync : llama.cpp
2026-01-30  Johannes Gäßler  CUDA: fix padding of GQA to power of 2 in FA (llama...
2026-01-30  Johannes Gäßler  CUDA: faster FA for GQA > 1 but not power of 2 (llama...
2026-01-30  ccbinn  metal : fix recommendedMaxWorkingSetSize availability...
2026-01-30  Aman Gupta  ggml-cpu: Use tiled FA for prompt-processing (llama...
2026-01-30  Georgi Gerganov  kv-cache : support V-less cache (llama/19067)
2026-01-30  Johannes Gäßler  CUDA: re-use MLA K data for V in MMA FA (llama/19057)
2026-01-30  Aman Gupta  ggml-cuda: enable cuda-graphs for `n-cpu-moe` (llama...
2026-01-30  nullname  ggml-hexagon: flash-attn opt (llama/19025)
2026-01-30  Neo Zhang  use malloc to support both iGPU and dGPU in same time...
2026-01-30  Alberto Cabrera...  ggml-cpu: aarm64: q5_K repack gemm and gemv (and generi...
2026-01-30  Georgi Gerganov  mla : make the V tensor a view of K (llama/18986)
2026-01-30  Johannes Gäßler  CUDA: fix alignment check for FA (llama/19023)
2026-01-30  lhez  opencl: enable the general fp mm for non-cont input...
2026-01-30  Aman Gupta  CUDA: add gqa_ratio 4 for GLM 4.7 flash (llama/18953)
2026-01-30  shaofeiqi  opencl: add TRI op support (llama/18979)
2026-01-30  Aleksei Nikiforov  ggml-zdnn : mark zDNN buffers as non-host (llama/18967)