git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2025-11-09  Reese Levine  ggml webgpu: minor set rows optimization (llama/16810)
2025-11-09  Georgi Gerganov  sync : llama.cpp
2025-11-09  nullname  refactor: replace sprintf with snprintf for safer strin...
2025-11-09  Jeff Bolz  vulkan: remove the need for the dryrun (llama/16826)
2025-11-09  Acly  ggml-cpu : bicubic interpolation (llama/16891)
2025-11-09  Noah  Fix garbled output with REPACK at high thread counts...
2025-11-09  Aman Gupta  CUDA: avoid mul + bias fusion when doing fusion (llama...
2025-11-09  lhez  opencl: support imrope (llama/16914)
2025-11-09  theo77186  ggml: CUDA: add head size 72 for flash-attn (llama...
2025-11-09  Jinyang He  ggml : LoongArch fixes (llama/16958)
2025-11-09  shani-f  SYCL: optimized repeat_back kernel (3× fewer asm instru...
2025-11-09  Shagun Bera  test-backend-ops : fix segfault in moe-expert-reduce...
2025-11-09  Georgi Gerganov  clip : use FA (llama/16837)
2025-11-09  mnehete32  CUDA: add FLOOR, CEIL, ROUND, TRUNC unary ops (llama...
2025-11-09  Aaron Teo  ggml: add s390x cpu-feats (llama/16774)
2025-11-09  Jeff Bolz  vulkan: Fix multi_add invalid descriptor usage (llama...
2025-11-09  Jeff Bolz  vulkan: fuse mul_mat+add and mul_mat_id+add_id (llama...
2025-11-09  Oliver Simons  CUDA: Remove unneded bias/gate dims in fused mmvq ...
2025-11-09  Johannes Gäßler  CUDA: Volta tensor core support for MMF (llama/16843)
2025-11-04  Georgi Gerganov  ggml : fix conv2d_dw SVE path (#1380)
2025-11-01  Georgi Gerganov  sync : llama.cpp
2025-11-01  Aman Gupta  CUDA: add expert reduce kernel (llama/16857)
2025-11-01  Jeff Bolz  vulkan: disable spirv-opt for rope shaders (llama/16872)
2025-11-01  Masato Nakasaka  vulkan: Fix crash when FP16 mul_mat accumulation is...
2025-11-01  Ruben Ortlam  vulkan: fix shmem overrun in mmq id shader (llama/16873)
2025-11-01  l3utterfly  ggml-hexagon: respect input size when getting/setting...
2025-11-01  lhez  opencl: fix boundary handling for mul_mm (llama/16875)
2025-11-01  Max Krasnyansky  cpu: introduce chunking for repack matmuls and enable...
2025-11-01  JJJYmmm  model: add support for qwen3vl series (llama/16780)
2025-11-01  Max Krasnyansky  cpu: introduce chunking for flash attention (llama...
2025-11-01  Sigbjørn Skjæret  cuda : fix argsort with 64k+ rows (llama/16849)
2025-11-01  Jeff Bolz  vulkan: Handle argsort with a large number of rows...
2025-11-01  Oliver Simons  Hide latency of bias and gate-loading (llama/16847)
2025-11-01  Jeff Bolz  vulkan: Fuse rope+set_rows (llama/16769)
2025-11-01  Jeff Bolz  vulkan: Update topk_moe fusion to handle gpt's late...
2025-11-01  Ruben Ortlam  Vulkan MMQ Integer Dot Refactor and K-Quant support...
2025-11-01  Max Krasnyansky  Hexagon Op queue & dispatch optimizations (llama/16820)
2025-11-01  Aman Gupta  CUDA: use fastdiv in set-rows (llama/16834)
2025-11-01  Jeff Bolz  vulkan: Call ggml_vk_buffer_write_2d from ggml_vk_buffe...
2025-11-01  Aman Gupta  CUDA: Fix bug in topk-moe for gpt-oss (llama/16821)
2025-11-01  YaelLogic  sycl: add RMS_NORM_BACK operation support (llama/16808)
2025-11-01  YaelGitAccount  cuda: add SET operation support (llama/16804)
2025-11-01  l3utterfly  initialise buffer.device in ggml_hexagon_session (llama...
2025-11-01  Chenguang Li  CANN: Improve device ID handling and aclnnArange checks...
2025-11-01  Aman Gupta  CUDA: add unused vars to mmvf and mmvq (llama/16807)
2025-11-01  tamarPal  sycl: add SSM_CONV operation support (llama/16800)
2025-11-01  Acly  ggml : fix interpolate with align-corners and ne=1...
2025-11-01  Johannes Gäßler  HIP: fix AMDGPU_TARGETS, update documentation (llama...
2025-11-01  Aman Gupta  test-backend-ops: print failed tests at the end (llama...
2025-11-01  tamarPal  sycl: add ROLL operation support (llama/16665)
2025-11-01  shani-f  sycl: add REPEAT_BACK operation support (llama/16734)
2025-11-01  Aman Gupta  CUDA: support for weight clamp in top-k norm (llama...
2025-11-01  Acly  ggml-alloc : make gallocr prefer chunks that allow...
2025-11-01  Sigbjørn Skjæret  cuda : use fast copy when src and dst are of different...
2025-11-01  leejet  ggml: fix cuda kernel launch configuration for k_comput...
2025-11-01  Aman Gupta  CUDA: General GEMV fusion (llama/16715)
2025-11-01  Gilad S.  vulkan: deduplicate Microsoft Direct3D12 devices (llama...
2025-11-01  Giuseppe Scrivano  vulkan: delete dead code (llama/16732)
2025-11-01  Jeff Bolz  vulkan: Optimize SSM_SCAN (llama/16645)
2025-11-01  leejet  ggml: fix CUDA grid launch condition for large block_nu...
2025-11-01  Aman Gupta  CUDA: use CUB for arbitary size argsort (llama/16754)
2025-11-01  Aman Gupta  ggml-cuda: use passed ops instead of hardcoded ops...
2025-11-01  Matthew Michel  sycl: use async memory allocation to fix crashes during...
2025-11-01  Max Krasnyansky  Add experimental ggml-hexagon backend for the Hexagon...
2025-11-01  Diego Devesa  Revert "ggml : Leverage the existing GGML_F32_VEC helpe...
2025-11-01  sirus20x6  ggml : Leverage the existing GGML_F32_VEC helpers to...
2025-11-01  Aman Gupta  CUDA: fix bug in topk-moe softmax (llama/16711)
2025-11-01  Aman Gupta  CUDA: topk-moe: add optional parameter for gpt-oss...
2025-11-01  Johannes Gäßler  CUDA: better error for FA kernel with 0 occupancy ...
2025-10-29  Jeff Bolz  Rewrite simple-backend to use sched and ggml_backend_lo...
2025-10-22  Georgi Gerganov  sync : whisper.cpp
2025-10-21  Georgi Gerganov  sync : llama.cpp
2025-10-21  Aman Gupta  ggml: add ggml_can_fuse_subgraph (llama/16662)
2025-10-21  lhez  opencl: fix warnings and clean up profiling (llama...
2025-10-21  Jeff Bolz  vulkan: Handle FA with all -inf mask values (llama...
2025-10-21  YehuditE  sycl : add PAD_REFLECT_D1 operator support (llama/16145)
2025-10-21  Diego Devesa  ggml-alloc : fix leak when reusing a tensor with a...
2025-10-21  safranowith  SYCL: Add support for FLOOR,CEIL,ROUND and TRUNC unary...
2025-10-21  Aaron Teo  ci : fix binaries release failure for s390x (binaries...
2025-10-21  Johannes Gäßler  HIP: fix GPU_TARGETS (llama/16642)
2025-10-21  Jeff Bolz  vulkan: Implement topk_moe fused shader, ported from...
2025-10-21  Aman Gupta  CUDA: use registers instead of smem in topk-moe (llama...
2025-10-21  Shawn Gu  opencl: transposed gemm/gemv moe kernel with mxfp4...
2025-10-21  Radoslav Gerganov  rpc : report actual free memory (llama/16616)
2025-10-21  Giuseppe Scrivano  vulkan: Add State Space Model (SSM) Operations Support...
2025-10-21  muggle-stack  ggml : fix SpaceMit IME array out-of-bounds in task...
2025-10-21  Jeff Bolz  vulkan: fix debug build (add_rms_len/data not found...
2025-10-21  Ilia Ilmer  metal : add `CONV_TRANSPOSE_2D` (llama/16542)
2025-10-21  GittyBurstein  SYCL SET operator optimized for F32 tensors (llama...
2025-10-21  GittyBurstein  sycl : add ARANGE operator (llama/16362)
2025-10-21  Chenguang Li  CANN: format code using .clang-format (llama/15863)
2025-10-21  takuya kodama  ggml-cpu: replace putenv with setenv for const-correctn...
2025-10-21  yael-works  SYCL: Add GGML_OP_MEAN operator support (llama/16009)
2025-10-21  safranowith  cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators...
2025-10-21  lhez  opencl: add q8_0 mm support (llama/16469)
2025-10-21  lhez  opencl: fix FA for f32 (llama/16584)
2025-10-21  Sam/Samuel  metal: optimise `GGML_OP_SUM` (llama/16559)
2025-10-21  Julius Tischbein  CUDA: Changing the CUDA scheduling strategy to spin...
2025-10-21  Georgi Gerganov  metal : avoid using Metal's gpuAddress property (llama...
2025-10-14  Georgi Gerganov  sync : llama.cpp  (upstream/latest, upstream/0.9.4.58)