git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2025-10-29  Jeff Bolz  Rewrite simple-backend to use sched and ggml_backend_lo...
2025-10-22  Georgi Gerganov  sync : whisper.cpp
2025-10-21  Georgi Gerganov  sync : llama.cpp
2025-10-21  Aman Gupta  ggml: add ggml_can_fuse_subgraph (llama/16662)
2025-10-21  lhez  opencl: fix warnings and clean up profiling (llama...
2025-10-21  Jeff Bolz  vulkan: Handle FA with all -inf mask values (llama...
2025-10-21  YehuditE  sycl : add PAD_REFLECT_D1 operator support (llama/16145)
2025-10-21  Diego Devesa  ggml-alloc : fix leak when reusing a tensor with a...
2025-10-21  safranowith  SYCL: Add support for FLOOR,CEIL,ROUND and TRUNC unary...
2025-10-21  Aaron Teo  ci : fix binaries release failure for s390x (binaries...
2025-10-21  Johannes Gäßler  HIP: fix GPU_TARGETS (llama/16642)
2025-10-21  Jeff Bolz  vulkan: Implement topk_moe fused shader, ported from...
2025-10-21  Aman Gupta  CUDA: use registers instead of smem in topk-moe (llama...
2025-10-21  Shawn Gu  opencl: transposed gemm/gemv moe kernel with mxfp4...
2025-10-21  Radoslav Gerganov  rpc : report actual free memory (llama/16616)
2025-10-21  Giuseppe Scrivano  vulkan: Add State Space Model (SSM) Operations Support...
2025-10-21  muggle-stack  ggml : fix SpaceMit IME array out-of-bounds in task...
2025-10-21  Jeff Bolz  vulkan: fix debug build (add_rms_len/data not found...
2025-10-21  Ilia Ilmer  metal : add `CONV_TRANSPOSE_2D` (llama/16542)
2025-10-21  GittyBurstein  SYCL SET operator optimized for F32 tensors (llama...
2025-10-21  GittyBurstein  sycl : add ARANGE operator (llama/16362)
2025-10-21  Chenguang Li  CANN: format code using .clang-format (llama/15863)
2025-10-21  takuya kodama  ggml-cpu: replace putenv with setenv for const-correctn...
2025-10-21  yael-works  SYCL: Add GGML_OP_MEAN operator support (llama/16009)
2025-10-21  safranowith  cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators...
2025-10-21  lhez  opencl: add q8_0 mm support (llama/16469)
2025-10-21  lhez  opencl: fix FA for f32 (llama/16584)
2025-10-21  Sam/Samuel  metal: optimise `GGML_OP_SUM` (llama/16559)
2025-10-21  Julius Tischbein  CUDA: Changing the CUDA scheduling strategy to spin...
2025-10-21  Georgi Gerganov  metal : avoid using Metal's gpuAddress property (llama...
2025-10-14  Georgi Gerganov  sync : llama.cpp  upstream/latest upstream/0.9.4.58
2025-10-14  SavicStefan  vulkan: Add ACC_TYPE_VEC2 implementation (llama/16203)
2025-10-14  Aman Gupta  CUDA + openCL: fix bug in accessing rms_norm->src while...
2025-10-14  Jeff Bolz  vulkan: Support FA with K/V in F32 (llama/16543)
2025-10-14  Jeff Bolz  vulkan: Improve build time for MSVC (llama/16545)
2025-10-14  Johannes Gäßler  CUDA: enable FA for FP32 KV cache (llama/16546)
2025-10-14  Aman Gupta  CUDA: use fastdiv + ggml_cuda_mad for mmvf (llama/16557)
2025-10-14  Aman Gupta  CUDA: add fp kernel for larger batch size MoE (llama...
2025-10-14  Anav Prasad  cuda : remove legacy copy-op pointer indirection code...
2025-10-14  Georgi Gerganov  metal : FA support F32 K and V and head size = 32 ...
2025-10-14  lhez  opencl: fix build targeting CL 2 (llama/16554)
2025-10-14  Johannes Gäßler  CUDA: fix numerical issues in tile FA kernel (llama...
2025-10-14  Jie Fu (傅杰)  ggml : fix build broken with -march=armv9-a on MacOS...
2025-10-14  Chenguang Li  CANN: fix CPU memory leak in CANN backend (llama/16549)
2025-10-14  Sam/Samuel  metal: add support for opt_step_sgd (llama/16539)
2025-10-14  Georgi Gerganov  ggml : fix scalar path for computing norm (llama/16558)
2025-10-14  hipudding  CANN: Update several operators to support FP16 data...
2025-10-14  Sam/Samuel  metal : add opt_step_adamw and op_sum (llama/16529)
2025-10-14  Neo Zhang Jianyu  fix UT fault cases: count-equal, argsort, pad OPs ...
2025-10-14  sirus20x6  ggml : Fix FP16 ELU positive branch (llama/16519)
2025-10-14  sirus20x6  ggml: Correct SVE implementation in ggml_vec_dot_f16_un...
2025-10-14  Johannes Gäßler  CUDA: faster tile FA, add oob checks, more HSs (llama...
2025-10-12  Georgi Gerganov  sync : llama.cpp
2025-10-12  Georgi Gerganov  metal : fix mul-mm condition + fix mul-mv permuted...
2025-10-12  Diego Devesa  cuda : avoid initializing unused devices (llama/16510)
2025-10-12  Prajwal B Mehendarkar  cmake : Dont define XOPENSOURCE on AIX (llama/16481)
2025-10-12  duduta  cpu : optimize the ggml NORM operation (llama/15953)
2025-10-12  Chenguang Li  CANN: Improve ACL graph matching (llama/16166)
2025-10-12  Charles Xu  kleidiai: kernel interface refactoring (llama/16460)
2025-10-12  Neo Zhang Jianyu  refactor soft_max, add soft_max_back (llama/16472)
2025-10-12  ai-fonsi  Disable CUDA host buffers on integrated GPUs (llama...
2025-10-12  Georgi Gerganov  metal : mark FA blocks (llama/16372)
2025-10-12  Reese Levine  ggml webgpu: profiling, CI updates, reworking of comman...
2025-10-12  Georgi Gerganov  metal : add support for non-padded FA KV (llama/16148)
2025-10-12  Georgi Gerganov  tests : add -INF blocks to the KQ mask in the FA tests...
2025-10-12  Georgi Gerganov  metal : various optimizations + refactoring (llama...
2025-10-12  Georgi Gerganov  ggml : fix unaligned access in AMX code (llama/16315)
2025-10-12  Daniel Bevenius  ggml-cpu : fix leftover handling in ggml_vec_scale_f32...
2025-10-12  Reese Levine  ggml webgpu: actually add softmax, fix rms_norm offset...
2025-10-12  Eve  vulkan: use a more appropriate amount of threads when...
2025-10-12  Radoslav Gerganov  rpc : check src buffer when copying tensor (llama/16421)
2025-10-12  Radoslav Gerganov  rpc : add support for multiple devices (llama/16276)
2025-10-12  Georgi Gerganov  sync : llama.cpp
2025-10-12  Acly  vulkan : incremental shader builds (llama/16341)
2025-10-12  Georgi Gerganov  sync : llama.cpp
2025-10-12  Georgi Gerganov  metal : fix loop bound in ggml_mem_ranges (llama/16412)
2025-10-12  Acly  ggml : fix graph reallocation with multiple chunks...
2025-10-12  Jeff Bolz  vulkan: Replace uses of maxMemoryAllocationSize and...
2025-10-12  Jeff Bolz  vulkan: Fix FA coopmat1 invalid array indexing (llama...
2025-10-12  Jeff Bolz  vulkan: in flash attention, bounds check against nem1...
2025-10-12  Reese Levine  ggml webgpu: add support for soft_max, optimize rms_nor...
2025-10-12  Piotr Wilkin...  model : Apertus model implementation (llama/15852)
2025-10-12  R0CKSTAR  musa: update compile flags (llama/16265)
2025-10-12  uvos  HIP: Disable ROCWMMA fattn on CDNA when compiled agains...
2025-10-12  Eve  vulkan: make ggml_vk_default_dispatcher support older...
2025-10-12  lhez  opencl: support pad_ext (llama/15888)
2025-10-12  Reese Levine  ggml webgpu: support for rope,div,sub,glu,scale,cont...
2025-10-12  lhez  opencl: support ne3 in get_rows (llama/15866)
2025-09-30  Georgi Gerganov  ggml : bump version to 0.9.4 (#1363)  upstream/0.9.4 v0.9.4
2025-09-30  Georgi Gerganov  sync : whisper.cpp [no ci]
2025-09-30  Georgi Gerganov  sync : llama.cpp
2025-09-30  anavp-nvidia  cuda : Enable CUDA Graph usage for Nemotron Nano v2...
2025-09-30  Georgi Gerganov  metal : dynamic simdgroups for MV kernels (llama/16340)
2025-09-30  Charles Xu  kleidiai : fix work size and threads sync for fp16...
2025-09-30  Jeff Bolz  tests: override test_set_rows::max_nmse_err to allow...
2025-09-29  Georgi Gerganov  sync : llama.cpp
2025-09-29  alex-spacemit  ggml: riscv: add riscv spacemit backend (llama/15288)
2025-09-29  Rafal Lewczuk  ggml-backend : add root cause in error message if loadi...
2025-09-29  Georgi Gerganov  sync : whisper.cpp (#1359)
2025-09-29  Georgi Gerganov  ci : print results [no ci] (#1358)