git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2025-12-11 ixgbe ggml: replace hwcap with riscv_hwprobe for RVV detectio...
2025-12-11 Ruben Ortlam Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support...
2025-12-11 Jeff Bolz vulkan: improve topk perf for large k, fix overflow...
2025-12-11 Diego Devesa ggml : add GGML_SCHED_NO_REALLOC option to disable...
2025-12-11 R0CKSTAR enable fp16/fast_fp16/bf16_mma on PH1 (llama/17551)
2025-12-11 Aman Gupta ggml-cuda: add stricter checking for fusion (llama...
2025-12-11 Piotr Wilkin... model : Qwen3 Next (llama/16095)
2025-12-11 Johannes Gäßler CUDA: no FP16 arithmetic for vector FA kernel (llama...
2025-12-11 Jeff Bolz vulkan: Implement GGML_OP_TRI (llama/17503)
2025-12-11 Radoslav Gerganov rpc : cache and reuse compute graphs (llama/15405)
2025-12-11 yulo HIP: enable mul_mat_f for RDNA4 (llama/17437)
2025-12-11 Piotr Wilkin... SOLVE_TRI CUDA kernel for small matrices (llama/17457)
2025-12-11 Neo Zhang Jianyu refactor pad_reflect_1d to make the UT case pass (llama...
2025-12-11 Jeff Bolz vulkan: Implement SOLVE_TRI (llama/17486)
2025-12-11 matt23654 cuda : fix UMA detection on discrete GPUs. (llama/17537)
2025-12-11 Alberto Cabrera... ggml-cpu: aarm64: q4_K repack gemm and gemv implementat...
2025-12-11 Acly vulkan : move contiguous checks to device_supports_op...
2025-12-11 Jeff Bolz vulkan: use a fixed 1KB buffer for the add_rms_fusion...
2025-12-11 lhez opencl: add sqr, sqrt, mean and ssm_conv (llama/17476)
2025-12-11 Alberto Cabrera... Fix chunks being too small with small matrix sizes...
2025-12-11 Jeff Bolz vulkan: allow graph_optimize for prompt processing...
2025-12-11 Jeff Bolz vulkan: Implement top-k (llama/17418)
2025-12-11 xctan ggml-cpu : add RISC-V Zvfh impl for ggml_vec_mad_f16...
2025-12-11 Adrien Gallouët ggml : fix ARM feature verification (llama/17519)
2025-12-11 Jiacheng (Jason... HIP: Patch failed testcase in WMMA-MMQ kernels for...
2025-12-11 hipudding CANN: Add MROPE and IMROPE support (llama/17401)
2025-12-11 Jeff Bolz vulkan: Implement GGML_OP_CUMSUM (llama/17479)
2025-12-11 Georgi Gerganov ggml : add ggml_top_k (llama/17365)
2025-12-11 TianHao324 CANN: supports out_prod operator for F32 and F16 (llama...
2025-12-11 Jeff Bolz vulkan: Use fewer rows for scalar FA when HS is not...
2025-12-11 Jeff Bolz vulkan: more FA details in vk_perf_logger (llama/17443)
2025-12-11 Jiacheng (Jason... HIP: WMMA-MMQ kernels for RDNA 4 (llama/17156)
2025-12-11 Alberto Cabrera... ggml-cpu: arm64: q4_K repack gemm and gemv implementati...
2025-12-11 ixgbe ggml: add RISC-V cpu-feats (llama/17461)
2025-12-11 Max Krasnyansky hexagon: add support for ROPE_NEOX (llama/17458)
2025-12-11 Raul Torres CANN: Define `cann_graph_update_required` before macro...
2025-12-11 M. Mediouni ggml-hexagon: Initial Hexagon v68/v69 support (llama...
2025-12-11 nullname ggml-hexagon: add `hex_supported_buffer` for better...
2025-12-11 Sigbjørn Skjæret cuda : support non-contiguous i32 to i32 copy (llama...
2025-12-11 Jeff Bolz vulkan: remove a couple unnecessary switches (llama...
2025-12-11 Masato Nakasaka Revive MUL_MAT_ID to perf testing (llama/17397)
2025-12-11 yulo HIP: RDNA4 tensor core support for MMF (llama/17077)
2025-12-11 lhez opencl: refine condition for kqv mm (llama/17392)
2025-12-11 Jeff Bolz vulkan: disable async for older Intel devices (llama...
2025-12-11 Raul Torres CANN: Refactor `evaluate_and_capture_cann_graph` (llama...
2025-12-11 nullname ggml-hexagon: fix swiglu failure at `test-backend-ops...
2025-12-11 Piotr Wilkin... ggml : Fix transposed SOLVE_TRI result (llama/17323)
2025-12-11 Scott Fudally DGX Spark: UMA support (llama/17368)
2025-12-11 Adrien Gallouët ggml : remove useless and error-prone variadic macros...
2025-12-11 sudhiarm kleidiai: fix zero-size array declaration (llama/17240)
2025-12-11 ixgbe ggml-cpu:add RISC-V RVV (Zvfh) optimization for FP16...
2025-12-11 Giuseppe Scrivano vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP...
2025-12-11 Jeff Bolz vulkan: support larger argsort (llama/17313)
2025-12-11 Jeff Bolz vulkan: Add copy_transpose shader (llama/17371)
2025-12-11 Aman Gupta cuda: fix rope fusion for gemma3 (llama/17378)
2025-12-11 Piotr Wilkin... Fix too relaxed check on CUDA "fast copy" (can_be_trans...
2025-12-11 Ruben Ortlam vulkan: force full subgroups for flash attention to...
2025-12-11 Jeremy Rand ggml-cpu: Don't pass -mpowerpc64 when -mcpu already...
2025-12-11 Chenguang Li CANN: fix acl_tensor_ptr usage in ASCEND_310P ROPE...
2025-12-11 Jeff Bolz vulkan: support noncontig i32 copy (llama/17328)
2025-12-11 Ruben Ortlam vulkan: add log RTE support to fix Nvidia CI (llama...
2025-12-11 Adrien Gallouët cmake : fix ARM feature verification (llama/17170)
2025-12-11 Adrien Gallouët ggml : add missing AVX512 feature checks (llama/17270)
2025-11-24 Daniel Bevenius ggml : remove dirty flag from version string (#1391)
2025-11-20 Georgi Gerganov sync : whisper.cpp
2025-11-20 YangLe metal : fix compile on macos 11 (whisper/3533)
2025-11-17 Georgi Gerganov sync : llama.cpp
2025-11-17 Georgi Gerganov metal : support I32 -> I32 copy (llama/17317)
2025-11-17 Georgi Gerganov metal : faster argsort (llama/17315)
2025-11-17 Georgi Gerganov metal : add cumsum (llama/17305)
2025-11-17 hipudding CANN: Use smart pointers to manage ACL objects (llama...
2025-11-17 Pavels Zaicenkovs vulkan: add LOG operation support for F32 and F16 ...
2025-11-17 Ruben Ortlam vulkan: fix MMQ quantize_y condition (llama/17301)
2025-11-17 Georgi Gerganov metal : remove obosolete asserts (llama/17295)
2025-11-17 lhez opencl: fix rms_norm_mul (llama/17250)
2025-11-17 shaofeiqi opencl: add kernel to handle mat mul in attention to...
2025-11-17 shani-f sycl : unify unary kernels with a generic implementatio...
2025-11-17 Jeff Bolz vulkan: Fuse mul_mat_id+add_id+mul and mul_mat+add...
2025-11-17 Ruben Ortlam vulkan: Replace 16-bit unpack8 calls to work around...
2025-11-17 Giuseppe Scrivano vulkan: implement ABS and NEG (llama/17245)
2025-11-17 Jeff Bolz vulkan: Use ggml_vk_tensor_subbuffer in mul_mat_vec...
2025-11-17 Jeff Bolz vulkan: skip all-negative-inf blocks in FA (llama/17186)
2025-11-17 Jeff Bolz vulkan: change graph_compute to be async and enable...
2025-11-17 Georgi Gerganov metal : support argsort for ne00 > 1024 (llama/17247)
2025-11-17 Georgi Gerganov metal : make the FA extra sizes consistent (llama/17143)
2025-11-17 Alberto Cabrera... ggml-cpu: handle 3d tensors in repack mat_mul (llama...
2025-11-17 Piotr Wilkin... ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM...
2025-11-17 Ruben Ortlam vulkan: remove shell call from vulkan-shaders-gen tool...
2025-11-17 Diego Devesa sched : fix reserve ignoring user tensor assignments...
2025-11-17 ixgbe ggml-cpu : add RISC-V vector intrinsic support for...
2025-11-17 bagheera metal: accelerated conv2d (llama/17175)
2025-11-17 Georgi Gerganov Revert "ggml-cpu: handle 3d tensors in repack mat_mul...
2025-11-17 Diego Devesa ggml-cpu : use template for argsort (llama/17222)
2025-11-17 TecJesh CANN: Add cross_entropy_loss op support (llama/16886)
2025-11-17 Aman Gupta CUDA: fuse rope + set_rows (llama/16884)
2025-11-17 Johannes Gäßler CUDA: static assert to prevent misuse of memcpy_1 ...
2025-11-17 Georgi Gerganov ggml : use std::sort in ggml_argsort CPU implementation...
2025-11-17 Alberto Cabrera... ggml-cpu: handle 3d tensors in repack mat_mul (llama...
2025-11-17 TecJesh CANN: Add L2_NORM op support (llama/16856)
2025-11-17 Neo Zhang Jianyu fix ci crash about SSM_CONV (llama/17169)