git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
pkg/ggml/sources/ggml
2026-04-01 | uvos | CUDA/HIP: Fix kernel slection for mmvq mmid kernel...
2026-04-01 | Georgi Gerganov | ggml : fix RWKV ops thread assignment (llama/21226)
2026-04-01 | Taimur Ahmad | ggml-cpu: fix fallback for RVV kernels without zvfh...
2026-04-01 | Anav Prasad | CUDA: Add Flash Attention Support for Head Dimension...
2026-04-01 | Reese Levine | ggml webgpu: quantized buffers to u32 + wider browser...
2026-04-01 | Georgi Gerganov | sync : llama.cpp
2026-04-01 | Abhijit Ramesh | ggml-webgpu: port all AOT operators to JIT (llama/20728)
2026-04-01 | Georgi Gerganov | sync : llama.cpp
2026-04-01 | hipudding | CANN: fix multi-thread set_tensor race conditions ...
2026-04-01 | Neo Zhang | sycl : enhance fattn perf (llama/21185)
2026-04-01 | shaofeiqi | opencl: add q4_K gemm and gemv kernels for Adreno ...
2026-04-01 | Oliver Simons | CUDA : Fix CUB's argsort when nrows % block_size =...
2026-04-01 | Radoslav Gerganov | rpc : fix misleading error log (llama/21184)
2026-04-01 | Gaurav Garg | Optimize MOE GEMV kernel for BS > 1. (llama/20905)
2026-04-01 | Max Krasnyansky | hexagon: dma optimizations (mostly fixing regressions...
2026-03-30 | Georgi Gerganov | ggml : bump version to 0.9.9 (#1449) v0.9.9
2026-03-30 | Georgi Gerganov | sync : whisper.cpp
2026-03-28 | Georgi Gerganov | sync : llama.cpp
2026-03-28 | Ruben Ortlam | vulkan: add noncontiguous GLU support (llama/21081)
2026-03-28 | Yiwei Shao | hexagon: support for IQ4_NL and MXFP4 (llama/21018)
2026-03-28 | Radoslav Gerganov | rpc : proper handling of data pointers to CPU buffers...
2026-03-28 | ren | metal : Fix dimension constraint violation in matmul2d...
2026-03-28 | uvos | hip: use fnuz fp8 for conversion on CDNA3 (llama/21040)
2026-03-28 | lhez | opencl: allow large buffer for adreno (llama/20997)
2026-03-28 | ihb2032 | fix(ggml): correct RISC-V ISA string canonical ordering...
2026-03-28 | Michael Wand | ggml-cuda: Add NVFP4 dp4a kernel (llama/20644)
2026-03-28 | Yihao Wang | CUDA & CPU: support F32 kernel type for `CONV_TRANSPOSE...
2026-03-28 | Saba Fallah | mtmd: Add DeepSeekOCR Support (llama/17400)
2026-03-28 | Johannes Gäßler | llama: fix llama-model-saver (llama/20503)
2026-03-28 | Neo Zhang | sycl : fix wrong variable check by assert (llama/20903)
2026-03-28 | nuri | metal : add FLOOR, CEIL, ROUND, TRUNC unary ops (llama...
2026-03-28 | Georgi Gerganov | metal : add FA instantiations for HSK=512, HSV=512...
2026-03-28 | Max Krasnyansky | hexagon: general DMA and Binary Op fixes for large...
2026-03-28 | lhez | opencl: add q6_K gemm and gemv kernels for Adreno ...
2026-03-28 | las7 | rpc : RCE patch (llama/20908)
2026-03-28 | Rashid Ul Islam | metal: add CONV_3D (llama/19927)
2026-03-28 | Chenguang Li | CANN: add RoPE cache preload before ACL graph capture...
2026-03-28 | Dan Hoffman | fix(openvino): explicit memset in buffer_context alloca...
2026-03-28 | shaofeiqi | opencl: add flattened Q4_K mv and general Q4_K mm ...
2026-03-28 | Johannes Gäßler | CUDA: fix BF16 FA compilation (llama/20865)
2026-03-28 | Neo Zhang | support bf16 and quantized type (llama/20803)
2026-03-28 | Patrick Buckley | ggml-cuda: native bf16 flash attention for vec kernel...
2026-03-28 | Gaurav Garg | Increase number of output elements per-thread block...
2026-03-28 | y198 | fix(rpc): prevent division by zero in deserialize_tenso...
2026-03-28 | Matt Corallo | Add shader count for Intel Arc Pro B60 (llama/20818)
2026-03-28 | shalinib-ibm | ggml-cpu: add always_inline to tinyBLAS_PPC accumulator...
2026-03-28 | Jeff Bolz | vulkan: change gated_delta_net to shard a column across...
2026-03-28 | hipudding | CANN: add BF16 support for core operators (llama/20152)
2026-03-28 | Sundaram krishnan | ggml: guard KleidiAI DOWNLOAD_EXTRACT_TIMESTAMP for...
2026-03-28 | Rail Chabdarov | hip: Avoid compiler bug in RDNA code generation during...
2026-03-28 | Yiwei Shao | hexagon: add Matrix Extensions (HMX) for Hexagon NPU...
2026-03-28 | uvos | ci : add hip quality check (llama/20430)
2026-03-28 | Reese Levine | ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE...
2026-03-28 | Eve | vulkan: dequantize iq4_xs 4 at a time (llama/20657)
2026-03-28 | Charles Xu | cmake : fix build warning when kleidiai is enabled...
2026-03-28 | Chenguang Li | CANN: handle in-place ROPE on non-contiguous f32 tensor...
2026-03-28 | Georgi Gerganov | sync : llama.cpp
2026-03-28 | Masashi Yoshimura | ggml-webgpu: Update the `RMS_NORM` preprocessor and...
2026-03-28 | Georgi Gerganov | sync : llama.cpp
2026-03-28 | Masashi Yoshimura | ggml-webgpu: Add supports for `DIAG` and `TRI` (llama...
2026-03-28 | Chenguang Li | CANN: support flash attention for head dim not multiple...
2026-03-28 | Reese Levine | Move to no timeout for WaitAny in graph submission...
2026-03-28 | Shaw Nguyen | ggml-cpu/x86: fix unused changemask warning in repack...
2026-03-28 | uvos | HIP : ignore return of hipMemAdvise [no ci] (llama...
2026-03-28 | Krishna Sridhar | hexagon: add neg, exp, sigmoid, softplus ops, cont...
2026-03-28 | Ruben Ortlam | vulkan: disable mmvq on Intel Windows driver (llama...
2026-03-28 | Kevin Hannon | ggml-blas: set mkl threads from thread context (llama...
2026-03-28 | Taimur Ahmad | ggml-cpu: fix RVV checks in quants and repacking (llama...
2026-03-28 | Ruben Ortlam | vulkan: async and event fixes (llama/20518)
2026-03-28 | Justin Bradford | kleidiai : fix MUL_MAT support for batched (3D) inputs...
2026-03-28 | Ruben Ortlam | vulkan: allow graphics queue only through env var ...
2026-03-28 | Neo Zhang | ehance UPSCALE to support all UT cases (llama/20637)
2026-03-28 | Martin Klacer | kleidiai: add data type check to get_tensor_traits...
2026-03-28 | Ruben Ortlam | vulkan: fix flash attention dot product precision ...
2026-03-28 | Aman Gupta | CUDA: GDN hide memory latency (llama/20537)
2026-03-28 | Sigbjørn Skjæret | sycl : fix for untransposed GDA recurrent state (llama...
2026-03-16 | Georgi Gerganov | ci : disable AMX jobs
2026-03-16 | Georgi Gerganov | ggml : bump version to 0.9.8 (#1442) v0.9.8
2026-03-16 | Georgi Gerganov | ggml : restore ggml_type_sizef() to aboid major version...
2026-03-16 | Georgi Gerganov | readme : simplify
2026-03-16 | Georgi Gerganov | sync : whisper.cpp
2026-03-16 | Georgi Gerganov | ggml : try fix arm build (whisper/0)
2026-03-15 | David366AI | ggml : extend im2col f16 (#1434)
2026-03-15 | Georgi Gerganov | common : add nvfp4 (#0)
2026-03-15 | Georgi Gerganov | sync : llama.cpp
2026-03-15 | Johannes Gäßler | CUDA: limit number of FA stream-k CUDA blocks (llama...
2026-03-15 | Pascal | ggml: avoid creating CUDA context during device init...
2026-03-15 | MoonShadow | ggml/hip: fix APU compatibility - soft error handling...
2026-03-15 | Bartowski | ggml : guard against sumq2 being 0 in IQ4_NL (llama...
2026-03-15 | PikaPikachu | cuda : add RDNA4-specific MMVQ parameter table for...
2026-03-15 | Ruben Ortlam | vulkan: use graphics queue on AMD (llama/20551)
2026-03-15 | Georgi Gerganov | metal : add FA specialization for HSK = 320, HSV =...
2026-03-15 | Max Krasnyansky | hexagon: Q4_0 and MXFP4 repack fixes (llama/20527)
2026-03-15 | Neo Zhang | add op gated_delta_net (llama/20455)
2026-03-15 | Adrien Gallouët | ggml : add native AVX512-FP16 support for F16 operation...
2026-03-15 | Wallentri | Use fp32 in cuBLAS V100 to avoid overflows, env variabl...
2026-03-15 | Zijun Yu | ggml : add OpenVINO backend (llama/15307)
2026-03-15 | Rail Chabdarov | Fix data race in CUDA's "cpy" kernel (influences GGML...
2026-03-15 | lhez | opencl: fix l2_norm (llama/20480)
2026-03-15 | Georgi Gerganov | graph : remove redundant GDN state transposes (llama...