git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2026-02-15 Georgi Gerganov sync : whisper.cpp
2026-02-14 Georgi Gerganov sync : llama.cpp
2026-02-14 Georgi Gerganov models : optimize qwen3next graph (llama/19375)
2026-02-14 Adrien Gallouët ggml : fix GGML_DEBUG with OpenMP (llama/19599)
2026-02-14 Georgi Gerganov metal : fix ACC op (llama/19427)
2026-02-14 Jeff Bolz vulkan: support L2_NORM with contiguous rows (llama...
2026-02-14 Jeff Bolz vulkan: support GGML_OP_SET (llama/19584)
2026-02-14 Sophon vulkan: Add vendor id for Qualcomm drivers (llama/19569)
2026-02-14 Max Krasnyansky hexagon: further optimizations and refactoring for...
2026-02-14 Jeff Bolz vulkan: restore -inf check in FA shaders (llama/19582)
2026-02-14 Alberto Cabrera... Fix wrong memcpy length for block_interleave == 4 ...
2026-02-14 ymcki fix vulkan ggml_acc only works in 3d but not 4d (llama...
2026-02-14 Aman Gupta CUDA: loop over ne2*ne3 in case it overflows (llama...
2026-02-14 Oliver Simons CUDA: Do not mutate cgraph for fused ADDs (llama/19566)
2026-02-14 Georgi Gerganov metal : improve concurrency (llama/19555)
2026-02-14 Georgi Gerganov metal : support GGML_OP_SET (llama/19548)
2026-02-14 Shupei Fan hexagon: fix typo in vtcm_needs_release (llama/19545)
2026-02-14 lhez opencl: add basic support for q4_1 (llama/19534)
2026-02-14 Georgi Gerganov metal : update sum_rows kernel to support float4 (llama...
2026-02-14 Mario Limonciello Add a workaround for compilation with ROCWMMA_FATTN...
2026-02-14 Max Krasnyansky hexagon: further optimization and tuning of matmul...
2026-02-14 lhez opencl: add general Q6_K mm and Q4_K mv (llama/19347)
2026-02-14 Georgi Gerganov ggml : unary ops support non-cont src0 + metal F16...
2026-02-14 Georgi Gerganov metal : extend l2_norm support for non-cont src0 (llama...
2026-02-14 Max Krasnyansky hexagon: Add ARGSORT, DIV, SQR, SQRT, SUM_ROWS, GEGLU...
2026-02-14 Georgi Gerganov ggml : extend bin bcast for permuted src1 (llama/19484)
2026-02-14 Georgi Gerganov metal : consolidate unary ops (llama/19490)
2026-02-14 Oliver Simons CUDA : Update CCCL-tag for 3.2 to final release from...
2026-02-14 Nikhil Jain Plug memory leaks and free resources on shutdown (llama...
2026-02-14 Xuan-Son Nguyen test: fix IMROPE perf test case (llama/19465)
2026-02-14 Alberto Cabrera... ggml-cpu: arm64: q6_K repack gemm and gemv (and generic...
2026-02-14 k4ss4n ggml : use noexcept overload for is_regular_file in...
2026-02-14 Raul Torres CANN: Remove unnecessary wrapper for `gml_backend_buft_...
2026-02-14 hipudding CANN: implement quantized MUL_MAT_ID for MoE models...
2026-02-14 Georgi Gerganov cuda : extend GGML_OP_PAD to work with non-cont src0...
2026-02-14 Oliver Simons CUDA: Fix non-contig rope (llama/19338)
2026-02-07 Georgi Gerganov sync : llama.cpp
2026-02-07 Georgi Gerganov metal : consolidate bin kernels (llama/19390)
2026-02-07 Georgi Gerganov metal : fix event synchronization in cpy_tensor_async...
2026-02-07 Abhijit Ramesh ggml-webgpu: JIT compile binary operators and handle...
2026-02-07 Georgi Gerganov sync : llama.cpp
2026-02-07 Nechama Krashinski sycl: add F16 support for GGML_OP_CEIL (llama/19306)
2026-02-07 Jeff Bolz tests: reduce number of FA test permutations (llama...
2026-02-07 Jeff Bolz vulkan: For coopmat2 FA, use fp16 accumulators for...
2026-02-07 Jeff Bolz vulkan: make FA mask/softcap enables spec constants...
2026-02-07 Georgi Gerganov metal : skip loading all-zero mask (llama/19337)
2026-02-07 Georgi Gerganov cuda : cuda graphs now compare all node params (llama...
2026-02-07 Georgi Gerganov metal : adaptive CPU/GPU interleave based on number...
2026-02-07 Jeff Bolz vulkan: Preprocess FA mask to detect all-neg-inf and...
2026-02-07 Georgi Gerganov metal : add diag (llama/19330)
2026-02-07 Oleksandr Kuvshynov vulkan: fix GPU deduplication logic. (llama/19222)
2026-02-07 Jeff Bolz vulkan: Set k_load_shmem to false when K is too large...
2026-02-07 Jeff Bolz vulkan: fix non-contig rope (llama/19299)
2026-02-07 will-lms metal : add missing includes (llama/19348)
2026-02-07 Georgi Gerganov tests : add non-cont, inplace rope tests (llama/19296)
2026-02-07 Kevin Pouget ggml-virtgpu: make the code thread safe (llama/19204)
2026-02-07 Aman Gupta ggml-cpu: use LUT for converting e8->f32 scales on...
2026-02-07 Georgi Gerganov metal : add solve_tri (llama/19302)
2026-02-07 Ruben Ortlam vulkan: disable coopmat1 fa on Nvidia Turing (llama...
2026-02-07 Aman Gupta CUDA: use mmvq for mul-mat-id for small batch sizes...
2026-02-07 Georgi Gerganov metal : minor cleanup (llama/19251)
2026-02-07 Oliver Simons CUDA: Fix loop unrolling for BW in mul_mat_q_stream_k_f...
2026-02-07 George ggml: added cleanups in ggml_quantize_free (llama/19278)
2026-02-07 Gaurav Garg cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until...
2026-02-07 lhez opencl: refactor some ops, concat, repeat, tanh and...
2026-02-07 Aman Gupta ggml-cpu: FA split across kv for faster TG (llama/19209)
2026-02-07 Neo Zhang Remove support for Nvidia & AMD GPU, because the oneAPI...
2026-02-07 Tamar sycl: implement GGML_OP_TOP_K (llama/19242)
2026-02-07 Georgi Gerganov metal : support virtual devices (llama/18919)
2026-02-07 Johannes Gäßler ggml-backend: fix async set/get fallback sync (llama...
2026-02-07 Christian Kastner docs : Minor cleanups (llama/19252)
2026-02-07 Nikhil Jain Remove pipeline cache mutexes (llama/19195)
2026-02-07 Max Krasnyansky Bump cmake max version (needed for Windows on Snapdrago...
2026-02-07 nullname ggml-hexagon: flash-attention and reduce-sum optimizati...
2026-02-07 shaofeiqi opencl: add optimized q8_0 mm kernel for adreno (llama...
2026-02-07 Simon Redman Correctly fetch q8_1 quantize pipeline in test as neede...
2026-02-07 Georgi Gerganov tests : add GQA=20 FA test (llama/19095)
2026-02-07 Georgi Gerganov ci : remove "Release" word from the title of the release
2026-02-07 Georgi Gerganov ggml : bump version to 0.9.6 (#1423) v0.9.6
2026-01-30 Georgi Gerganov cmake : remove unused file (#1419)
2026-01-30 Georgi Gerganov sync : whisper.cpp
2026-01-30 Georgi Gerganov cuda : fix compile warnings (whisper/0)
2026-01-30 Georgi Gerganov sync : llama.cpp
2026-01-30 bssrdf add tensor type checking as part of cuda graph properti...
2026-01-30 s8322 sycl: implement GGML_UNARY_OP_SOFTPLUS (llama/19114)
2026-01-30 RachelMantel sycl: implement GGML_OP_TRI (llama/19089)
2026-01-30 Zheyuan Chen ggml-webgpu: improve flastAttention performance by...
2026-01-30 Todor Boinovski hexagon: enable offloading to Hexagon on Windows on...
2026-01-30 Georgi Gerganov cuda : fix nkvo, offload and cuda graph node properties...
2026-01-30 yulo HIP: add mmf for CDNA (llama/18896)
2026-01-30 Vishal Singh ggml-zendnn : resolve ZenDNN backend cross-module symbo...
2026-01-30 Aman Gupta CUDA: refactor topk-moe to enable more models (GLM...
2026-01-30 Neo Zhang sycl: fix norm kernels: l2_norm, group_norm, rms_norm...
2026-01-30 Ruben Ortlam Vulkan Flash Attention Coopmat1 Refactor (llama/19075)
2026-01-30 Patryk Kaminski ggml-sycl: remove unused syclcompat header (llama/19140)
2026-01-30 Oleksandr Kuvshynov vulkan: handle device dedup on MacOS + Vega II Duo...
2026-01-30 Kevin Pouget ggml: new backend for Virglrenderer API Remoting accele...
2026-01-30 Alberto Cabrera... ggml-cpu: arm64: Q4_K scale unroll and vectorization...
2026-01-30 Georgi Gerganov cuda : fix "V is K view" check for non-unified KV cache...
2026-01-30 Georgi Gerganov CUDA: tune GLM 4.7 Flash FA kernel selection logic...