git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2025-09-25 hebangwen  examples : fix typo mismatch in gpt (#1349)
2025-09-25 Daniel Bevenius  ggml : bump version to 0.9.3 (#1353) v0.9.3
2025-09-25 Daniel Bevenius  scripts : refactor release script into prepare and...
2025-09-25 Daniel Bevenius  scripts : fix next dev version calculation [no ci]...
2025-09-25 Georgi Gerganov  sync : llama.cpp
2025-09-25 Georgi Gerganov  metal : fuse NORM + MUL + ADD, support non-multiples...
2025-09-25 Georgi Gerganov  metal : relax reorder conditions (llama/16216)
2025-09-25 Georgi Gerganov  metal : restore im2col perf (llama/16219)
2025-09-25 Georgi Gerganov  sync : llama.cpp
2025-09-25 Radoslav Gerganov  rpc : use ggml logging facilities
2025-09-25 Eve  ci: run the x64 and arm ci on the github machines inste...
2025-09-25 Johannes Gäßler  llama: print memory breakdown on exit (llama/15860)
2025-09-25 Acly  ggml : split graph allocations according to backend...
2025-09-25 Xiangyan Sun  ggml-cpu: Respect cpumask settings (llama/16164)
2025-09-25 Sigbjørn Skjæret  ggml : fix uninitialized is_on_grid in quantize_row_iq3...
2025-09-25 Aaron Teo  zdnn: refactor codebase + add docs (llama/16178)
2025-09-25 Daniel Bevenius  ggml-cpu : fix typo in gemm comments [no ci] (llama...
2025-09-25 Sigbjørn Skjæret  ggml : implement set_rows with i32 index (llama/16159)
2025-09-25 Georgi Gerganov  ggml : extend ggml_can_fuse to work with non-sequential...
2025-09-25 Georgi Gerganov  ggml : add ggml_op_is_empty (llama/16122)
2025-09-25 Shin-myoung...  Vulkan: add conv_transpose_2d operation (llama/16022)
2025-09-25 Jeff Bolz  vulkan: add RTE variants of exp shader (llama/16165)
2025-09-25 Ruben Ortlam  vulkan: vec dot matrix multiplication fix (llama/16151)
2025-09-25 lhez  opencl: fix concat crash on win arm64 with Adreno ...
2025-09-25 lhez  opencl: initial `q8_0` mv support (llama/15732)
2025-09-25 Giuseppe Scrivano  vulkan: optimize UMA buffer operations and fix driver...
2025-09-25 Jeff Bolz  vulkan: fix validation error about VK_PIPELINE_CREATE_C...
2025-09-20 Georgi Gerganov  ggml : prepare for development of 0.9.2-dev
2025-09-20 Georgi Gerganov  ggml : bump version to 0.9.1
2025-09-20 Georgi Gerganov  scripts : fix sed usage to work on Mac (#1345)
2025-09-20 Georgi Gerganov  tests : adjust to new timestep_embedding operator
2025-09-20 Georgi Gerganov  sync : llama.cpp
2025-09-20 Ruben Ortlam  vulkan: use vec dot for matrix matrix multiplications...
2025-09-20 Xuan-Son Nguyen  ggml : refactor forward_dup for cpu backend (llama...
2025-09-20 Adrien Gallouët  ggml-amx : fix ggml_amx_init() on generic Linux (llama...
2025-09-20 Adrien Gallouët  cmake : fix static linking for OpenMP on Unix-like...
2025-09-20 Shawn Gu  opencl: optimize mxfp4 kernels (llama/16037)
2025-09-20 Jeff Bolz  rename optimize_graph to graph_optimize (llama/16082)
2025-09-20 Bowen Han  CUDA: Optimize PAD_REFLECT_1D (llama/15957)
2025-09-20 Johannes Gäßler  CUDA: fix compilation on CC 6.0 (llama/16091)
2025-09-20 Georgi Gerganov  metal : use function constants for mul_mv_ext kernels...
2025-09-20 Sigbjørn Skjæret  cuda : add missing F32<->I32 entries in ggml_cuda_cpy_f...
2025-09-20 Georgi Gerganov  metal : improve F32, F16 and BF16 mat-vec multiplicatio...
2025-09-20 Jhen-Jie Hong  metal : avoid call free for non-owned buffer (llama...
2025-09-20 Georgi Gerganov  metal : handle nil cv during pipeline creation (llama...
2025-09-20 Chenguang Li  CANN: Remove print (llama/16044)
2025-09-20 Reese Levine  GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS...
2025-09-20 Georgi Gerganov  metal : refactor + optimize v2 (llama/15995)
2025-09-20 Georgi Gerganov  sync : llama.cpp
2025-09-20 Johannes Gäßler  CUDA: fix FA occupancy, optimize tile kernel (llama...
2025-09-20 Eve  vulkan: automatically remove unsupported devices (llama...
2025-09-20 Chenguang Li  CANN: Optimize ggml_cann_set_device (llama/15935)
2025-09-20 Daniel Bevenius  ggml : fix padding in timestep embedding kernels (llama...
2025-09-20 Jake Karnes  CUDA: fix im2col_3d to respect non-contiguous inputs...
2025-09-20 yael-works  SYCL: Add COUNT_EQUAL operator support (llama/15991)
2025-09-20 Aman Gupta  CUDA: some micro-optimizations in mmf.cuh for mul_mat_i...
2025-09-20 Georgi Gerganov  metal : remove memory pools (llama/15966)
2025-09-20 Ruben Ortlam  Vulkan: Clean up mul_mm shader (llama/15987)
2025-09-20 Georgi Gerganov  metal : fix kernel requirements (llama/15983)
2025-09-20 Aaron Teo  ggml-zdnn: rm user mapped buffers (llama/15965)
2025-09-20 Jeff Bolz  vulkan: fix failing dequant shaders (llama/15862)
2025-09-20 Jeff Bolz  vulkan: initialize vulkan-hpp to allow using extension...
2025-09-20 Georgi Gerganov  metal : refactor kernel loading (llama/15964)
2025-09-20 Georgi Gerganov  metal : allow ops to run concurrently (llama/15929)
2025-09-20 Georgi Gerganov  metal : fix memory leaks (llama/15962)
2025-09-20 Aaron Teo  ggml-zdnn: fix #15414, activate FP16 and BF16 accelerat...
2025-09-20 Ruben Ortlam  Vulkan iGPU device selection overhaul and PCI ID API...
2025-09-20 Mathieu Baudier  vulkan: Make device memory check more portable (llama...
2025-09-20 Neo Zhang Jianyu  Revert "sycl: add usage of enqueue_functions extension...
2025-09-20 Diego Devesa  ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device...
2025-09-20 Johannes Gäßler  CUDA: larger SRAM reads for tile FA, AMD FP16 dot ...
2025-09-20 Daniel Bevenius  ggml-cpu : add check for ARM MATMUL_INT8/i8mm support...
2025-09-20 Charles Xu  kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed...
2025-09-20 hipudding  CANN: Disable acl_graph for prefill stage (llama/15933)
2025-09-20 Oliver Simons  CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3%...
2025-09-20 Daniel Bevenius  ggml-cpu : fix padding in ggml_timestep_embedding ...
2025-09-20 Georgi Gerganov  sync : llama.cpp
2025-09-20 Georgi Gerganov  metal : make the backend async (llama/15906)
2025-09-20 Georgi Gerganov  sync : llama.cpp
2025-09-20 Daniel Bevenius  tests : filter out no-ops from coverage report (llama...
2025-09-20 Chenguang Li  CANN: Add ROPE sin/cos cache for reuse (llama/15912)
2025-09-20 Chenguang Li  CANN: implement LRU cache for ACL graphs (llama/15814)
2025-09-20 Ruben Ortlam  vulkan: throw the oom error instead of no memory type...
2025-09-20 Jeff Bolz  vulkan: Fix OOB accesses in soft_max_back (llama/15861)
2025-09-20 Johannes Gäßler  HIP: use v_dot2_f32_f16 instruction for FA (llama/15884)
2025-09-20 lksj92hs  Workaround for subgroup arithmetic failing on MoltenVK...
2025-09-20 Aman Gupta  CUDA: Add mul_mat_id support for the mmf kernel (llama...
2025-09-20 Johannes Gäßler  CUDA: fix GET_ROWS for large tensors (llama/15882)
2025-09-20 Jeff Bolz  vulkan: sort graph to allow more parallel execution...
2025-09-20 Aman Gupta  CUDA: generate_cu_files.py - add missing mxfp4 (llama...
2025-09-20 Georgi Gerganov  cuda : fix supports_op condition for get_rows when...
2025-09-20 Georgi Gerganov  sync : llama.cpp
2025-09-20 Georgi Gerganov  metal : refactor + optimize (llama/15857)
2025-09-20 Georgi Gerganov  sync : llama.cpp
2025-09-20 Xuan-Son Nguyen  ggml: allow casting between f32 and i32 (llama/15783)
2025-09-20 Sigbjørn Skjæret  CUDA: non-contiguous src0 not supported for PAD (llama...
2025-09-20 Jeff Bolz  tests: large sizes for get_rows (llama/15687)
2025-09-20 Chenguang Li  CANN: Stream sync between devices for acl_graph (llama...
2025-09-20 Jeff Bolz  vulkan: support im2col_3d (llama/15795)
2025-09-20 Aaron Teo  ggml-cpu: clean up s390x SIMD (llama/15855)