git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/shortlog
pkg/ggml/sources/whisper.cpp
2025-09-20  Aman Gupta  CUDA: some micro-optimizations in mmf.cuh for mul_mat_i...
2025-09-20  Georgi Gerganov  metal : remove memory pools (llama/15966)
2025-09-20  Ruben Ortlam  Vulkan: Clean up mul_mm shader (llama/15987)
2025-09-20  Georgi Gerganov  metal : fix kernel requirements (llama/15983)
2025-09-20  Aaron Teo  ggml-zdnn: rm user mapped buffers (llama/15965)
2025-09-20  Jeff Bolz  vulkan: fix failing dequant shaders (llama/15862)
2025-09-20  Jeff Bolz  vulkan: initialize vulkan-hpp to allow using extension...
2025-09-20  Georgi Gerganov  metal : refactor kernel loading (llama/15964)
2025-09-20  Georgi Gerganov  metal : allow ops to run concurrently (llama/15929)
2025-09-20  Georgi Gerganov  metal : fix memory leaks (llama/15962)
2025-09-20  Aaron Teo  ggml-zdnn: fix #15414, activate FP16 and BF16 accelerat...
2025-09-20  Ruben Ortlam  Vulkan iGPU device selection overhaul and PCI ID API...
2025-09-20  Mathieu Baudier  vulkan: Make device memory check more portable (llama...
2025-09-20  Neo Zhang Jianyu  Revert "sycl: add usage of enqueue_functions extension...
2025-09-20  Diego Devesa  ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device...
2025-09-20  Johannes Gäßler  CUDA: larger SRAM reads for tile FA, AMD FP16 dot ...
2025-09-20  Daniel Bevenius  ggml-cpu : add check for ARM MATMUL_INT8/i8mm support...
2025-09-20  Charles Xu  kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed...
2025-09-20  hipudding  CANN: Disable acl_graph for prefill stage (llama/15933)
2025-09-20  Oliver Simons  CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3%...
2025-09-20  Daniel Bevenius  ggml-cpu : fix padding in ggml_timestep_embedding ...
2025-09-20  Georgi Gerganov  sync : ggml
2025-09-20  Georgi Gerganov  metal : make the backend async (llama/15906)
2025-09-20  Georgi Gerganov  sync : ggml
2025-09-20  Chenguang Li  CANN: Add ROPE sin/cos cache for reuse (llama/15912)
2025-09-20  Chenguang Li  CANN: implement LRU cache for ACL graphs (llama/15814)
2025-09-20  Ruben Ortlam  vulkan: throw the oom error instead of no memory type...
2025-09-20  Jeff Bolz  vulkan: Fix OOB accesses in soft_max_back (llama/15861)
2025-09-20  Johannes Gäßler  HIP: use v_dot2_f32_f16 instruction for FA (llama/15884)
2025-09-20  lksj92hs  Workaround for subgroup arithmetic failing on MoltenVK...
2025-09-20  Aman Gupta  CUDA: Add mul_mat_id support for the mmf kernel (llama...
2025-09-20  Johannes Gäßler  CUDA: fix GET_ROWS for large tensors (llama/15882)
2025-09-20  Jeff Bolz  vulkan: sort graph to allow more parallel execution...
2025-09-20  Aman Gupta  CUDA: generate_cu_files.py - add missing mxfp4 (llama...
2025-09-20  Georgi Gerganov  cuda : fix supports_op condition for get_rows when...
2025-09-20  Georgi Gerganov  metal : refactor + optimize (llama/15857)
2025-09-20  Xuan-Son Nguyen  ggml: allow casting between f32 and i32 (llama/15783)
2025-09-20  Sigbjørn Skjæret  CUDA: non-contiguous src0 not supported for PAD (llama...
2025-09-20  Chenguang Li  CANN: Stream sync between devices for acl_graph (llama...
2025-09-20  Jeff Bolz  vulkan: support im2col_3d (llama/15795)
2025-09-20  Aaron Teo  ggml-cpu: clean up s390x SIMD (llama/15855)
2025-09-20  Jeff Bolz  vulkan: Support pad_ext (llama/15794)
2025-09-20  Jeff Bolz  vulkan: Use larger loads in scalar/coopmat1 matmul...
2025-09-20  Daniel Bevenius  ggml WebGPU: remove userdata from request adapter callb...
2025-09-20  Johannes Gäßler  CUDA: faster tile FA (Pascal/AMD), headsize 256 (llama...
2025-09-20  Charles Xu  kleidiai: generalize compute_forward_kv_cache to comput...
2025-09-20  Johannes Gäßler  ggml-cpu: document use of "free" memory [no ci] (llama...
2025-09-20  Aaron Teo  ggml-cpu: drop support for nnpa intrinsics (llama/15821)
2025-09-20  Johannes Gäßler  CUDA: fastdiv, launch bounds for mmvq + q8_1 quant...
2025-09-20  Daniel Bevenius  ggml : introduce semantic versioning (ggml/1336)
2025-09-20  Gregor Jasny  CUDA : conditionally add cuda architectures (ggml/1341)
2025-09-20  Gabe Goodhart  metal : Add template specialization for mul_mm_id w...
2025-09-20  Chenguang Li  CANN: Refactor ND to NZ workspace to be per-device...
2025-09-20  leejet  ggml: add ops for WAN video model (cuda && cpu) (llama...
2025-09-20  hipudding  CANN: Fix precision issue on 310I DUO multi-devices...
2025-09-20  rmatif  opencl: add hs=40 to FA (llama/15758)
2025-09-20  Chenguang Li  CANN: fix acl_rstd allocation size in ggml_cann_rms_nor...
2025-09-20  Ruben Ortlam  vulkan: fix mmv subgroup16 selection (llama/15775)
2025-09-20  Jeff Bolz  vulkan: don't use std::string in load_shaders, to impro...
2025-09-20  Daniel Bevenius  vulkan : update ggml_vk_instance_validation_ext_availab...
2025-09-20  Shin-myoung...  ggml vulkan: add hardsigmoid and hardswish operations...
2025-09-20  Oliver Simons  CUDA: Optimize `rms_norm_f32` kernel and its fused...
2025-09-20  hipudding  CANN: Add RoPE contiguous check for 310I DUP device...
2025-09-20  xctan  ggml-cpu : optimize RVV kernels (llama/15720)
2025-09-20  hipudding  CANN: Mask unsupported TRANSPOSE_1D operator (llama...
2025-09-20  Chenguang Li  CANN: Fix type float_t to float (llama/15736)
2025-09-20  Ruben Ortlam  vulkan: fix shaders gen when no integer dot is availabl...
2025-09-20  hipudding  CANN: Resolve soft_max precision issue (llama/15730)
2025-09-20  Jeff Bolz  vulkan: Fix macro parameter order for f32 matmul shader...
2025-09-20  rmatif  opencl: add attn sinks support for FA kernels (llama...
2025-09-20  Chenguang Li  CANN: Support eager execution mode under ACL graph...
2025-09-20  hipudding  CANN: Support ext_factor in rope (llama/15710)
2025-09-20  Johannes Gäßler  ggml-backend: raise GGML_MAX_SPLIT_INPUTS (llama/15722)
2025-09-20  Gilad S  vulkan: use memory budget extension to read memory...
2025-09-20  Jeff Bolz  vulkan: add missing clamps in new mul_mat_id paths...
2025-09-20  Ruben Ortlam  vulkan: disable large mmv subgroups on older Nvidia...
2025-09-20  s-goto-11  ggml: SVE support for exponential functions (llama...
2025-09-20  Prashant Vithule  ggml: aarch64: Implement SVE F16 kernels for vector...
2025-09-20  Ruben Ortlam  Vulkan: Add Integer Dot Product mul_mat_vec shader...
2025-09-20  Daniel Bevenius  ggml : WebGPU add TRANSPOSE and RESHAPE to supported...
2025-09-20  Akarshan Biswas  CUDA: fix build error from ambiguous __half conversions...
2025-09-20  hipudding  CANN: Optimize MUL_MAT_ID (llama/15658)
2025-09-20  hipudding  CANN: fix RoPE cache issue on multi-device (llama/15629)
2025-09-20  Georgi Gerganov  metal : fix checks for available FA kernels (llama...
2025-09-20  Diego Devesa  llama : separate compute buffer reserve from fattn...
2025-09-20  Jeff Bolz  vulkan: handle large sizes for get_rows (llama/15686)
2025-09-20  Jeff Bolz  vulkan: mul_mat_id coopmat2 optimizations (llama/15546)
2025-09-20  Daniel Bevenius  vulkan : remove unused portability_enumeration_ext...
2025-09-20  Jeff Bolz  vulkan: Allow fallback to sysmem memory when vidmem...
2025-09-20  Jeff Bolz  vulkan: clamp matmul and FA results to the max finite...
2025-09-20  Charles Xu  ggml: update kleidiai to v1.13.0 (llama/15663)
2025-09-20  Johannes Gäßler  llama: use FA + max. GPU layers by default (llama/15434)
2025-09-20  Johannes Gäßler  CUDA: use FP32 arithmetic for conv2d (llama/15683)
2025-09-20  Jeff Bolz  vulkan: Skip syncing for prealloc_y when it is reused...
2025-09-20  Chenguang Li  CANN: FIx compiler warnings (llama/15661)
2025-09-20  Aman Gupta  CUDA: fix bug in rms_norm fusion (llama/15660)
2025-09-20  Aman Gupta  CUDA: fuse adds, fuse add with rms norm (llama/15631)
2025-09-20  mnehete32  CUDA: add conv2d (llama/15635)
2025-09-20  Aaron Teo  ggml-cpu: fix invalid hsum build in debug s390x (llama...
2025-09-20  compilade  ggml : fix SSM_SCAN for n_groups > 1 (llama/15625)