git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/shortlog
2026-01-30 Aman Gupta  ggml-cpu: Use tiled FA for prompt-processing (llama...
2026-01-30 Georgi Gerganov  kv-cache : support V-less cache (llama/19067)
2026-01-30 Johannes Gäßler  CUDA: re-use MLA K data for V in MMA FA (llama/19057)
2026-01-30 Aman Gupta  ggml-cuda: enable cuda-graphs for `n-cpu-moe` (llama...
2026-01-30 nullname  ggml-hexagon: flash-attn opt (llama/19025)
2026-01-30 Neo Zhang  use malloc to support both iGPU and dGPU in same time...
2026-01-30 Alberto Cabrera...  ggml-cpu: aarm64: q5_K repack gemm and gemv (and generi...
2026-01-30 Georgi Gerganov  mla : make the V tensor a view of K (llama/18986)
2026-01-30 Johannes Gäßler  CUDA: fix alignment check for FA (llama/19023)
2026-01-30 lhez  opencl: enable the general fp mm for non-cont input...
2026-01-30 Aman Gupta  CUDA: add gqa_ratio 4 for GLM 4.7 flash (llama/18953)
2026-01-30 shaofeiqi  opencl: add TRI op support (llama/18979)
2026-01-30 Aleksei Nikiforov  ggml-zdnn : mark zDNN buffers as non-host (llama/18967)
2026-01-30 Jeff Bolz  vulkan: Remove transfer_ctx, do everything in compute_c...
2026-01-30 Jeff Bolz  vulkan: support flash attention GQA/split_k with small...
2026-01-30 Masato Nakasaka  Revert "vulkan: force full subgroups for flash attentio...
2026-01-30 Jeff Bolz  vulkan: Use mul_mat_vec_id for small values of n (llama...
2026-01-30 Oliver Simons  CUDA: Fix builds for older CCCL versions by ifdefing...
2026-01-30 Oliver Simons  CUDA: Replace init_offsets kernel with iterators in...
2026-01-30 Adrien Gallouët  ggml : cleanup path_str() (llama/18928)
2026-01-30 Georgi Gerganov  metal : enable FA for MLA heads (llama/18950)
2026-01-30 Georgi Gerganov  ggml : add ggml_build_forward_select (llama/18550)
2026-01-30 lhez  opencl: fix q6_K mv for m=1 (llama/18893)
2026-01-30 Reese Levine  ggml webgpu: support for backend sampling (llama/18880)
2026-01-30 Thore Koritzius  ggml : extend ggml_pool_1d + metal (llama/16429)
2026-01-30 Perry Naseck  ggml-blas: hide warnings from included BLAS headers...
2026-01-30 Raul Torres  CANN: Remove unused `ggml_cann_get_device` function...
2026-01-30 Chenguang Li  CANN: fix an issue where get_env was not fully renamed...
2026-01-30 hipudding  CANN: support gated linear attn (llama/18653)
2026-01-30 shaofeiqi  OpenCL: add SOLVE_TRI op support (llama/18846)
2026-01-30 Georgi Gerganov  cuda : print less debug logs when disabling cuda graphs...
2026-01-30 Johannes Gäßler  CUDA: fix allignment on register spill for FA (llama...
2026-01-30 shalinib-ibm  ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (llama...
2026-01-30 Max Krasnyansky  hexagon: support for OP_CPY, host buffers now optional...
2026-01-30 Oliver Simons  CUDA: Factor out and re-use `block_reduce` function...
2026-01-30 Jeff Bolz  vulkan: Check maxStorageBufferRange in supports_op...
2026-01-30 Daniel Bevenius  CUDA : fix typo in clang pragma comment [no ci] (llama...
2026-01-30 Ruben Ortlam  vulkan: work around Intel fp16 bug in mmq (llama/18814)
2026-01-30 Perry Naseck  ggml-metal: do not copy headers for embedded, use curre...
2026-01-30 yulo  HIP: add fattn-mma-f16 for RDNA4 (llama/18481)
2026-01-21 Bráulio Oliveira  examples : use -dev/--device and WHISPER_ARG_DEVICE...
2026-01-16 Yshtola  whisper : Fix UTF-8 character boundary issue in segment...
2026-01-15 Georgi Gerganov  release : v1.8.3 [upstream/1.8.3]
2026-01-15 Georgi Gerganov  benches : update
2026-01-14 Georgi Gerganov  sync : ggml
2026-01-14 Georgi Gerganov  CUDA : fix unused argument when USE_CUDA_GRAPH=OFF...
2026-01-14 Jeff Bolz  vulkan: change memory_logger to be controlled by an...
2026-01-14 Jeff Bolz  vulkan: Use VK_EXT_shader_64bit_indexing to handle...
2026-01-14 Ruben Ortlam  vulkan: Disable large coopmat matmul configuration...
2026-01-14 Ruben Ortlam  Vulkan: Optimize Matmul parameters for AMD GPUs with...
2026-01-14 Georgi Gerganov  talk-llama : sync llama.cpp
2026-01-14 Georgi Gerganov  sync : ggml
2026-01-14 shaofeiqi  opencl: add SOFTPLUS op support (llama/18726)
2026-01-14 Johannes Gäßler  HIP: adjust RDNA3.5 MMQ kernel selction logic (llama...
2026-01-14 Perry Naseck  cmake : update blas logic (llama/18205)
2026-01-14 Michael Wand  Corrected: changed s13 = src1->nb[3] instead of nb...
2026-01-14 shaofeiqi  opencl: add EXPM1 op (llama/18704)
2026-01-14 Reese Levine  Updates to webgpu get_memory (llama/18707)
2026-01-14 Aaron Teo  llama: use host memory if device reports 0 memory ...
2026-01-14 Masashi Yoshimura  ggml-webgpu: Fix GGML_MEM_ALIGN to 8 for emscripten...
2026-01-14 Reese Levine  ggml webgpu: initial flashattention implementation...
2026-01-14 Jeff Bolz  vulkan: fix push constant size for quantize_q8_1 (llama...
2026-01-14 Jeff Bolz  vulkan: optimize ssm_scan (llama/18630)
2026-01-14 도로로도로또  metal : add MoE kernel specialization for ne20=5 (llama...
2026-01-14 Doctor Shotgun  ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (llama...
2026-01-14 shaofeiqi  opencl: add FILL op support (llama/18682)
2026-01-14 Oliver Walsh  cuda : fix build on cuda 12.8 (llama/18672)
2026-01-14 Jeff Bolz  vulkan: reject ops when a tensor is too large to alloca...
2026-01-14 virajwad  vulkan: Warptile tuning for Intel Xe2/Xe3 (llama/18178)
2026-01-14 Eve  vulkan: more mul mat optimizations (llama/18533)
2026-01-14 hipudding  CANN: Fix rename for get_env (llama/18652)
2026-01-14 Raul Torres  CANN: Rename `get_env` to `get_env_as_lowercase` (llama...
2026-01-14 Max Krasnyansky  Hexagon add support for f16/f32 flash attention, scale...
2026-01-14 Aadeshveer...  ggml : optimize cuda ssm_scan using warp-level reductio...
2026-01-14 Jeff Bolz  vulkan: support buffer_from_host_ptr (llama/18467)
2026-01-14 Aman Gupta  ggml-cuda: refactor cuda graph usage (llama/18637)
2026-01-14 Beinsezii  mmq.cu: tune mmq/rocblas switching for RDNA (llama...
2026-01-14 Adrien Gallouët  ggml : fix avx512bf16 build (llama/18623)
2026-01-14 Raul Torres  CANN: Make `valid_values` variable `static const` ...
2026-01-14 nwyin  ggml webgpu: add CEIL operation support (llama/18605)
2026-01-14 Johannes Gäßler  CUDA: fix FA FP16 accumulator overflow for Granite...
2026-01-14 Aman Gupta  ggml-cuda: check for srcs outside the cgraph (llama...
2026-01-14 Jeff Bolz  vulkan: fix topk_moe_sigmoid_norm_bias failures in...
2026-01-14 Jeff Bolz  vulkan: handle quantize_q8_1 overflowing the max workgr...
2026-01-14 Chenguang Li  CANN: add operator fusion support for ADD + RMS_NORM...
2026-01-14 Daniel Bevenius  sampling : add support for backend sampling (llama...
2026-01-14 Aman Gupta  CUDA: disable cuda graph when using n-cpu-moe (llama...
2026-01-14 Aman Gupta  ggml-cuda: remove unused params in ggml_cuda_graph...
2026-01-14 Aman Gupta  ggml-cuda: fixes for concurrent streams (llama/18496)
2026-01-14 Johannes Gäßler  CUDA: only allocate FA tmp buffer if needed (llama...
2026-01-14 pl752  (Bugfix, ggml-cuda) Pool alloc count fix + small size...
2026-01-14 Shouyu  ggml-hexagon: optimize activation function (llama/18393)
2026-01-14 Jeff Bolz  vulkan: Optimize GGML_OP_CUMSUM (llama/18417)
2026-01-14 Jeff Bolz  vulkan: Implement mmvq for iq1_s/iq1_m (llama/18450)
2026-01-14 Georgi Gerganov  metal : adjust extra size for FA buffer to avoid reallo...
2026-01-14 Chris Rohlf  rpc : use unordered_map::reserve and emplace (llama...
2026-01-14 MeeMin  cuda : fix copy of large tensors (ggml_nbytes <= INT_MA...
2026-01-14 Aman Gupta  ggml-cuda: remove unneccesary prints on ggml_cuda_init...
2026-01-14 Jeff Bolz  vulkan: extend topk_moe to handle sigmoid w/exp_probs_b...
2026-01-13 Peter A.  examples : fix executable example targets (#3600)