git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2026-03-15  shaofeiqi  opencl: add cumsum op (llama/18981)
2026-03-15  uvos  hip: compile debug builds with -O2 on hip to avoid...
2026-03-15  Masashi Yoshimura  ggml-webgpu: Add support for `GGML_OP_REPEAT` (llama...
2026-03-15  Georgi Gerganov  llama : enable chunked fused GDN path (llama/20340)
2026-03-15  Richard Davison  ggml : add NVFP4 quantization type support (llama/19769)
2026-03-15  Daniel Bevenius  llama : add support for Nemotron 3 Super (llama/20411)
2026-03-15  Georgi Gerganov  metal : fix capture_compute counter logic (llama/20410)
2026-03-15  Georgi Gerganov  metal : fix q5_k mul_mv register spill (llama/20399)
2026-03-15  Georgi Gerganov  metal : add env var to trigger graph capture (llama...
2026-03-15  uvos  ggml-cuda: gdn use shared mem for HIP (llama/20366)
2026-03-15  uvos  cuda/hip: fix loop unrolling in ssm-conv (llama/20369)
2026-03-15  Neo Zhang  fix op rope, add rope_back (llama/20293)
2026-03-15  Neo Zhang  fix for failed UT case: ACC, L2_NORM, UPSCALE, fused_gl...
2026-03-15  Georgi Gerganov  ggml : bump RPC version (llama/20330)
2026-03-15  Reese Levine  ggml webgpu: faster normal quant and some k-quant matri...
2026-03-15  Charles Xu  kleidiai : support for concurrent sme and neon kernel...
2026-03-15  Taimur Ahmad  ggml-cpu: add RVV repack GEMM and GEMV for quantization...
2026-03-15  Julian Pscheid  metal: handle command buffer failures gracefully in...
2026-03-15  Paul Flynn  metal : extend mul_mv_ext to BF16, Q2_K, Q3_K (llama...
2026-03-15  Georgi Gerganov  metal : add upscale (llama/20284)
2026-03-15  Aman Gupta  ggml-cuda: disable gdn for musa (llama/20278)
2026-03-15  Bertay Eren  ggml-vulkan: add SGN operator, auto-generate Vulkan...
2026-03-15  Ruben Ortlam  vulkan: skip zero size tensors in backend copies (llama...
2026-03-15  Michael Huang  cuda : display total and free VRAM capacity during...
2026-03-15  GiantPrince  ggml-vulkan: Add ELU op support (llama/20183)
2026-03-15  Jeff Bolz  vulkan: Fix data races in coopmat1 mul_mat(_id) (llama...
2026-03-15  Neo Zhang  support Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama...
2026-03-15  Aman Gupta  ggml: add GATED_DELTA_NET op (llama/19504)
2026-03-15  lhez  opencl: add l2_norm (llama/20160)
2026-03-15  Bartowski  quants : Add memsets and other fixes for IQ quants...
2026-03-15  Piotr Wilkin...  Autoparser - complete refactoring of parser architectur...
2026-03-15  Todor Boinovski  hexagon: add f32 ssm_conv op (llama/20122)
2026-03-15  Max Krasnyansky  cpu: skip redundant ROPE cache updates (llama/20149)
2026-03-15  Aman Gupta  ggml-cuda: add mem check for fusion (llama/19916)
2026-03-15  Aaron Teo  ggml: update comments for backends which have no memory...
2026-03-15  shalinib-ibm  ggml-cpu: Fix gcc 15 ICE on ppc64le (#20083) (llama...
2026-03-15  Aman Gupta  CUDA: use shared mem for ssm_conv (llama/20128)
2026-03-15  Johannes Gäßler  ggml-cpu: fix data race for debug asserts (llama/20148)
2026-03-15  lhez  opencl: add neg, exp and diag (llama/20127)
2026-03-15  YardenTal44  hexagon: add fp16 support for binary ops: add,sub,mul...
2026-03-15  Andreas Kieslinger  CUDA: Improve performance via fewer synchronizations...
2026-03-15  Marcel Petrick  chore : correct typos [no ci] (llama/20041)
2026-03-15  Max Krasnyansky  hexagon: Flash Attention optimizations (dma, mpyacc...
2026-03-15  lhez  opencl: add `SET`, support i32 for `CPY`, minor refacto...
2026-03-15  Nikhil Jain  Fix wait logic for inflight jobs (llama/20096)
2026-03-15  Masashi Yoshimura  Add concat op to webgpu. (llama/20068)
2026-03-15  Johannes Gäßler  ggml: fix ggml_is_contiguous_n for ne == 1 (llama/20092)
2026-03-15  Adrien Gallouët  ggml : use a simple std::thread in AMX without OpenMP...
2026-03-15  Charles Xu  kleidiai : add sme fp16 compute path for q4_0 gemm...
2026-03-15  shaofeiqi  opencl: add optimized q4_1 mm kernel for adreno (llama...
2026-03-15  Abhijit Ramesh  ggml webgpu: fix workgroup dispatch limit for large...
2026-03-15  Nikhil Jain  ggml webgpu: Clean up per-thread parameter buffer pool...
2026-03-15  Masashi Yoshimura  ggml-webgpu: Support non-contiguous `src0` and overlapp...
2026-03-15  Ruben Ortlam  vulkan: tune MMVQ for Intel Windows (llama/19988)
2026-03-15  Aaron Teo  ggml-cpu: optimise s390x multiply extend instructions...
2026-03-15  Ruben Ortlam  vulkan: improve partial offloading performance on AMD...
2026-03-15  oobabooga  cuda: cap grid.y at 65535 in non-contiguous dequantize...
2026-03-15  Jayant Lohia  CUDA: add CDNA3 MFMA support for flash attention MMA...
2026-03-15  Aman Gupta  ggml-cpu: add repack for mxfp4 (llama/19738)
2026-03-15  David366AI  examples/yolo: fix load_model memory leak (#1432)
2026-02-27  Georgi Gerganov  gguf : sync (llama/0)
2026-02-27  Georgi Gerganov  scripts : sync gguf code
2026-02-27  Georgi Gerganov  sync : llama.cpp
2026-02-27  Neo Zhang  replace the magic number 768 by max work group size...
2026-02-27  Vishal Singh  ggml-zendnn: update code for latest ZenDNN API (llama...
2026-02-27  Adrien Gallouët  ggml : fix AMX and add batched support (llama/19925)
2026-02-27  Ruben Ortlam  vulkan: fix fp16 Flash Attention on Windows AMD RDNA2...
2026-02-27  Kevin Pouget  ggml-virtgpu: improve the reliability of the code ...
2026-02-27  Neo Zhang  support permuted, remove check s0/s10 (llama/19889)
2026-02-27  Jeff Bolz  vulkan: check for memory overlap before doing fusion...
2026-02-25  Georgi Gerganov  sync : llama.cpp
2026-02-25  Georgi Gerganov  ggml/gguf : prevent integer overflows (llama/19856)
2026-02-25  Ruben Ortlam  Vulkan Scalar Flash Attention Refactor (llama/19625)
2026-02-25  Jeff Bolz  vulkan: fix coopmat1 without bf16 support (llama/19793)
2026-02-25  Jeff Bolz  vulkan: fix data race in mul_mat_id shader (llama/19790)
2026-02-25  Max Krasnyansky  hexagon refactor all Ops to use local context struct...
2026-02-25  Alberto Cabrera...  ggml-cpu: arm64: q5_K repack gemm and gemv (and generic...
2026-02-25  Gaurav Garg  Improve CUDA graph capture (llama/19754)
2026-02-25  Taimur Ahmad  ggml-cpu: add RVV vec dot kernels for quantization...
2026-02-25  Jeff Bolz  test: mul_mat tests with huge batch size (llama/19519)
2026-02-25  Masashi Yoshimura  ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support...
2026-02-25  Ruben Ortlam  vulkan: fix MMQ shader push constants and multi-dispatc...
2026-02-25  Johannes Gäßler  CUDA: fix kernel selection logic for tile FA (llama...
2026-02-25  shalinib-ibm  llamafile: powerpc: add FP16 MMA path for Q4/Q8 matmul...
2026-02-25  Reese Levine  ggml webgpu: Fix bug in dispatching large matrix-vector...
2026-02-25  Georgi Gerganov  sync : llama.cpp
2026-02-25  Reese Levine  ggml webgpu: shader library organization (llama/19530)
2026-02-25  Georgi Gerganov  sync : llama.cpp
2026-02-25  Jeff Bolz  vulkan: split mul_mat into multiple dispatches to avoid...
2026-02-25  shaofeiqi  opencl: refactor expm1 and softplus (llama/19404)
2026-02-25  shaofeiqi  opencl: optimize mean and sum_row kernels (llama/19614)
2026-02-25  Talha Can Havadar  ggml: ggml-cpu: force-no-lto-for-cpu-feats (llama/19609)
2026-02-25  Georgi Gerganov  cuda : enable CUDA graphs for MMID 1 <= BS <= 4 (llama...
2026-02-25  Judd  ggml : make `ggml_is_view` as API (llama/19539)
2026-02-25  Mario Limonciello  Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer...
2026-02-25  abhijain1204fujitsu  ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k...
2026-02-25  David Friehs  cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization ...
2026-02-25  Daniel Bevenius  cmake : check if KleidiAI API has been fetched (llama...
2026-02-25  Georgi Gerganov  ggml : avoid UB in gemm ukernel (llama/19642)
2026-02-25  Aaron Teo  ggml-cpu: optimize ggml_vec_dot_bf16 for s390x (llama...