git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
13 days ago Georgi Gerganov sync : llama.cpp
13 days ago Ruben Ortlam vulkan: use vec dot for matrix matrix multiplications...
13 days ago Xuan-Son Nguyen ggml : refactor forward_dup for cpu backend (llama...
13 days ago Adrien Gallouët ggml-amx : fix ggml_amx_init() on generic Linux (llama...
13 days ago Adrien Gallouët cmake : fix static linking for OpenMP on Unix-like...
13 days ago Shawn Gu opencl: optimize mxfp4 kernels (llama/16037)
13 days ago Jeff Bolz rename optimize_graph to graph_optimize (llama/16082)
13 days ago Bowen Han CUDA: Optimize PAD_REFLECT_1D (llama/15957)
13 days ago Johannes Gäßler CUDA: fix compilation on CC 6.0 (llama/16091)
13 days ago Georgi Gerganov metal : use function constants for mul_mv_ext kernels...
13 days ago Sigbjørn Skjæret cuda : add missing F32<->I32 entries in ggml_cuda_cpy_f...
13 days ago Georgi Gerganov metal : improve F32, F16 and BF16 mat-vec multiplicatio...
13 days ago Jhen-Jie Hong metal : avoid call free for non-owned buffer (llama...
13 days ago Georgi Gerganov metal : handle nil cv during pipeline creation (llama...
13 days ago Chenguang Li CANN: Remove print (llama/16044)
13 days ago Reese Levine GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS...
13 days ago Georgi Gerganov metal : refactor + optimize v2 (llama/15995)
13 days ago Georgi Gerganov sync : llama.cpp
13 days ago Johannes Gäßler CUDA: fix FA occupancy, optimize tile kernel (llama...
13 days ago Eve vulkan: automatically remove unsupported devices (llama...
13 days ago Chenguang Li CANN: Optimize ggml_cann_set_device (llama/15935)
13 days ago Daniel Bevenius ggml : fix padding in timestep embedding kernels (llama...
13 days ago Jake Karnes CUDA: fix im2col_3d to respect non-contiguous inputs...
13 days ago yael-works SYCL: Add COUNT_EQUAL operator support (llama/15991)
13 days ago Aman Gupta CUDA: some micro-optimizations in mmf.cuh for mul_mat_i...
13 days ago Georgi Gerganov metal : remove memory pools (llama/15966)
13 days ago Ruben Ortlam Vulkan: Clean up mul_mm shader (llama/15987)
13 days ago Georgi Gerganov metal : fix kernel requirements (llama/15983)
13 days ago Aaron Teo ggml-zdnn: rm user mapped buffers (llama/15965)
13 days ago Jeff Bolz vulkan: fix failing dequant shaders (llama/15862)
13 days ago Jeff Bolz vulkan: initialize vulkan-hpp to allow using extension...
13 days ago Georgi Gerganov metal : refactor kernel loading (llama/15964)
13 days ago Georgi Gerganov metal : allow ops to run concurrently (llama/15929)
13 days ago Georgi Gerganov metal : fix memory leaks (llama/15962)
13 days ago Aaron Teo ggml-zdnn: fix #15414, activate FP16 and BF16 accelerat...
13 days ago Ruben Ortlam Vulkan iGPU device selection overhaul and PCI ID API...
13 days ago Mathieu Baudier vulkan: Make device memory check more portable (llama...
13 days ago Neo Zhang Jianyu Revert "sycl: add usage of enqueue_functions extension...
13 days ago Diego Devesa ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device...
13 days ago Johannes Gäßler CUDA: larger SRAM reads for tile FA, AMD FP16 dot ...
13 days ago Daniel Bevenius ggml-cpu : add check for ARM MATMUL_INT8/i8mm support...
13 days ago Charles Xu kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed...
13 days ago hipudding CANN: Disable acl_graph for prefill stage (llama/15933)
13 days ago Oliver Simons CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3%...
13 days ago Daniel Bevenius ggml-cpu : fix padding in ggml_timestep_embedding ...
13 days ago Georgi Gerganov sync : llama.cpp
13 days ago Georgi Gerganov metal : make the backend async (llama/15906)
13 days ago Georgi Gerganov sync : llama.cpp
13 days ago Daniel Bevenius tests : filter out no-ops from coverage report (llama...
13 days ago Chenguang Li CANN: Add ROPE sin/cos cache for reuse (llama/15912)
13 days ago Chenguang Li CANN: implement LRU cache for ACL graphs (llama/15814)
13 days ago Ruben Ortlam vulkan: throw the oom error instead of no memory type...
13 days ago Jeff Bolz vulkan: Fix OOB accesses in soft_max_back (llama/15861)
13 days ago Johannes Gäßler HIP: use v_dot2_f32_f16 instruction for FA (llama/15884)
13 days ago lksj92hs Workaround for subgroup arithmetic failing on MoltenVK...
13 days ago Aman Gupta CUDA: Add mul_mat_id support for the mmf kernel (llama...
13 days ago Johannes Gäßler CUDA: fix GET_ROWS for large tensors (llama/15882)
13 days ago Jeff Bolz vulkan: sort graph to allow more parallel execution...
13 days ago Aman Gupta CUDA: generate_cu_files.py - add missing mxfp4 (llama...
13 days ago Georgi Gerganov cuda : fix supports_op condition for get_rows when...
13 days ago Georgi Gerganov sync : llama.cpp
13 days ago Georgi Gerganov metal : refactor + optimize (llama/15857)
13 days ago Georgi Gerganov sync : llama.cpp
13 days ago Xuan-Son Nguyen ggml: allow casting between f32 and i32 (llama/15783)
13 days ago Sigbjørn Skjæret CUDA: non-contiguous src0 not supported for PAD (llama...
13 days ago Jeff Bolz tests: large sizes for get_rows (llama/15687)
13 days ago Chenguang Li CANN: Stream sync between devices for acl_graph (llama...
13 days ago Jeff Bolz vulkan: support im2col_3d (llama/15795)
13 days ago Aaron Teo ggml-cpu: clean up s390x SIMD (llama/15855)
13 days ago Jeff Bolz vulkan: Support pad_ext (llama/15794)
13 days ago Jeff Bolz vulkan: Use larger loads in scalar/coopmat1 matmul...
13 days ago Daniel Bevenius ggml WebGPU: remove userdata from request adapter callb...
13 days ago Johannes Gäßler CUDA: faster tile FA (Pascal/AMD), headsize 256 (llama...
13 days ago Charles Xu kleidiai: generalize compute_forward_kv_cache to comput...
13 days ago Johannes Gäßler ggml-cpu: document use of "free" memory [no ci] (llama...
13 days ago Aaron Teo ggml-cpu: drop support for nnpa intrinsics (llama/15821)
13 days ago Johannes Gäßler CUDA: fastdiv, launch bounds for mmvq + q8_1 quant...
13 days ago Daniel Bevenius tests : add --list-ops and --show-coverage options...
2025-09-16 StyMaar Update gguf specification to synchronize the `ggml_typ...
2025-09-16 Daniel Bevenius ggml : introduce semantic versioning (#1336)
2025-09-10 Gregor Jasny CUDA : conditionally add cuda architectures (#1341)
2025-09-09 distlibs gitignore : ignore idea files (#1339)
2025-09-05 Georgi Gerganov sync : llama.cpp
2025-09-05 Gabe Goodhart metal : Add template specialization for mul_mm_id w...
2025-09-05 Chenguang Li CANN: Refactor ND to NZ workspace to be per-device...
2025-09-05 leejet ggml: add ops for WAN video model (cuda && cpu) (llama...
2025-09-05 hipudding CANN: Fix precision issue on 310I DUO multi-devices...
2025-09-05 rmatif opencl: add hs=40 to FA (llama/15758)
2025-09-05 Chenguang Li CANN: fix acl_rstd allocation size in ggml_cann_rms_nor...
2025-09-05 Ruben Ortlam vulkan: fix mmv subgroup16 selection (llama/15775)
2025-09-05 Jeff Bolz vulkan: don't use std::string in load_shaders, to impro...
2025-09-05 Daniel Bevenius vulkan : update ggml_vk_instance_validation_ext_availab...
2025-09-05 Shin-myoung... ggml vulkan: add hardsigmoid and hardswish operations...
2025-09-05 Oliver Simons CUDA: Optimize `rms_norm_f32` kernel and its fused...
2025-09-05 hipudding CANN: Add RoPE contiguous check for 310I DUP device...
2025-09-05 xctan ggml-cpu : optimize RVV kernels (llama/15720)
2025-09-05 hipudding CANN: Mask unsupported TRANSPOSE_1D operator (llama...
2025-09-05 Chenguang Li CANN: Fix type float_t to float (llama/15736)
2025-09-05 Ruben Ortlam vulkan: fix shaders gen when no integer dot is availabl...
2025-09-05 hipudding CANN: Resolve soft_max precision issue (llama/15730)