13 days ago |
Adrien Gallouët | cmake : fix static linking for OpenMP on Unix-like... |
13 days ago |
Shawn Gu | opencl: optimize mxfp4 kernels (llama/16037) |
13 days ago |
Jeff Bolz | rename optimize_graph to graph_optimize (llama/16082) |
13 days ago |
Bowen Han | CUDA: Optimize PAD_REFLECT_1D (llama/15957) |
13 days ago |
Johannes Gäßler | CUDA: fix compilation on CC 6.0 (llama/16091) |
13 days ago |
Georgi Gerganov | metal : use function constants for mul_mv_ext kernels... |
13 days ago |
Sigbjørn Skjæret | cuda : add missing F32<->I32 entries in ggml_cuda_cpy_f... |
13 days ago |
Georgi Gerganov | metal : improve F32, F16 and BF16 mat-vec multiplicatio... |
13 days ago |
Jhen-Jie Hong | metal : avoid call free for non-owned buffer (llama... |
13 days ago |
Georgi Gerganov | metal : handle nil cv during pipeline creation (llama... |
13 days ago |
Chenguang Li | CANN: Remove print (llama/16044) |
13 days ago |
Reese Levine | GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS... |
13 days ago |
Georgi Gerganov | metal : refactor + optimize v2 (llama/15995) |
13 days ago |
Georgi Gerganov | sync : llama.cpp |
13 days ago |
Johannes Gäßler | CUDA: fix FA occupancy, optimize tile kernel (llama... |
13 days ago |
Eve | vulkan: automatically remove unsupported devices (llama... |
13 days ago |
Chenguang Li | CANN: Optimize ggml_cann_set_device (llama/15935) |
13 days ago |
Daniel Bevenius | ggml : fix padding in timestep embedding kernels (llama... |
13 days ago |
Jake Karnes | CUDA: fix im2col_3d to respect non-contiguous inputs... |
13 days ago |
yael-works | SYCL: Add COUNT_EQUAL operator support (llama/15991) |
13 days ago |
Aman Gupta | CUDA: some micro-optimizations in mmf.cuh for mul_mat_i... |
13 days ago |
Georgi Gerganov | metal : remove memory pools (llama/15966) |
13 days ago |
Ruben Ortlam | Vulkan: Clean up mul_mm shader (llama/15987) |
13 days ago |
Georgi Gerganov | metal : fix kernel requirements (llama/15983) |
13 days ago |
Aaron Teo | ggml-zdnn: rm user mapped buffers (llama/15965) |
13 days ago |
Jeff Bolz | vulkan: fix failing dequant shaders (llama/15862) |
13 days ago |
Jeff Bolz | vulkan: initialize vulkan-hpp to allow using extension... |
13 days ago |
Georgi Gerganov | metal : refactor kernel loading (llama/15964) |
13 days ago |
Georgi Gerganov | metal : allow ops to run concurrently (llama/15929) |
13 days ago |
Georgi Gerganov | metal : fix memory leaks (llama/15962) |
13 days ago |
Aaron Teo | ggml-zdnn: fix #15414, activate FP16 and BF16 accelerat... |
13 days ago |
Ruben Ortlam | Vulkan iGPU device selection overhaul and PCI ID API... |
13 days ago |
Mathieu Baudier | vulkan: Make device memory check more portable (llama... |
13 days ago |
Neo Zhang Jianyu | Revert "sycl: add usage of enqueue_functions extension... |
13 days ago |
Diego Devesa | ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device... |
13 days ago |
Johannes Gäßler | CUDA: larger SRAM reads for tile FA, AMD FP16 dot ... |
13 days ago |
Daniel Bevenius | ggml-cpu : add check for ARM MATMUL_INT8/i8mm support... |
13 days ago |
Charles Xu | kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed... |
13 days ago |
hipudding | CANN: Disable acl_graph for prefill stage (llama/15933) |
13 days ago |
Oliver Simons | CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3%... |
13 days ago |
Daniel Bevenius | ggml-cpu : fix padding in ggml_timestep_embedding ... |
13 days ago |
Georgi Gerganov | sync : llama.cpp |
13 days ago |
Georgi Gerganov | metal : make the backend async (llama/15906) |
13 days ago |
Georgi Gerganov | sync : llama.cpp |
13 days ago |
Daniel Bevenius | tests : filter out no-ops from coverage report (llama... |
13 days ago |
Chenguang Li | CANN: Add ROPE sin/cos cache for reuse (llama/15912) |
13 days ago |
Chenguang Li | CANN: implement LRU cache for ACL graphs (llama/15814) |
13 days ago |
Ruben Ortlam | vulkan: throw the oom error instead of no memory type... |
13 days ago |
Jeff Bolz | vulkan: Fix OOB accesses in soft_max_back (llama/15861) |
13 days ago |
Johannes Gäßler | HIP: use v_dot2_f32_f16 instruction for FA (llama/15884) |
13 days ago |
lksj92hs | Workaround for subgroup arithmetic failing on MoltenVK... |
13 days ago |
Aman Gupta | CUDA: Add mul_mat_id support for the mmf kernel (llama... |
13 days ago |
Johannes Gäßler | CUDA: fix GET_ROWS for large tensors (llama/15882) |
13 days ago |
Jeff Bolz | vulkan: sort graph to allow more parallel execution... |
13 days ago |
Aman Gupta | CUDA: generate_cu_files.py - add missing mxfp4 (llama... |
13 days ago |
Georgi Gerganov | cuda : fix supports_op condition for get_rows when... |
13 days ago |
Georgi Gerganov | sync : llama.cpp |
13 days ago |
Georgi Gerganov | metal : refactor + optimize (llama/15857) |
13 days ago |
Georgi Gerganov | sync : llama.cpp |
13 days ago |
Xuan-Son Nguyen | ggml: allow casting between f32 and i32 (llama/15783) |
13 days ago |
Sigbjørn Skjæret | CUDA: non-contiguous src0 not supported for PAD (llama... |
13 days ago |
Jeff Bolz | tests: large sizes for get_rows (llama/15687) |
13 days ago |
Chenguang Li | CANN: Stream sync between devices for acl_graph (llama... |
13 days ago |
Jeff Bolz | vulkan: support im2col_3d (llama/15795) |
13 days ago |
Aaron Teo | ggml-cpu: clean up s390x SIMD (llama/15855) |
13 days ago |
Jeff Bolz | vulkan: Support pad_ext (llama/15794) |
13 days ago |
Jeff Bolz | vulkan: Use larger loads in scalar/coopmat1 matmul... |
13 days ago |
Daniel Bevenius | ggml WebGPU: remove userdata from request adapter callb... |
13 days ago |
Johannes Gäßler | CUDA: faster tile FA (Pascal/AMD), headsize 256 (llama... |
13 days ago |
Charles Xu | kleidiai: generalize compute_forward_kv_cache to comput... |
13 days ago |
Johannes Gäßler | ggml-cpu: document use of "free" memory [no ci] (llama... |
13 days ago |
Aaron Teo | ggml-cpu: drop support for nnpa intrinsics (llama/15821) |
13 days ago |
Johannes Gäßler | CUDA: fastdiv, launch bounds for mmvq + q8_1 quant... |
13 days ago |
Daniel Bevenius | tests : add --list-ops and --show-coverage options... |
2025-09-16 |
StyMaar | Update gguf specification to synchronize the `ggml_typ... |
2025-09-16 |
Daniel Bevenius | ggml : introduce semantic versioning (#1336) |
2025-09-10 |
Gregor Jasny | CUDA : conditionally add cuda architectures (#1341) |
2025-09-09 |
distlibs | gitignore : ignore idea files (#1339) |
2025-09-05 |
Georgi Gerganov | sync : llama.cpp |
2025-09-05 |
Gabe Goodhart | metal : Add template specialization for mul_mm_id w... |
2025-09-05 |
Chenguang Li | CANN: Refactor ND to NZ workspace to be per-device... |
2025-09-05 |
leejet | ggml: add ops for WAN video model (cuda && cpu) (llama... |
2025-09-05 |
hipudding | CANN: Fix precision issue on 310I DUO multi-devices... |
2025-09-05 |
rmatif | opencl: add hs=40 to FA (llama/15758) |
2025-09-05 |
Chenguang Li | CANN: fix acl_rstd allocation size in ggml_cann_rms_nor... |
2025-09-05 |
Ruben Ortlam | vulkan: fix mmv subgroup16 selection (llama/15775) |
2025-09-05 |
Jeff Bolz | vulkan: don't use std::string in load_shaders, to impro... |
2025-09-05 |
Daniel Bevenius | vulkan : update ggml_vk_instance_validation_ext_availab... |
2025-09-05 |
Shin-myoung... | ggml vulkan: add hardsigmoid and hardswish operations... |
2025-09-05 |
Oliver Simons | CUDA: Optimize `rms_norm_f32` kernel and its fused... |
2025-09-05 |
hipudding | CANN: Add RoPE contiguous check for 310I DUP device... |
2025-09-05 |
xctan | ggml-cpu : optimize RVV kernels (llama/15720) |
2025-09-05 |
hipudding | CANN: Mask unsupported TRANSPOSE_1D operator (llama... |
2025-09-05 |
Chenguang Li | CANN: Fix type float_t to float (llama/15736) |
2025-09-05 |
Ruben Ortlam | vulkan: fix shaders gen when no integer dot is availabl... |
2025-09-05 |
hipudding | CANN: Resolve soft_max precision issue (llama/15730) |
2025-09-05 |
Jeff Bolz | vulkan: Fix macro parameter order for f32 matmul shader... |
2025-09-05 |
rmatif | opencl: add attn sinks support for FA kernels (llama... |
2025-09-05 |
Chenguang Li | CANN: Support eager execution mode under ACL graph... |
2025-09-05 |
hipudding | CANN: Support ext_factor in rope (llama/15710) |