git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog

2025-11-01	Jeff Bolz	vulkan: Handle argsort with a large number of rows...
2025-11-01	Oliver Simons	Hide latency of bias and gate-loading (llama/16847)
2025-11-01	Jeff Bolz	vulkan: Fuse rope+set_rows (llama/16769)
2025-11-01	Jeff Bolz	vulkan: Update topk_moe fusion to handle gpt's late...
2025-11-01	Ruben Ortlam	Vulkan MMQ Integer Dot Refactor and K-Quant support...
2025-11-01	Max Krasnyansky	Hexagon Op queue & dispatch optimizations (llama/16820)
2025-11-01	Aman Gupta	CUDA: use fastdiv in set-rows (llama/16834)
2025-11-01	Jeff Bolz	vulkan: Call ggml_vk_buffer_write_2d from ggml_vk_buffe...
2025-11-01	Aman Gupta	CUDA: Fix bug in topk-moe for gpt-oss (llama/16821)
2025-11-01	YaelLogic	sycl: add RMS_NORM_BACK operation support (llama/16808)
2025-11-01	YaelGitAccount	cuda: add SET operation support (llama/16804)
2025-11-01	l3utterfly	initialise buffer.device in ggml_hexagon_session (llama...
2025-11-01	Chenguang Li	CANN: Improve device ID handling and aclnnArange checks...
2025-11-01	Aman Gupta	CUDA: add unused vars to mmvf and mmvq (llama/16807)
2025-11-01	tamarPal	sycl: add SSM_CONV operation support (llama/16800)
2025-11-01	Acly	ggml : fix interpolate with align-corners and ne=1...
2025-11-01	Johannes Gäßler	HIP: fix AMDGPU_TARGETS, update documentation (llama...
2025-11-01	Aman Gupta	test-backend-ops: print failed tests at the end (llama...
2025-11-01	tamarPal	sycl: add ROLL operation support (llama/16665)
2025-11-01	shani-f	sycl: add REPEAT_BACK operation support (llama/16734)
2025-11-01	Aman Gupta	CUDA: support for weight clamp in top-k norm (llama...
2025-11-01	Acly	ggml-alloc : make gallocr prefer chunks that allow...
2025-11-01	Sigbjørn Skjæret	cuda : use fast copy when src and dst are of different...
2025-11-01	leejet	ggml: fix cuda kernel launch configuration for k_comput...
2025-11-01	Aman Gupta	CUDA: General GEMV fusion (llama/16715)
2025-11-01	Gilad S.	vulkan: deduplicate Microsoft Direct3D12 devices (llama...
2025-11-01	Giuseppe Scrivano	vulkan: delete dead code (llama/16732)
2025-11-01	Jeff Bolz	vulkan: Optimize SSM_SCAN (llama/16645)
2025-11-01	leejet	ggml: fix CUDA grid launch condition for large block_nu...
2025-11-01	Aman Gupta	CUDA: use CUB for arbitary size argsort (llama/16754)
2025-11-01	Aman Gupta	ggml-cuda: use passed ops instead of hardcoded ops...
2025-11-01	Matthew Michel	sycl: use async memory allocation to fix crashes during...
2025-11-01	Max Krasnyansky	Add experimental ggml-hexagon backend for the Hexagon...
2025-11-01	Diego Devesa	Revert "ggml : Leverage the existing GGML_F32_VEC helpe...
2025-11-01	sirus20x6	ggml : Leverage the existing GGML_F32_VEC helpers to...
2025-11-01	Aman Gupta	CUDA: fix bug in topk-moe softmax (llama/16711)
2025-11-01	Aman Gupta	CUDA: topk-moe: add optional parameter for gpt-oss...
2025-11-01	Johannes Gäßler	CUDA: better error for FA kernel with 0 occupancy ...
2025-10-29	Jeff Bolz	Rewrite simple-backend to use sched and ggml_backend_lo...
2025-10-22	Georgi Gerganov	sync : whisper.cpp
2025-10-21	Georgi Gerganov	sync : llama.cpp
2025-10-21	Aman Gupta	ggml: add ggml_can_fuse_subgraph (llama/16662)
2025-10-21	lhez	opencl: fix warnings and clean up profiling (llama...
2025-10-21	Jeff Bolz	vulkan: Handle FA with all -inf mask values (llama...
2025-10-21	YehuditE	sycl : add PAD_REFLECT_D1 operator support (llama/16145)
2025-10-21	Diego Devesa	ggml-alloc : fix leak when reusing a tensor with a...
2025-10-21	safranowith	SYCL: Add support for FLOOR,CEIL,ROUND and TRUNC unary...
2025-10-21	Aaron Teo	ci : fix binaries release failure for s390x (binaries...
2025-10-21	Johannes Gäßler	HIP: fix GPU_TARGETS (llama/16642)
2025-10-21	Jeff Bolz	vulkan: Implement topk_moe fused shader, ported from...
2025-10-21	Aman Gupta	CUDA: use registers instead of smem in topk-moe (llama...
2025-10-21	Shawn Gu	opencl: transposed gemm/gemv moe kernel with mxfp4...
2025-10-21	Radoslav Gerganov	rpc : report actual free memory (llama/16616)
2025-10-21	Giuseppe Scrivano	vulkan: Add State Space Model (SSM) Operations Support...
2025-10-21	muggle-stack	ggml : fix SpaceMit IME array out-of-bounds in task...
2025-10-21	Jeff Bolz	vulkan: fix debug build (add_rms_len/data not found...
2025-10-21	Ilia Ilmer	metal : add `CONV_TRANSPOSE_2D` (llama/16542)
2025-10-21	GittyBurstein	SYCL SET operator optimized for F32 tensors (llama...
2025-10-21	GittyBurstein	sycl : add ARANGE operator (llama/16362)
2025-10-21	Chenguang Li	CANN: format code using .clang-format (llama/15863)
2025-10-21	takuya kodama	ggml-cpu: replace putenv with setenv for const-correctn...
2025-10-21	yael-works	SYCL: Add GGML_OP_MEAN operator support (llama/16009)
2025-10-21	safranowith	cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators...
2025-10-21	lhez	opencl: add q8_0 mm support (llama/16469)
2025-10-21	lhez	opencl: fix FA for f32 (llama/16584)
2025-10-21	Sam/Samuel	metal: optimise `GGML_OP_SUM` (llama/16559)
2025-10-21	Julius Tischbein	CUDA: Changing the CUDA scheduling strategy to spin...
2025-10-21	Georgi Gerganov	metal : avoid using Metal's gpuAddress property (llama...
2025-10-14	Georgi Gerganov	sync : llama.cpp upstream/latest upstream/0.9.4.58
2025-10-14	SavicStefan	vulkan: Add ACC_TYPE_VEC2 implementation (llama/16203)
2025-10-14	Aman Gupta	CUDA + openCL: fix bug in accessing rms_norm->src while...
2025-10-14	Jeff Bolz	vulkan: Support FA with K/V in F32 (llama/16543)
2025-10-14	Jeff Bolz	vulkan: Improve build time for MSVC (llama/16545)
2025-10-14	Johannes Gäßler	CUDA: enable FA for FP32 KV cache (llama/16546)
2025-10-14	Aman Gupta	CUDA: use fastdiv + ggml_cuda_mad for mmvf (llama/16557)
2025-10-14	Aman Gupta	CUDA: add fp kernel for larger batch size MoE (llama...
2025-10-14	Anav Prasad	cuda : remove legacy copy-op pointer indirection code...
2025-10-14	Georgi Gerganov	metal : FA support F32 K and V and head size = 32 ...
2025-10-14	lhez	opencl: fix build targeting CL 2 (llama/16554)
2025-10-14	Johannes Gäßler	CUDA: fix numerical issues in tile FA kernel (llama...
2025-10-14	Jie Fu (傅杰)	ggml : fix build broken with -march=armv9-a on MacOS...
2025-10-14	Chenguang Li	CANN: fix CPU memory leak in CANN backend (llama/16549)
2025-10-14	Sam/Samuel	metal: add support for opt_step_sgd (llama/16539)
2025-10-14	Georgi Gerganov	ggml : fix scalar path for computing norm (llama/16558)
2025-10-14	hipudding	CANN: Update several operators to support FP16 data...
2025-10-14	Sam/Samuel	metal : add opt_step_adamw and op_sum (llama/16529)
2025-10-14	Neo Zhang Jianyu	fix UT fault cases: count-equal, argsort, pad OPs ...
2025-10-14	sirus20x6	ggml : Fix FP16 ELU positive branch (llama/16519)
2025-10-14	sirus20x6	ggml: Correct SVE implementation in ggml_vec_dot_f16_un...
2025-10-14	Johannes Gäßler	CUDA: faster tile FA, add oob checks, more HSs (llama...
2025-10-12	Georgi Gerganov	sync : llama.cpp
2025-10-12	Georgi Gerganov	metal : fix mul-mm condition + fix mul-mv permuted...
2025-10-12	Diego Devesa	cuda : avoid initializing unused devices (llama/16510)
2025-10-12	Prajwal B Mehendarkar	cmake : Dont define XOPENSOURCE on AIX (llama/16481)
2025-10-12	duduta	cpu : optimize the ggml NORM operation (llama/15953)
2025-10-12	Chenguang Li	CANN: Improve ACL graph matching (llama/16166)
2025-10-12	Charles Xu	kleidiai: kernel interface refactoring (llama/16460)
2025-10-12	Neo Zhang Jianyu	refactor soft_max, add soft_max_back (llama/16472)
2025-10-12	ai-fonsi	Disable CUDA host buffers on integrated GPUs (llama...
2025-10-12	Georgi Gerganov	metal : mark FA blocks (llama/16372)
Packaging of ggml-org/ggml
