git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog


2026-01-30	Daniel Bevenius	memory : clarify comments for r_l and s_l tensors ...
2026-01-30	Georgi Gerganov	tests : add GQA=20 FA test (#19095)
2026-01-30	Daniel Bevenius	convert : add missing return statement for GraniteMoeMo...
2026-01-30	Daniel Bevenius	memory : remove unused tmp_buf (#19199)
2026-01-30	Antonis Makropoulos	docs: Add LlamaLib to UI projects (#19181)
2026-01-30	bssrdf	add tensor type checking as part of cuda graph properti...
2026-01-30	s8322	sycl: implement GGML_UNARY_OP_SOFTPLUS (#19114)
2026-01-30	RachelMantel	sycl: implement GGML_OP_TRI (#19089)
2026-01-30	DDXDB	Fix typos in SYCL documentation (#19162)
2026-01-29	Zheyuan Chen	ggml-webgpu: improve flastAttention performance by...
2026-01-29	Todor Boinovski	hexagon: enable offloading to Hexagon on Windows on...
2026-01-29	Georgi Gerganov	cuda : fix nkvo, offload and cuda graph node properties...
2026-01-29	Aldehir Rojas	chat : add parsing for solar-open-100b (#18540)
2026-01-29	Andrew Marshall	webui: Update Svelte to fix effect_update_depth_exceede...
2026-01-29	Sigbjørn Skjæret	jinja : do not pass empty tools and add some none filte...
2026-01-29	yulo	HIP: add mmf for CDNA (#18896)
2026-01-29	Georgi Gerganov	arg : add -kvu to llama-batched-bench (#19172)
2026-01-29	Vishal Singh	ggml-zendnn : resolve ZenDNN backend cross-module symbo...
2026-01-29	Aman Gupta	CUDA: refactor topk-moe to enable more models (GLM...
2026-01-29	Neo Zhang	sycl: fix norm kernels: l2_norm, group_norm, rms_norm...
2026-01-28	Sigbjørn Skjæret	ci : find latest release with asset for winget (#19161)
2026-01-28	Ruben Ortlam	Vulkan Flash Attention Coopmat1 Refactor (#19075)
2026-01-28	Sascha Rogmann	spec : add self‑speculative decoding (no draft model...
2026-01-28	Daniel Bevenius	convert : yield Mamba2Model/GraniteMoeModel modify_tens...
2026-01-28	Patryk Kaminski	ggml-sycl: remove unused syclcompat header (#19140)
2026-01-28	Sigbjørn Skjæret	jinja : undefined should be treated as sequence/iterabl...
2026-01-28	Oleksandr Kuvshynov	vulkan: handle device dedup on MacOS + Vega II Duo...
2026-01-28	Ben Chen	doc: add build instruction to use Vulkan backend on...
2026-01-28	Kevin Pouget	ggml: new backend for Virglrenderer API Remoting accele...
2026-01-28	Alberto Cabrera...	ggml-cpu: arm64: Q4_K scale unroll and vectorization...
2026-01-28	Georgi Gerganov	cuda : fix "V is K view" check for non-unified KV cache...
2026-01-28	Georgi Gerganov	CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-28	Georgi Gerganov	server : adjust spec tests to generate up to 16 tokens...
2026-01-28	Georgi Gerganov	llama : disable Direct IO by default (#19109)
2026-01-28	Daniel Bevenius	sampling : remove sampling branching in output_reserve...
2026-01-28	Nikhil Jain	ggml webgpu: Split shared state (webgpu_context) into...
2026-01-27	Vishal Singh	ggml-zendnn : update ZenDNN git tag to main branch...
2026-01-27	Sigbjørn Skjæret	jinja : implement mixed type object keys (#18955)
2026-01-27	David Lima	docs: Remove duplicated word on CUDA build section...
2026-01-27	Johannes Gäßler	CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-27	Sigbjørn Skjæret	ci : revert slim runner for winget (#19129)
2026-01-27	Alberto Cabrera...	ggml-cpu: aarm64: q6_K repack gemm and gemv (and generi...
2026-01-27	Gaurav Garg	[CUDA] Reduce CPU-side stalls due to the CUDA command...
2026-01-27	Daniel Bevenius	common : clarify HTTPS build options in error message...
2026-01-27	shalinib-ibm	ggml-cpu: Enable FP16 MMA kernels on PPC (#19060)
2026-01-27	lhez	opencl: add flattened q6_K mv (#19054)
2026-01-26	Johannes Gäßler	CUDA: fix padding of GQA to power of 2 in FA (#19115)
2026-01-26	Georgi Gerganov	graph : fix nkvo offload with FA (#19105)
2026-01-26	Sigbjørn Skjæret	ci : use new 1vCPU runner for lightweight jobs (#19107)
2026-01-26	Georgi Gerganov	model : add correct type for GLM 4.7 Flash (#19106)
2026-01-25	Johannes Gäßler	CUDA: faster FA for GQA > 1 but not power of 2 (#19092)
2026-01-25	ccbinn	metal : fix recommendedMaxWorkingSetSize availability...
2026-01-25	Sigbjørn Skjæret	convert : yield Gemma3N custom_map tensors directly...
2026-01-25	Aman Gupta	ggml-cpu: Use tiled FA for prompt-processing (#19012)
2026-01-25	Georgi Gerganov	kv-cache : support V-less cache (#19067)
2026-01-25	Sigbjørn Skjæret	convert : fix Gemma3N, GraniteMoe and Ernie4.5Moe ...
2026-01-25	Georgi Gerganov	completion : fix prompt cache for recurrent models...
2026-01-25	Molly Sophia	readme: update RWKV7 model links (#19061)
2026-01-25	Jakkala Mahesh	llama: fix integer type consistency in split helpers...
2026-01-25	Daniel Bevenius	common : use two decimal places for float arg help...
2026-01-25	Bartowski	convert : fix conversion for inheriting models that...
2026-01-24	Johannes Gäßler	llama-fit-params: keep explicit --ctx-size 0 (#19070)
2026-01-24	Johannes Gäßler	GGUF: check that tensor size is representable (#19072)
2026-01-24	Xuan-Son Nguyen	chat: fix language input for translategemma (#19052)
2026-01-24	Johannes Gäßler	CUDA: re-use MLA K data for V in MMA FA (#19057)
2026-01-24	Aman Gupta	ggml-cuda: enable cuda-graphs for `n-cpu-moe` (#18934)
2026-01-24	nullname	ggml-hexagon: flash-attn opt (#19025)
2026-01-23	Georgi Gerganov	graph : utilize `ggml_build_forward_select()` to avoid...
2026-01-23	Neo Zhang	[SYCL] use malloc to support both iGPU and dGPU in...
2026-01-23	Xuan-Son Nguyen	chat : fix translategemma crash on common_chat_format_e...
2026-01-23	Daniel Bevenius	model-conversion : use BUILD_DIR variable in all script...
2026-01-23	Alberto Cabrera...	ggml-cpu: aarm64: q5_K repack gemm and gemv (and generi...
2026-01-23	Aldehir Rojas	cli : load parser definition (#19031)
2026-01-22	Xuan-Son Nguyen	server : support preserving reasoning_content in assist...
2026-01-22	Georgi Gerganov	mla : make the V tensor a view of K (#18986)
2026-01-22	Johannes Gäßler	CUDA: fix alignment check for FA (#19023)
2026-01-22	Aman Gupta	convert_hf_to_gguf.py: refactor modify_tensors to call...
2026-01-22	lhez	opencl: enable the general fp mm for non-cont input...
2026-01-22	Xuan-Son Nguyen	server: do not log certain endpoints (avoid log spam...
2026-01-22	Georgi Gerganov	quant : manual overrides of tensor types take precedenc...
2026-01-22	Aaron Teo	release: update github api (#19022)
2026-01-22	Xuan-Son Nguyen	mtmd : update docs to use llama_model_n_embd_inp (...
2026-01-22	손희준	server: Reorder methods in `server-task.cpp` (#19016)
2026-01-22	Aman Gupta	CUDA: add gqa_ratio 4 for GLM 4.7 flash (#18953)
2026-01-22	shaofeiqi	opencl: add TRI op support (#18979)
2026-01-22	Aleksei Nikiforov	ggml-zdnn : mark zDNN buffers as non-host (#18967)
2026-01-21	Pádraic Slattery	ci : update GitHub Actions versions [no ci] (#18935)
2026-01-21	Mariusz Woloszyn	convert : add Devstral-2 (Ministral3ForCausalLM) arch...
2026-01-21	Piotr Wilkin...	jinja: support none|string (#18995)
2026-01-21	Hendrik Erz	fix: Use `tabular-nums` for chat message statistics...
2026-01-21	Daniel Bevenius	llama : clarify nemotron-h.cpp comment about RoPE ...
2026-01-21	Jeff Bolz	vulkan: Remove transfer_ctx, do everything in compute_c...
2026-01-21	Adrien Gallouët	common : improve error message when HTTPS is missing...
2026-01-21	손희준	server: /v1/responses (partial) (#18486)
2026-01-21	Jeff Bolz	vulkan: support flash attention GQA/split_k with small...
2026-01-21	Masato Nakasaka	Revert "vulkan: force full subgroups for flash attentio...
2026-01-21	Jeff Bolz	vulkan: Use mul_mat_vec_id for small values of n (...
2026-01-21	Tarek Dakhran	memory : add llama_memory_hybrid_iswa (#18601)
2026-01-21	Piotr Wilkin...	Fix GLM 4.7 Lite MoE gating func (#18980)
2026-01-21	Matthieu Coudron	gguf: display strerrno when cant load a model (#18884)
Packaging of ggml-org/llama.cpp
