git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2026-03-08 Jeff Bolz vulkan: Fix data races in coopmat1 mul_mat(_id) (#20084)
2026-03-08 Johannes Gäßler llama: end-to-end tests (#19802)
2026-03-08 Christopher... readme : update infra list (#20212)
2026-03-08 Piotr Wilkin... Revert to OAI-compatible args (#20213)
2026-03-08 decahedron1 server : correct index on finish in OAI completion...
2026-03-08 Neo Zhang [SYCL] supprt Flash Attention for fp32/fp16/Q4/Q5/Q8...
2026-03-07 Aman Gupta ggml: add GATED_DELTA_NET op (#19504)
2026-03-07 lhez opencl: add l2_norm (#20160)
2026-03-07 Piotr Wilkin... Autoparser: True streaming (#20177)
2026-03-06 Piotr Wilkin... Autoparser: add optional argument reshuffle capability...
2026-03-06 Bartowski quants : Add memsets and other fixes for IQ quants...
2026-03-06 Piotr Wilkin... Add @pwilkin to CODEOWNERS for autoparser code (#20174)
2026-03-06 Piotr Wilkin... Autoparser - complete refactoring of parser architectur...
2026-03-06 Todor Boinovski hexagon: add f32 ssm_conv op (#20122)
2026-03-06 Tom Vaucourt server : preserve anthropic thinking blocks in conversi...
2026-03-06 Max Krasnyansky cpu: skip redudant ROPE cache updates (#20149)
2026-03-06 Aman Gupta ggml-cuda: add mem check for fusion (#19916)
2026-03-06 Aaron Teo ggml: update comments for backends which have no memory...
2026-03-06 shalinib-ibm ggml-cpu: Fix gcc 15 ICE on ppc64le (#20083) (#20130)
2026-03-06 Aman Gupta CUDA: use shared mem for ssm_conv (#20128)
2026-03-06 Tim Neumann context: ignore zero scale LoRAs when checking sameness...
2026-03-06 Piotr Wilkin... Checkpoint every n tokens: squash (#20087)
2026-03-06 Aleksander... webui: Agentic Loop + MCP Client with support for Tools...
2026-03-06 Johannes Gäßler ggml-cpu: fix data race for debug asserts (#20148)
2026-03-06 Georgi Gerganov kv-cache : fix M-RoPE checkpoints (#20132)
2026-03-06 Roj234 cli : Don't clear system prompt when using '/clear...
2026-03-06 lhez opencl: add neg, exp and diag (#20127)
2026-03-06 YardenTal44 hexagon: add fp16 support for binary ops: add,sub,mul...
2026-03-05 ymcki models : kda chunk size = 16 (#19827)
2026-03-05 Andreas Kieslinger CUDA: Improve performance via less synchronizations...
2026-03-05 Eric Zhang model : update Qwen3.5 model type detection (#20126)
2026-03-05 Sigbjørn Skjæret cli : add command and file auto-completion (#19985)
2026-03-05 Sigbjørn Skjæret convert : register Qwen 3.5 ForCausalLM for text only...
2026-03-05 Aleksander... webui: Improvements for Models Selector UI (#20066)
2026-03-05 Marcel Petrick chore : correct typos [no ci] (#20041)
2026-03-05 Max Krasnyansky hexagon: Flash Attention optimizations (dma, mpyacc...
2026-03-05 lhez opencl: add `SET`, support i32 for `CPY`, minor refacto...
2026-03-04 Todor Boinovski hexagon: add llama-completion runner script (#20095)
2026-03-04 Nikhil Jain [WebGPU] Fix wait logic for inflight jobs (#20096)
2026-03-04 Masashi Yoshimura Add concat op to webgpu. (#20068)
2026-03-04 Sigbjørn Skjæret tools : add missing clocale include in mtmd-cli [no...
2026-03-04 Johannes Gäßler ggml: fix ggml_is_contiguous_n for ne == 1 (#20092)
2026-03-04 Adrien Gallouët ggml : use a simple std::thread in AMX without OpenMP...
2026-03-04 ddh0 impl : use 6 digits for tensor dims (#20094)
2026-03-04 SamareshSingh Fix locale-dependent float printing in GGUF metadata...
2026-03-04 standby24x7 completion : Fix a typo in warning message (#20082)
2026-03-03 Mickael Desgranges docs: Fix intel documentation link (#20040)
2026-03-03 Charles Xu kleidiai : add sme fp16 compute path for q4_0 gemm...
2026-03-03 shaofeiqi opencl: add optimized q4_1 mm kernel for adreno (#19840)
2026-03-03 Abhijit Ramesh ggml webgpu: fix workgroup dispatch limit for large...
2026-03-02 Nikhil Jain ggml webgpu: Clean up per-thread parameter buffer pool...
2026-03-02 Masashi Yoshimura ggml-webgpu: Support non-contiguous `src0` and overlapp...
2026-03-02 Ruben Ortlam vulkan: tune MMVQ for Intel Windows (#19988)
2026-03-02 Adrien Gallouët scripts : improve get-wikitext-2.sh (#19952)
2026-03-02 Aaron Teo ggml-cpu: optimise s390x multiply extend instructions...
2026-03-01 Ruben Ortlam vulkan: improve partial offloading performance on AMD...
2026-03-01 oobabooga cuda: cap grid.y at 65535 in non-contiguous dequantize...
2026-02-28 Dmitry Atamanov vendors : update miniaudio library to 0.11.24 (#19914)
2026-02-28 Adrien Gallouët vendor : update cpp-httplib to 0.35.0 (#19969)
2026-02-28 Bartowski tests : model metadata loading from huggingface (#19796)
2026-02-27 Jayant Lohia CUDA: add CDNA3 MFMA support for flash attention MMA...
2026-02-27 Roj234 server: Add pragma once to server-context.h (#19944)
2026-02-27 Sami Kama server: Mirroring /v1/responses to /responses to match...
2026-02-27 Daniel Bevenius ci : use ubuntu-latest for gguf-publish workflow (...
2026-02-27 Aman Gupta ggml-cpu: add repack for mxfp4 (#19738)
2026-02-27 Daniel Bevenius gguf-py : dump version to 0.18.0 (#19950) gguf-v0.18.0
2026-02-27 Pascal server : support multiple model aliases via comma-separ...
2026-02-27 Jan Patrick... tests : enable test-chat out of tree build (#19558)
2026-02-27 Neo Zhang replace the magic nunber 768 by max work group size...
2026-02-27 Vishal Singh ggml-zendnn: update code for latest ZenDNN API (#19923)
2026-02-26 Adrien Gallouët ggml : fix AMX and add batched support (#19925)
2026-02-26 Ruben Ortlam vulkan: fix fp16 Flash Attention on Windows AMD RDNA2...
2026-02-26 Georgi Gerganov mtmd : fix padding of n_tokens (#19930)
2026-02-26 Georgi Gerganov server : fix ctx checkpoint restore logic (#19924)
2026-02-26 Georgi Gerganov kv-cache : fix can_shift() check to take into account...
2026-02-26 Aman Gupta llama: Add option to merge gate and exp weights (#19139)
2026-02-26 Kevin Pouget ggml-virtgpu: improve the reliability of the code ...
2026-02-26 drrros server: fix load-on-startup not respected in ini file...
2026-02-26 Eric Zhang jinja : correct default size for string slices (#19913)
2026-02-26 Maximilian... model : add Jina Embeddings v5 Nano (partial EuroBERT...
2026-02-26 Georgi Gerganov gguf : avoid too many file size calls (#19919)
2026-02-26 yggdrasil75 server : fix typo in server README.md (#19900)
2026-02-26 Neo Zhang support permuted, remove check s0/s10 (#19889)
2026-02-25 Jeff Bolz vulkan: check for memory overlap before doing fusion...
2026-02-25 ddh0 common : add more aliases for sampler CLI params (...
2026-02-25 Slobodan Josic ci : update the ROCm/HIP toolchain versions [no ci...
2026-02-25 Georgi Gerganov server : enable multi-modal prompt caching (#19877)
2026-02-25 Georgi Gerganov server : support multi-modal context checkpoints (...
2026-02-25 Xuan-Son Nguyen scripts: update corpus of compare-logprobs (#19326)
2026-02-25 Mario Limonciello ci : update Windows ROCm build to 26.Q1 [no ci] (#19810)
2026-02-25 Aldehir Rojas gguf : fix ftell/fseek for Windows (#19870)
2026-02-24 Georgi Gerganov models : fix graph splits (#19866)
2026-02-24 Pascal server: fix query params lost when proxying requests...
2026-02-24 Georgi Gerganov ggml/gguf : prevent integer overflows (#19856)
2026-02-24 Tarek Dakhran model : update label for LFM2-24B-A2B (#19848)
2026-02-24 Radoslav Gerganov server : support max_completion_tokens request property...
2026-02-24 Ruben Ortlam Vulkan Scalar Flash Attention Refactor (#19625)
2026-02-24 Jeff Bolz vulkan: fix coopmat1 without bf16 support (#19793)
2026-02-24 Jeff Bolz vulkan: fix data race in mul_mat_id shader (#19790)
2026-02-24 Max Krasnyansky hexagon refactor all Ops to use local context struct...