git.djapps.eu Git - pkg/ggml/sources/llama.cpp/history
2026-03-06  Aleksander Grygier  webui: Agentic Loop + MCP Client with support for Tools...
2026-03-06  Johannes Gäßler  ggml-cpu: fix data race for debug asserts (#20148)
2026-03-06  Georgi Gerganov  kv-cache : fix M-RoPE checkpoints (#20132)
2026-03-06  Roj234  cli : Don't clear system prompt when using '/clear...
2026-03-06  lhez  opencl: add neg, exp and diag (#20127)
2026-03-06  YardenTal44  hexagon: add fp16 support for binary ops: add,sub,mul...
2026-03-05  ymcki  models : kda chunk size = 16 (#19827)
2026-03-05  Andreas Kieslinger  CUDA: Improve performance via less synchronizations...
2026-03-05  Eric Zhang  model : update Qwen3.5 model type detection (#20126)
2026-03-05  Sigbjørn Skjæret  cli : add command and file auto-completion (#19985)
2026-03-05  Sigbjørn Skjæret  convert : register Qwen 3.5 ForCausalLM for text only...
2026-03-05  Aleksander Grygier  webui: Improvements for Models Selector UI (#20066)
2026-03-05  Marcel Petrick  chore : correct typos [no ci] (#20041)
2026-03-05  Max Krasnyansky  hexagon: Flash Attention optimizations (dma, mpyacc...
2026-03-05  lhez  opencl: add `SET`, support i32 for `CPY`, minor refacto...
2026-03-04  Todor Boinovski  hexagon: add llama-completion runner script (#20095)
2026-03-04  Nikhil Jain  [WebGPU] Fix wait logic for inflight jobs (#20096)
2026-03-04  Masashi Yoshimura  Add concat op to webgpu. (#20068)
2026-03-04  Sigbjørn Skjæret  tools : add missing clocale include in mtmd-cli [no...
2026-03-04  Johannes Gäßler  ggml: fix ggml_is_contiguous_n for ne == 1 (#20092)
2026-03-04  Adrien Gallouët  ggml : use a simple std::thread in AMX without OpenMP...
2026-03-04  ddh0  impl : use 6 digits for tensor dims (#20094)
2026-03-04  SamareshSingh  Fix locale-dependent float printing in GGUF metadata...
2026-03-04  standby24x7  completion : Fix a typo in warning message (#20082)
2026-03-03  Mickael Desgranges  docs: Fix intel documentation link (#20040)
2026-03-03  Charles Xu  kleidiai : add sme fp16 compute path for q4_0 gemm...
2026-03-03  shaofeiqi  opencl: add optimized q4_1 mm kernel for adreno (#19840)
2026-03-03  Abhijit Ramesh  ggml webgpu: fix workgroup dispatch limit for large...
2026-03-02  Nikhil Jain  ggml webgpu: Clean up per-thread parameter buffer pool...
2026-03-02  Masashi Yoshimura  ggml-webgpu: Support non-contiguous `src0` and overlapp...
2026-03-02  Ruben Ortlam  vulkan: tune MMVQ for Intel Windows (#19988)
2026-03-02  Adrien Gallouët  scripts : improve get-wikitext-2.sh (#19952)
2026-03-02  Aaron Teo  ggml-cpu: optimise s390x multiply extend instructions...
2026-03-01  Ruben Ortlam  vulkan: improve partial offloading performance on AMD...
2026-03-01  oobabooga  cuda: cap grid.y at 65535 in non-contiguous dequantize...
2026-02-28  Dmitry Atamanov  vendors : update miniaudio library to 0.11.24 (#19914)
2026-02-28  Adrien Gallouët  vendor : update cpp-httplib to 0.35.0 (#19969)
2026-02-28  Bartowski  tests : model metadata loading from huggingface (#19796)
2026-02-27  Jayant Lohia  CUDA: add CDNA3 MFMA support for flash attention MMA...
2026-02-27  Roj234  server: Add pragma once to server-context.h (#19944)
2026-02-27  Sami Kama  server: Mirroring /v1/responses to /responses to match...
2026-02-27  Daniel Bevenius  ci : use ubuntu-latest for gguf-publish workflow (...
2026-02-27  Aman Gupta  ggml-cpu: add repack for mxfp4 (#19738)
2026-02-27  Daniel Bevenius  gguf-py : dump version to 0.18.0 (#19950) gguf-v0.18.0
2026-02-27  Pascal  server : support multiple model aliases via comma-separ...
2026-02-27  Jan Patrick Lehr  tests : enable test-chat out of tree build (#19558)
2026-02-27  Neo Zhang  replace the magic nunber 768 by max work group size...
2026-02-27  Vishal Singh  ggml-zendnn: update code for latest ZenDNN API (#19923)
2026-02-26  Adrien Gallouët  ggml : fix AMX and add batched support (#19925)
2026-02-26  Ruben Ortlam  vulkan: fix fp16 Flash Attention on Windows AMD RDNA2...
2026-02-26  Georgi Gerganov  mtmd : fix padding of n_tokens (#19930)
2026-02-26  Georgi Gerganov  server : fix ctx checkpoint restore logic (#19924)
2026-02-26  Georgi Gerganov  kv-cache : fix can_shift() check to take into account...
2026-02-26  Aman Gupta  llama: Add option to merge gate and exp weights (#19139)
2026-02-26  Kevin Pouget  ggml-virtgpu: improve the reliability of the code ...
2026-02-26  drrros  server: fix load-on-startup not respected in ini file...
2026-02-26  Eric Zhang  jinja : correct default size for string slices (#19913)
2026-02-26  Maximilian Werk  model : add Jina Embeddings v5 Nano (partial EuroBERT...
2026-02-26  Georgi Gerganov  gguf : avoid too many file size calls (#19919)
2026-02-26  yggdrasil75  server : fix typo in server README.md (#19900)
2026-02-26  Neo Zhang  support permuted, remove check s0/s10 (#19889)
2026-02-25  Jeff Bolz  vulkan: check for memory overlap before doing fusion...
2026-02-25  ddh0  common : add more aliases for sampler CLI params (...
2026-02-25  Slobodan Josic  ci : update the ROCm/HIP toolchain versions [no ci...
2026-02-25  Georgi Gerganov  server : enable multi-modal prompt caching (#19877)
2026-02-25  Georgi Gerganov  server : support multi-modal context checkpoints (...
2026-02-25  Xuan-Son Nguyen  scripts: update corpus of compare-logprobs (#19326)
2026-02-25  Mario Limonciello  ci : update Windows ROCm build to 26.Q1 [no ci] (#19810)
2026-02-25  Aldehir Rojas  gguf : fix ftell/fseek for Windows (#19870)
2026-02-24  Georgi Gerganov  models : fix graph splits (#19866)
2026-02-24  Pascal  server: fix query params lost when proxying requests...
2026-02-24  Georgi Gerganov  ggml/gguf : prevent integer overflows (#19856)
2026-02-24  Tarek Dakhran  model : update label for LFM2-24B-A2B (#19848)
2026-02-24  Radoslav Gerganov  server : support max_completion_tokens request property...
2026-02-24  Ruben Ortlam  Vulkan Scalar Flash Attention Refactor (#19625)
2026-02-24  Jeff Bolz  vulkan: fix coopmat1 without bf16 support (#19793)
2026-02-24  Jeff Bolz  vulkan: fix data race in mul_mat_id shader (#19790)
2026-02-24  Max Krasnyansky  hexagon refactor all Ops to use local context struct...
2026-02-23  Aleksander Grygier  feat: Add code blocks full height setting to parameter...
2026-02-23  Adrien Gallouët  vendor : update cpp-httplib to 0.34.0 (#19830)
2026-02-23  Daniel Bevenius  tests : fix typos in comments in test-backend-sampler...
2026-02-23  Aleksander Grygier  webui: Add setting to have full height Code Blocks...
2026-02-23  Daniel Bevenius  model-conversion : merge inspect-org-model.py with...
2026-02-23  Alberto Cabrera...  ggml-cpu: arm64: q5_K repack gemm and gemv (and generic...
2026-02-23  Daniel Bevenius  llama : remove write/read of output ids/logits/embeddin...
2026-02-22  Sigbjørn Skjæret  cli : provide model with text filename (#19783)
2026-02-22  Xuan-Son Nguyen  jinja: correct stats for tojson and string filters...
2026-02-22  Aldehir Rojas  common : fix improper trimming in XML parser on complet...
2026-02-22  Kilian Krampf  Fix wrong cli-argument in documentation (#19804)
2026-02-22  HelloKS  model : add Kanana-2 model support (#19803)
2026-02-22  Sigbjørn Skjæret  ci : fix rocm archive name [no ci] (#19808)
2026-02-22  Aldehir Rojas  server : merge contiguous Responses input items into...
2026-02-22  Sigbjørn Skjæret  ci : fix rocm release path [no ci] (#19784)
2026-02-21  Mario Limonciello  Update ROCm docker container to 7.2 release (#19418)
2026-02-21  Mario Limonciello  Add a build target to generate ROCm artifacts using...
2026-02-21  Adrien Gallouët  vendor : update cpp-httplib to 0.33.1 (#19778)
2026-02-21  Gaurav Garg  Improve CUDA graph capture (#19754)
2026-02-21  crsawyer  fix: UI single model selection in router mode (#19767)
2026-02-21  Mengsheng Wu  hexagon : fix build release (#19444) (#19587)
2026-02-20  Aldehir Rojas  common : merge qwen3-coder and nemotron nano 3 parsers...