git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2026-03-03 Mickael Desgranges docs: Fix intel documentation link (#20040)
2026-03-03 Charles Xu kleidiai : add sme fp16 compute path for q4_0 gemm...
2026-03-03 shaofeiqi opencl: add optimized q4_1 mm kernel for adreno (#19840)
2026-03-03 Abhijit Ramesh ggml webgpu: fix workgroup dispatch limit for large...
2026-03-02 Nikhil Jain ggml webgpu: Clean up per-thread parameter buffer pool...
2026-03-02 Masashi Yoshimura ggml-webgpu: Support non-contiguous `src0` and overlapp...
2026-03-02 Ruben Ortlam vulkan: tune MMVQ for Intel Windows (#19988)
2026-03-02 Adrien Gallouët scripts : improve get-wikitext-2.sh (#19952)
2026-03-02 Aaron Teo ggml-cpu: optimise s390x multiply extend instructions...
2026-03-01 Ruben Ortlam vulkan: improve partial offloading performance on AMD...
2026-03-01 oobabooga cuda: cap grid.y at 65535 in non-contiguous dequantize...
2026-02-28 Dmitry Atamanov vendors : update miniaudio library to 0.11.24 (#19914)
2026-02-28 Adrien Gallouët vendor : update cpp-httplib to 0.35.0 (#19969)
2026-02-28 Bartowski tests : model metadata loading from huggingface (#19796)
2026-02-27 Jayant Lohia CUDA: add CDNA3 MFMA support for flash attention MMA...
2026-02-27 Roj234 server: Add pragma once to server-context.h (#19944)
2026-02-27 Sami Kama server: Mirroring /v1/responses to /responses to match...
2026-02-27 Daniel Bevenius ci : use ubuntu-latest for gguf-publish workflow (...
2026-02-27 Aman Gupta ggml-cpu: add repack for mxfp4 (#19738)
2026-02-27 Daniel Bevenius gguf-py : dump version to 0.18.0 (#19950) gguf-v0.18.0
2026-02-27 Pascal server : support multiple model aliases via comma-separ...
2026-02-27 Jan Patrick... tests : enable test-chat out of tree build (#19558)
2026-02-27 Neo Zhang replace the magic nunber 768 by max work group size...
2026-02-27 Vishal Singh ggml-zendnn: update code for latest ZenDNN API (#19923)
2026-02-26 Adrien Gallouët ggml : fix AMX and add batched support (#19925)
2026-02-26 Ruben Ortlam vulkan: fix fp16 Flash Attention on Windows AMD RDNA2...
2026-02-26 Georgi Gerganov mtmd : fix padding of n_tokens (#19930)
2026-02-26 Georgi Gerganov server : fix ctx checkpoint restore logic (#19924)
2026-02-26 Georgi Gerganov kv-cache : fix can_shift() check to take into account...
2026-02-26 Aman Gupta llama: Add option to merge gate and exp weights (#19139)
2026-02-26 Kevin Pouget ggml-virtgpu: improve the reliability of the code ...
2026-02-26 drrros server: fix load-on-startup not respected in ini file...
2026-02-26 Eric Zhang jinja : correct default size for string slices (#19913)
2026-02-26 Maximilian... model : add Jina Embeddings v5 Nano (partial EuroBERT...
2026-02-26 Georgi Gerganov gguf : avoid too many file size calls (#19919)
2026-02-26 yggdrasil75 server : fix typo in server README.md (#19900)
2026-02-26 Neo Zhang support permuted, remove check s0/s10 (#19889)
2026-02-25 Jeff Bolz vulkan: check for memory overlap before doing fusion...
2026-02-25 ddh0 common : add more aliases for sampler CLI params (...
2026-02-25 Slobodan Josic ci : update the ROCm/HIP toolchain versions [no ci...
2026-02-25 Georgi Gerganov server : enable multi-modal prompt caching (#19877)
2026-02-25 Georgi Gerganov server : support multi-modal context checkpoints (...
2026-02-25 Xuan-Son Nguyen scripts: update corpus of compare-logprobs (#19326)
2026-02-25 Mario Limonciello ci : update Windows ROCm build to 26.Q1 [no ci] (#19810)
2026-02-25 Aldehir Rojas gguf : fix ftell/fseek for Windows (#19870)
2026-02-24 Georgi Gerganov models : fix graph splits (#19866)
2026-02-24 Pascal server: fix query params lost when proxying requests...
2026-02-24 Georgi Gerganov ggml/gguf : prevent integer overflows (#19856)
2026-02-24 Tarek Dakhran model : update label for LFM2-24B-A2B (#19848)
2026-02-24 Radoslav Gerganov server : support max_completion_tokens request property...
2026-02-24 Ruben Ortlam Vulkan Scalar Flash Attention Refactor (#19625)
2026-02-24 Jeff Bolz vulkan: fix coopmat1 without bf16 support (#19793)
2026-02-24 Jeff Bolz vulkan: fix data race in mul_mat_id shader (#19790)
2026-02-24 Max Krasnyansky hexagon refactor all Ops to use local context struct...
2026-02-23 Aleksander... feat: Add code blocks full height setting to parameter...
2026-02-23 Adrien Gallouët vendor : update cpp-httplib to 0.34.0 (#19830)
2026-02-23 Daniel Bevenius tests : fix typos in comments in test-backend-sampler...
2026-02-23 Aleksander... webui: Add setting to have full height Code Blocks...
2026-02-23 Daniel Bevenius model-conversion : merge inspect-org-model.py with...
2026-02-23 Alberto Cabrera... ggml-cpu: arm64: q5_K repack gemm and gemv (and generic...
2026-02-23 Daniel Bevenius llama : remove write/read of output ids/logits/embeddin...
2026-02-22 Sigbjørn Skjæret cli : provide model with text filename (#19783)
2026-02-22 Xuan-Son Nguyen jinja: correct stats for tojson and string filters...
2026-02-22 Aldehir Rojas common : fix improper trimming in XML parser on complet...
2026-02-22 Kilian Krampf Fix wrong cli-argument in documentation (#19804)
2026-02-22 HelloKS model : add Kanana-2 model support (#19803)
2026-02-22 Sigbjørn Skjæret ci : fix rocm archive name [no ci] (#19808)
2026-02-22 Aldehir Rojas server : merge contiguous Responses input items into...
2026-02-22 Sigbjørn Skjæret ci : fix rocm release path [no ci] (#19784)
2026-02-21 Mario Limonciello Update ROCm docker container to 7.2 release (#19418)
2026-02-21 Mario Limonciello Add a build target to generate ROCm artifacts using...
2026-02-21 Adrien Gallouët vendor : update cpp-httplib to 0.33.1 (#19778)
2026-02-21 Gaurav Garg Improve CUDA graph capture (#19754)
2026-02-21 crsawyer fix: UI single model selection in router mode (#19767)
2026-02-21 Mengsheng Wu hexagon : fix build release (#19444) (#19587)
2026-02-20 Aldehir Rojas common : merge qwen3-coder and nemotron nano 3 parsers...
2026-02-20 Taimur Ahmad ggml-cpu: add RVV vec dot kernels for quantization...
2026-02-20 ddh0 quantize : add --dry-run option (#19526)
2026-02-20 Jeff Bolz test: mul_mat tests with huge batch size (#19519)
2026-02-19 crsawyer WebUI hide models in router mode (#19374)
2026-02-19 Jesse Posner common : fix Step-3.5-Flash format detection and thinki...
2026-02-19 abhijitb11 common : fix gpt-oss Jinja error when assistant message...
2026-02-19 Masashi Yoshimura ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support...
2026-02-19 megemini model: Add PaddleOCR-VL model support (#18825)
2026-02-19 Ruben Ortlam vulkan: fix MMQ shader push constants and multi-dispatc...
2026-02-19 Georgi Gerganov models : fix qwen3.5 beta/gate shapes (#19730)
2026-02-19 Saba Fallah mtmd: build_attn modified, flash_attn on/off via ctx_pa...
2026-02-19 3 a l i model : add JAIS-2 architecture support (#19488)
2026-02-19 Johannes Gäßler CUDA: fix kernel selection logic for tile FA (#19686)
2026-02-19 Tarek Dakhran mtmd : chat : Fix extra \n between text and media marke...
2026-02-19 Aleksander... webui: Fix Attachments not being included in completion...
2026-02-19 Tarek Dakhran model : add tokenizer from LFM2.5-Audio-1.5B (#19687)
2026-02-19 Daniel Bevenius llama : use output_resolve_row() in get_logits_ith...
2026-02-19 Ryan Mangeno model : full modern bert support (#18330)
2026-02-19 shalinib-ibm llamafile: powerpc: add FP16 MMA path for Q4/Q8 matmul...
2026-02-19 Georgi Gerganov models : dedup qwen35 graphs (#19660)
2026-02-19 ymcki models : dedup Kimi Linear delta net implementation...
2026-02-18 Piotr Wilkin... Add Jinja support for "indent" string filter (#19529)
2026-02-18 Reese Levine ggml webgpu: Fix bug in dispatching large matrix-vector...
2026-02-18 matteo server: save generated text for the /slots endpoint...