git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2026-02-04 Georgi Gerganov tests : add non-cont, inplace rope tests (#19296)
2026-02-04 Daniel Bevenius model-conversion : add tensor-info.py utility (#18954)
2026-02-04 Georgi Gerganov spec : fix the check-rate logic of ngram-simple (#19261)
2026-02-04 Daniel Bevenius completion : simplify batch (embd) processing (#19286)
2026-02-04 Kevin Pouget ggml-virtgpu: make the code thread safe (#19204)
2026-02-04 Aman Gupta ggml-cpu: use LUT for converting e8->f32 scales on...
2026-02-03 Georgi Gerganov metal : add solve_tri (#19302)
2026-02-03 Georgi Gerganov ci : add sanitizer runs for server (#19291)
2026-02-03 Georgi Gerganov sampling : delegate input allocation to the scheduler...
2026-02-03 Ruben Ortlam vulkan: disable coopmat1 fa on Nvidia Turing (#19290)
2026-02-03 Aman Gupta CUDA: use mmvq for mul-mat-id for small batch sizes...
2026-02-03 Sigbjørn Skjæret models : remove unnecessary cont in openelm (#19289)
2026-02-03 Georgi Gerganov metal : minor cleanup (#19251)
2026-02-03 Oliver Simons CUDA: Fix loop unrolling for BW in mul_mat_q_stream_k_f...
2026-02-03 George ggml: added cleanups in ggml_quantize_free (#19278)
2026-02-03 Gaurav Garg cuda : revert CUDA_SCALE_LAUNCH_QUEUES override until...
2026-02-03 Alexey Dubrov vocab: add Falcon-H1-Tiny-Coder FIM tokens (#19249)
2026-02-03 Georgi Gerganov spec : simplify time measurement using common_time_meas...
2026-02-02 lhez opencl: refactor some ops, concat, repeat, tanh and...
2026-02-02 Sid Mohan jinja : add missing 'in' test to template engine (...
2026-02-02 Xuan-Son Nguyen mtmd: add min/max pixels gguf metadata (#19273)
2026-02-02 Aman Gupta ggml-cpu: FA split across kv for faster TG (#19209)
2026-02-02 Matthieu Coudron server: print actual model name in 'model not found...
2026-02-02 Aman Gupta ci: add test-backend-ops test for CPU (#19268)
2026-02-02 Neo Zhang Remove support for Nvidia & AMD GPU, because the oneAPI...
2026-02-02 Tamar sycl: implement GGML_OP_TOP_K (#19242)
2026-02-02 Georgi Gerganov metal : support virtual devices (#18919)
2026-02-02 Daniel Bevenius model-conversion : add debug option to conversion scrip...
2026-02-02 Johannes Gäßler ggml-backend: fix async set/get fallback sync (#19179)
2026-02-02 Georgi Gerganov authors : update (#19263)
2026-02-02 Christian Kastner docs : Minor cleanups (#19252)
2026-02-02 Sascha Rogmann spec : various improvements ton ngram-map + docs (...
2026-02-02 Nikhil Jain Remove pipeline cache mutexes (#19195)
2026-02-01 Max Krasnyansky Bump cmake max version (needed for Windows on Snapdrago...
2026-02-01 Alexis Williams nix: fix allowUnfreePredicate for packages with multipl...
2026-02-01 Neo Zhang create test.sh to enhance the parameters for testing...
2026-01-31 Matthieu Coudron nix: fix nix develop .#python-scripts (#19218)
2026-01-31 nullname ggml-hexagon: flash-attention and reduce-sum optimizati...
2026-01-31 EugeoSynthesisThirtyTwo quantize: add option --tensor-type-file to llama-quanti...
2026-01-30 tc-mb mtmd: support MiniCPM-o 4.5(vision only) (#19211)
2026-01-30 Daniele Pinna lookup, lookahead: fix crash when n_ctx not specified...
2026-01-30 Georgi Gerganov ngram-mod : fix build [no ci] (#19216)
2026-01-30 shaofeiqi opencl: add optimized q8_0 mm kernel for adreno (#18871)
2026-01-30 Georgi Gerganov sync : ggml
2026-01-30 Georgi Gerganov cuda : fix compile warnings (whisper/0)
2026-01-30 Georgi Gerganov server : wrap around the "id_slot" parameter (#19207)
2026-01-30 Simon Redman Correctly fetch q8_1 quantize pipeline in test as neede...
2026-01-30 Georgi Gerganov spec : add ngram-mod (#19164)
2026-01-30 Marcello Seri jinja : add unordered_map include to value.h [no ci...
2026-01-30 Daniel Bevenius memory : clarify comments for r_l and s_l tensors ...
2026-01-30 Georgi Gerganov tests : add GQA=20 FA test (#19095)
2026-01-30 Daniel Bevenius convert : add missing return statement for GraniteMoeMo...
2026-01-30 Daniel Bevenius memory : remove unused tmp_buf (#19199)
2026-01-30 Antonis Makropoulos docs: Add LlamaLib to UI projects (#19181)
2026-01-30 bssrdf add tensor type checking as part of cuda graph properti...
2026-01-30 s8322 sycl: implement GGML_UNARY_OP_SOFTPLUS (#19114)
2026-01-30 RachelMantel sycl: implement GGML_OP_TRI (#19089)
2026-01-30 DDXDB Fix typos in SYCL documentation (#19162)
2026-01-29 Zheyuan Chen ggml-webgpu: improve flastAttention performance by...
2026-01-29 Todor Boinovski hexagon: enable offloading to Hexagon on Windows on...
2026-01-29 Georgi Gerganov cuda : fix nkvo, offload and cuda graph node properties...
2026-01-29 Aldehir Rojas chat : add parsing for solar-open-100b (#18540)
2026-01-29 Andrew Marshall webui: Update Svelte to fix effect_update_depth_exceede...
2026-01-29 Sigbjørn Skjæret jinja : do not pass empty tools and add some none filte...
2026-01-29 yulo HIP: add mmf for CDNA (#18896)
2026-01-29 Georgi Gerganov arg : add -kvu to llama-batched-bench (#19172)
2026-01-29 Vishal Singh ggml-zendnn : resolve ZenDNN backend cross-module symbo...
2026-01-29 Aman Gupta CUDA: refactor topk-moe to enable more models (GLM...
2026-01-29 Neo Zhang sycl: fix norm kernels: l2_norm, group_norm, rms_norm...
2026-01-28 Sigbjørn Skjæret ci : find latest release with asset for winget (#19161)
2026-01-28 Ruben Ortlam Vulkan Flash Attention Coopmat1 Refactor (#19075)
2026-01-28 Sascha Rogmann spec : add self‑speculative decoding (no draft model...
2026-01-28 Daniel Bevenius convert : yield Mamba2Model/GraniteMoeModel modify_tens...
2026-01-28 Patryk Kaminski ggml-sycl: remove unused syclcompat header (#19140)
2026-01-28 Sigbjørn Skjæret jinja : undefined should be treated as sequence/iterabl...
2026-01-28 Oleksandr Kuvshynov vulkan: handle device dedup on MacOS + Vega II Duo...
2026-01-28 Ben Chen doc: add build instruction to use Vulkan backend on...
2026-01-28 Kevin Pouget ggml: new backend for Virglrenderer API Remoting accele...
2026-01-28 Alberto Cabrera... ggml-cpu: arm64: Q4_K scale unroll and vectorization...
2026-01-28 Georgi Gerganov cuda : fix "V is K view" check for non-unified KV cache...
2026-01-28 Georgi Gerganov CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-28 Georgi Gerganov server : adjust spec tests to generate up to 16 tokens...
2026-01-28 Georgi Gerganov llama : disable Direct IO by default (#19109)
2026-01-28 Daniel Bevenius sampling : remove sampling branching in output_reserve...
2026-01-28 Nikhil Jain ggml webgpu: Split shared state (webgpu_context) into...
2026-01-27 Vishal Singh ggml-zendnn : update ZenDNN git tag to main branch...
2026-01-27 Sigbjørn Skjæret jinja : implement mixed type object keys (#18955)
2026-01-27 David Lima docs: Remove duplicated word on CUDA build section...
2026-01-27 Johannes Gäßler CUDA: tune GLM 4.7 Flash FA kernel selection logic...
2026-01-27 Sigbjørn Skjæret ci : revert slim runner for winget (#19129)
2026-01-27 Alberto Cabrera... ggml-cpu: aarm64: q6_K repack gemm and gemv (and generi...
2026-01-27 Gaurav Garg [CUDA] Reduce CPU-side stalls due to the CUDA command...
2026-01-27 Daniel Bevenius common : clarify HTTPS build options in error message...
2026-01-27 shalinib-ibm ggml-cpu: Enable FP16 MMA kernels on PPC (#19060)
2026-01-27 lhez opencl: add flattened q6_K mv (#19054)
2026-01-26 Johannes Gäßler CUDA: fix padding of GQA to power of 2 in FA (#19115)
2026-01-26 Georgi Gerganov graph : fix nkvo offload with FA (#19105)
2026-01-26 Sigbjørn Skjæret ci : use new 1vCPU runner for lightweight jobs (#19107)
2026-01-26 Georgi Gerganov model : add correct type for GLM 4.7 Flash (#19106)
2026-01-25 Johannes Gäßler CUDA: faster FA for GQA > 1 but not power of 2 (#19092)