git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2026-01-22  Georgi Gerganov  mla : make the V tensor a view of K (#18986)
2026-01-22  Johannes Gäßler  CUDA: fix alignment check for FA (#19023)
2026-01-22  Aman Gupta  convert_hf_to_gguf.py: refactor modify_tensors to call...
2026-01-22  lhez  opencl: enable the general fp mm for non-cont input...
2026-01-22  Xuan-Son Nguyen  server: do not log certain endpoints (avoid log spam...
2026-01-22  Georgi Gerganov  quant : manual overrides of tensor types take precedenc...
2026-01-22  Aaron Teo  release: update github api (#19022)
2026-01-22  Xuan-Son Nguyen  mtmd : update docs to use llama_model_n_embd_inp (...
2026-01-22  손희준  server: Reorder methods in `server-task.cpp` (#19016)
2026-01-22  Aman Gupta  CUDA: add gqa_ratio 4 for GLM 4.7 flash (#18953)
2026-01-22  shaofeiqi  opencl: add TRI op support (#18979)
2026-01-22  Aleksei Nikiforov  ggml-zdnn : mark zDNN buffers as non-host (#18967)
2026-01-21  Pádraic Slattery  ci : update GitHub Actions versions [no ci] (#18935)
2026-01-21  Mariusz Woloszyn  convert : add Devstral-2 (Ministral3ForCausalLM) arch...
2026-01-21  Piotr Wilkin...  jinja: support none|string (#18995)
2026-01-21  Hendrik Erz  fix: Use `tabular-nums` for chat message statistics...
2026-01-21  Daniel Bevenius  llama : clarify nemotron-h.cpp comment about RoPE ...
2026-01-21  Jeff Bolz  vulkan: Remove transfer_ctx, do everything in compute_c...
2026-01-21  Adrien Gallouët  common : improve error message when HTTPS is missing...
2026-01-21  손희준  server: /v1/responses (partial) (#18486)
2026-01-21  Jeff Bolz  vulkan: support flash attention GQA/split_k with small...
2026-01-21  Masato Nakasaka  Revert "vulkan: force full subgroups for flash attentio...
2026-01-21  Jeff Bolz  vulkan: Use mul_mat_vec_id for small values of n (...
2026-01-21  Tarek Dakhran  memory : add llama_memory_hybrid_iswa (#18601)
2026-01-21  Piotr Wilkin...  Fix GLM 4.7 Lite MoE gating func (#18980)
2026-01-21  Matthieu Coudron  gguf: display strerrno when cant load a model (#18884)
2026-01-21  Oliver Simons  CUDA: Fix builds for older CCCL versions by ifdefing...
2026-01-20  Adrien Gallouët  common, server : use the same User-Agent by default...
2026-01-20  Xuan-Son Nguyen  cli : fix reasoning responses in CLI (#18961)
2026-01-20  Oliver Simons  CUDA: Replace init_offsets kernel with iterators in...
2026-01-20  Adrien Gallouët  ggml : cleanup path_str() (#18928)
2026-01-20  Georgi Gerganov  metal : enable FA for MLA heads (#18950)
2026-01-20  Daniel Bevenius  convert : use n_groups instead of hardcoded values...
2026-01-19  Xuan-Son Nguyen  server : refactor oai_parser_opt, move it to server_cha...
2026-01-19  ddh0  convert : support Glm4MoeLite (#18936)
2026-01-19  Sigbjørn Skjæret  jinja : fix undefined keys and attributes and int/float...
2026-01-19  Sigbjørn Skjæret  ci : run test-jinja -py on high perf [no ci] (#18916)
2026-01-19  Lennart Austenfeld  server: fix memory reservations in populate_token_probs...
2026-01-19  Georgi Gerganov  ggml : add ggml_build_forward_select (#18550)
2026-01-19  Daniel Bevenius  model-conversion : add BUILD_DIR variable to run-conver...
2026-01-18  Julius Tischbein  llama : Extend fallback, fix fileno for dio file, exclu...
2026-01-18  Francisco Herrera  docs: add linux to index (#18907)
2026-01-18  Xuan-Son Nguyen  tests : add test-jinja -py option for cross-checking...
2026-01-18  Sigbjørn Skjæret  jinja : fix object item order (and properly implement...
2026-01-18  Sigbjørn Skjæret  jinja : attribute support for join, map and sort (...
2026-01-18  Sigbjørn Skjæret  jinja : add missing tojson filter for bool (#18900)
2026-01-17  Sigbjørn Skjæret  jinja : fix lexing of float literals with sign (#18901)
2026-01-17  Xuan-Son Nguyen  jinja: correct member access rule (#18905)
2026-01-17  lhez  opencl: fix q6_K mv for m=1 (#18893)
2026-01-17  Sigbjørn Skjæret  ci : add label for jinja changes (#18903)
2026-01-17  Georgi Gerganov  kv-cache : optimize KQ mask construction (#18842)
2026-01-17  Reese Levine  ggml webgpu: support for backend sampling (#18880)
2026-01-16  Thore Koritzius  ggml : extend ggml_pool_1d + metal (#16429)
2026-01-16  hipudding  docs : update ops.md for CANN backend (#18654)
2026-01-16  Perry Naseck  ggml-blas: hide warnings from included BLAS headers...
2026-01-16  Tarek Dakhran  mtmd : Fix ASR for LFM2.5-Audio-1.5B (#18876)
2026-01-16  Xuan-Son Nguyen  common : implement new jinja template engine (#18462)
2026-01-16  Julius Tischbein  Setting mmap and direct_io to false as default in llama...
2026-01-16  Raul Torres  CANN: Remove unused `ggml_cann_get_device` function...
2026-01-16  Chenguang Li  CANN: fix an issue where get_env was not fully renamed...
2026-01-16  hipudding  CANN: support gated linear attn (#18653)
2026-01-15  shaofeiqi  OpenCL: add SOLVE_TRI op support (#18846)
2026-01-15  Georgi Gerganov  cuda : print less debug logs when disabling cuda graphs...
2026-01-15  Georgi Gerganov  context : do not reserve scheduler for warmups (#18867)
2026-01-15  ddh0  llama : add adaptive-p sampler (#17927)
2026-01-15  Xuan-Son Nguyen  server: improve slots scheduling for n_cmpl (#18789)
2026-01-15  Georgi Gerganov  context : reserve new scheduler when graph topology...
2026-01-15  Johannes Gäßler  CUDA: fix allignment on register spill for FA (#18815)
2026-01-15  shalinib-ibm  ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (#18837)
2026-01-15  Xuan-Son Nguyen  lora: make sure model keep track of associated adapters...
2026-01-15  Sigbjørn Skjæret  model-loader : support bool array sliding window patter...
2026-01-15  Adrien Gallouët  tests : download models only when running ctest (#18843)
2026-01-15  Max Krasnyansky  hexagon: support for OP_CPY, host buffers now optional...
2026-01-15  Oliver Simons  CUDA: Factor out and re-use `block_reduce` function...
2026-01-14  Piotr Wilkin...  Restore clip's cb() to its rightful glory - extract...
2026-01-14  Junwon Hwang  model : clean up and fix EXAONE-MoE configuration ...
2026-01-14  Adrien Gallouët  refactor : remove libcurl, use OpenSSL when available...
2026-01-14  Jeff Bolz  vulkan: Check maxStorageBufferRange in supports_op...
2026-01-14  Aman Gupta  llama-model: fix unfortunate typo (#18832)
2026-01-14  Daniel Bevenius  CUDA : fix typo in clang pragma comment [no ci] (#18830)
2026-01-14  Ruben Ortlam  vulkan: work around Intel fp16 bug in mmq (#18814)
2026-01-14  Perry Naseck  ggml-metal: do not copy headers for embedded, use curre...
2026-01-14  Daniel Benjaminsson  mmap: add Haiku support by skipping RLIMIT_MEMLOCK...
2026-01-14  Adrien Gallouët  ci, tests : use cmake to download models and remove...
2026-01-13  ddh0  llama : print_info alignment fix (#18708)
2026-01-13  Junwon Hwang  model : add EXAONE MoE (#18543)
2026-01-13  Georgi Gerganov  vocab : fix attribute overrides for harmony (#18806)
2026-01-13  Ruben Ortlam  llama-mmap: fix direct-io loading fallback EOF exceptio...
2026-01-13  Daniel Bevenius  model-conversion : remove -c 0 from model card template...
2026-01-13  yulo  HIP: add fattn-mma-f16 for RDNA4 (#18481)
2026-01-13  Johannes Gäßler  doc: ban AI-generated PR descriptions [no ci] (#18765)
2026-01-13  Xuan-Son Nguyen  mtmd: fix use_non_causal being reported incorrectly... upstream/0.0.7721
2026-01-13  Georgi Gerganov  CUDA : fix unused argument when USE_CUDA_GRAPH=OFF...
2026-01-13  Gabe Goodhart  graph : clean up t5 input builders (#18795)
2026-01-13  Ruben Ortlam  llama-bench: add direct_io parameter (#18778)
2026-01-12  Adrien Gallouët  ci : remove libcurl in releases (#18775)
2026-01-12  Radoslav Gerganov  server : add arg for disabling prompt caching (#18776)
2026-01-12  Adrien Gallouët  ci : use openssl for openEuler-latest-cmake-cann (...
2026-01-12  Adrien Gallouët  vendor : update cpp-httplib to 0.30.1 (#18771)
2026-01-12  Daniel Bevenius  examples : add --kv-unified to batched example (#18774)