git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2025-08-04  compilade  imatrix : warn when GGUF imatrix is saved without ...
2025-08-04  Christian Kastner  cmake: Add GGML_BACKEND_DIR option (#15074)
2025-08-04  Sigbjørn Skjæret  gguf-py : add --chat-template-file to gguf_new_metadata...
2025-08-04  Sam  model: support GLM 4.5 family of models (#14939)
2025-08-04  Sigbjørn Skjæret  quantize : fix confusing error message if ftype is...
2025-08-04  Reese Levine  ggml: WebGPU backend host improvements and style fixing...
2025-08-04  Jeff Bolz  vulkan: fix build when using glslang that does not...
2025-08-03  compilade  imatrix : use GGUF by default (#14842)
2025-08-03  compilade  imatrix : fix 3d activation handling for hybrid and...
2025-08-03  compilade  memory : handle kv_unified for hybrid models (#15050)
2025-08-03  Csaba Kecskemeti  vocab : JetBrains Mellum pre-tokenizer (#15045)
2025-08-03  Gabriel Larson  model : add text-only support for Kimi-VL (and find...
2025-08-03  Jeff Bolz  vulkan: Use coopmat2 for conv2d (#14982)
2025-08-02  lhez  opencl: fix adreno compiler detection logic (#15029)
2025-08-02  Johannes Gäßler  CUDA: use mma FA kernel for gqa > 4 on RTX 4000 (#15035)
2025-08-02  leejet  cuda: make im2col a little faster (#15025) upstream/0.0.6073
2025-08-02  Daniel Bevenius  kv-cache : skip alignment of n_stream in kv-cache log...
2025-08-02  Georgi Gerganov  llama : enable LLAMA_SET_ROWS=1 by default (#14959)
2025-08-02  Georgi Gerganov  cuda, sycl : fix batched gemm when ne02 == 1 && ne03...
2025-08-02  Sigbjørn Skjæret  ci : check that pre-tokenizer hashes are up-to-date...
2025-08-02  Douglas Hanley  convert : fix Qwen3-Embedding pre-tokenizer hash (...
2025-08-02  Jhen-Jie Hong  chat : fix multiple tool_calls on hermes-2-pro (#14962)
2025-08-02  Jeff Bolz  vulkan: coopmat2 mul_mat optimizations (#14934)
2025-08-02  R0CKSTAR  llama-bench: rename DB table name from test to llama_be...
2025-08-02  Jeff Bolz  vulkan: Support ne[3]>1 in noncontig matrix-vector...
2025-08-02  Douglas Hanley  model : support Qwen3-Embedding (#15023)
2025-08-02  Johannes Gäßler  server: enable token array inputs for OAI API (#15001)
2025-08-02  Jeff Bolz  vulkan: optimizations for direct convolution (#14933)
2025-08-01  Johannes Gäßler  CUDA: fix MMQ nwarps for AMD with warp_size==32 (#15014)
2025-08-01  l-austenfeld  vendor : update vendored copy of google/minja (#15011)
2025-08-01  stevenkuang  model : add hunyuan dense (#14878)
2025-08-01  lhez  opencl: add f16 for `add`, `sub`, `mul`, `div` (#14984)
2025-08-01  Srihari-mcw  ggml : Q2k interleaving implementation - x86/x64 SIMD...
2025-08-01  Georgi Gerganov  graph : fix equal_seq() check (#14986)
2025-08-01  diannao  docker : add cann build pipline (#14591)
2025-08-01  R0CKSTAR  compare-commits.sh: support both llama-bench and test...
2025-07-31  Ed Addario  quantize : skip tensor override when in fallback mode...
2025-07-31  Diego Devesa  llama : add simple option to enable CPU for MoE weights...
2025-07-31  Aman Gupta  Fix params bug in diffusion example (#14993)
2025-07-31  Diego Devesa  llama : allow other bufts when overriding to CPU, add...
2025-07-31  Ruben Ortlam  Vulkan: Fix minor debug mode issues (#14899)
2025-07-31  tc-mb  mtmd : support MiniCPM-V 4.0 (#14983)
2025-07-31  Csaba Kecskemeti  MODEL_TENSOR.SSM_DT_NORM has defined twice (#14991)
2025-07-31  g2mt  server : implement universal assisted decoding (#12635)
2025-07-31  Dongliang Wei  llama : merge build_moe_ffn_from_probs function into...
2025-07-31  Lukas Straub  server : add openai-style logit_bias support (#14946)
2025-07-31  Aman Gupta  Add LLaDA 8b Diffusion model (#14771)
2025-07-31  hipudding  CANN: Improve loading efficiency after converting weigh...
2025-07-31  compilade  graph : reduce splits for recurrent and hybrid models...
2025-07-30  lhez  opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f3...
2025-07-30  Ed Addario  quantize : fix using combined imatrix GGUFs (multiple...
2025-07-30  Daniel Bevenius  server : add support for `embd_normalize` parameter...
2025-07-30  uvos  HIP: enable mfma mmq on gfx908 and gfx90a for select...
2025-07-30  Georgi Gerganov  sync : ggml
2025-07-30  Kai Pastor  cmake : Fix BLAS link interface (ggml/1316)
2025-07-30  Kai Pastor  vulkan : fix 32-bit builds (ggml/1313)
2025-07-30  Johannes Gäßler  CUDA: skip masked KV slices for all FA kernels (#14924)
2025-07-30  Georgi Gerganov  tests : update for LLAMA_SET_ROWS=1 (#14961)
2025-07-30  Georgi Gerganov  graph : fix stack-use-after-return (#14960)
2025-07-30  Douglas Hanley  embeddings: fix extraction of CLS pooling results ...
2025-07-30  Xinpeng Dou  CANN: update ops docs (#14935)
2025-07-29  uvos  HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly...
2025-07-29  uvos  HIP: add GGML_HIP_MMQ_MFMA option to allow disableing...
2025-07-29  uvos  HIP: Ignore unsupported unroll transformation in fattn...
2025-07-29  kallewoof  common : avoid logging partial messages (which can...
2025-07-29  hipudding  CANN: Add ggml_set_rows (#14943)
2025-07-29  Sigbjørn Skjæret  cuda : add softcap fusion (#14907)
2025-07-29  Johannes Gäßler  server-bench: make seed choice configurable (#14929)
2025-07-29  Aman Gupta  CUDA: add roll (#14919)
2025-07-28  lhez  opencl : add ops docs (#14910)
2025-07-28  Leonard Mosescu  test-backend-ops : extend test case filtering (#14865)
2025-07-28  Radoslav Gerganov  llama-bench : use local GPUs along with RPC servers...
2025-07-28  xctan  ggml-cpu : deduplicate scalar implementations (#14897)
2025-07-28  Akarshan Biswas  SYCL: Add set_rows support for quantized types (#14883)
2025-07-28  Xuan-Son Nguyen  mtmd : add support for Voxtral (#14862)
2025-07-28  Johannes Gäßler  CUDA: fix pointer incrementation in FA (#14916)
2025-07-28  Dongliang Wei  model : add support for SmallThinker series (#14898)
2025-07-28  Alberto Cabrera...  sycl: refactor quantization to q8_1 (#14815)
2025-07-28  Georgi Gerganov  ops : update BLAS (#14914)
2025-07-28  Georgi Gerganov  ops : update Metal (#14912)
2025-07-28  Georgi Gerganov  sync : ggml
2025-07-28  Kai Pastor  cmake : Indent ggml-config.cmake (ggml/1310)
2025-07-27  Ed Addario  quantize : update README.md (#14905)
2025-07-27  Ruben Ortlam  vulkan: add ops docs (#14900)
2025-07-27  Akarshan Biswas  SYCL: add ops doc (#14901)
2025-07-27  Daniel Bevenius  llama : clarify comment about pp and tg graphs [no...
2025-07-27  Erik Scholz  vulkan : add fp16 support for the conv_2d kernel (...
2025-07-27  Jeff Bolz  vulkan: skip empty set_rows to avoid invalid API usage...
2025-07-27  Gabriel Larson  model : make rope_yarn_log_mul optional for deepseek2...
2025-07-27  Shunta Saito  llama : fix kq_scale for the attention layers of PLaMo2...
2025-07-27  Aman Gupta  Docs: add instructions for adding backends (#14889)
2025-07-26  deepsek  HIP: Enable Matrix cores for MMQ Kernels, Enable stream...
2025-07-26  hipudding  CANN: Implement GLU ops (#14884)
2025-07-26  R0CKSTAR  musa: fix build warnings (unused variable) (#14869)
2025-07-25  Aaron Teo  ggml-cpu : disable GGML_NNPA by default due to instabil...
2025-07-25  Gabe Goodhart  metal: SSM_SCAN performance (#14743)
2025-07-25  lhez  opencl: add fused `rms_norm_mul` (#14841)
2025-07-25  wooksong  docs : update HOWTO‑add‑model.md for ModelBase and...
2025-07-25  Oliver Simons  ggml : remove invalid portPos specifiers from dot files...
2025-07-25  Georgi Gerganov  context : restore preemptive sched reset when LLAMA_SET...