git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2025-05-11 Anthony Umfer tools : fix uninitialized llama_batch in server (#13436)
2025-05-11 Sigbjørn Skjæret scripts : exit compare-llama-bench.py gracefully when...
2025-05-11 Johannes Gäßler CUDA: fix crash with partial offloading of MoE (#13439)
2025-05-11 David Huang Add `--no-op-offload` to improve `-ot` pp perf in MoE...
2025-05-11 City mtmd : support InternVL 3 38B and 78B mmproj (#13443)
2025-05-11 Xuan-Son Nguyen mtmd : move helpers to dedicated file (#13442)
2025-05-10 Thomas Germer docs : Fix typo in InternVL3 model name (#13440)
2025-05-10 Johannes Gäßler CUDA: fix race conditions FlashAttention kernels (...
2025-05-10 Sigbjørn Skjæret vocab : add ByteDance-Seed/Seed-Coder (#13423)
2025-05-10 Xuan-Son Nguyen mtmd : add hard limit on image resolution for qwen2vl...
2025-05-10 Xuan-Son Nguyen server : update docs (#13432)
2025-05-10 Sigbjørn Skjæret llguidance : set tokenizer slices to default (#13424)
2025-05-10 Thammachart... ci: free_disk_space flag enabled for intel variant...
2025-05-10 Xuan-Son Nguyen mtmd : support InternVL 2.5 and 3 (#13422)
2025-05-10 Johannes Gäßler CUDA: fix FlashAttention on Turing (#13415)
2025-05-10 Xuan-Son Nguyen arg : add env var to control mmproj (#13416)
2025-05-10 Jeff Bolz vulkan: scalar flash attention implementation (#13324)
2025-05-09 Helton Reis chore(llguidance): use tagged version that does not...
2025-05-09 Xuan-Son Nguyen server : vision support via libmtmd (#12898)
2025-05-09 Alberto Cabrera... sycl : implementation of reordered Q4_0 MMVQ for Intel...
2025-05-09 Georgi Gerganov metal : optimize MoE for large batches (#13388)
2025-05-09 Johannes Gäßler CUDA: FA support for Deepseek (Ampere or newer) (#13306)
2025-05-09 Diego Devesa llama : do not crash if there is no CPU backend (#13395)
2025-05-09 Johannes Gäßler CUDA: fix crash on large batch size for MoE models...
2025-05-09 Bartowski imatrix : Add --parse-special for enabling parsing...
2025-05-09 R0CKSTAR llama-run: add support for downloading models from...
2025-05-09 Xuan-Son Nguyen mtmd : fix batch_view for m-rope (#13397)
2025-05-09 Xuan-Son Nguyen llama : one-off chat template fix for Mistral-Small...
2025-05-09 Radoslav Gerganov rpc : add rpc_msg_set_tensor_hash_req (#13353)
2025-05-09 Jeff Bolz vulkan: Allow up to 4096 elements for mul_mat_id row_id...
2025-05-09 Xuan-Son Nguyen server : (webui) rename has_multimodal --> modalities...
2025-05-08 Diego Devesa ci : limit write permission to only the release step... upstream/0.0.5318
2025-05-08 Matt Clayton mtmd : Expose helper_decode_image_chunk (#13366)
2025-05-08 Xuan-Son Nguyen server : (webui) fix a very small misalignment (#13387)
2025-05-08 Xuan-Son Nguyen server : (webui) revamp the input area, plus many small...
2025-05-08 Sigbjørn Skjæret convert : support rope_scaling type and rope_type ...
2025-05-08 welix mtmd : fix the calculation of n_tokens for smolvlm...
2025-05-08 Georgi Gerganov context : allow cache-less context for embeddings ...
2025-05-08 Georgi Gerganov context : remove logits_all flag (#13284)
2025-05-08 Diego Devesa ci : move release workflow to a separate file (#13362)
2025-05-08 Diego Devesa llama : print size and type of overridden tensors ...
2025-05-08 Alberto Cabrera... sycl: addressing non-contiguous src1 mul_mats (nc and...
2025-05-07 Diego Devesa docker : disable arm64 and intel images (#13356)
2025-05-07 Georgi Gerganov sync : ggml
2025-05-07 Daniel Bevenius whisper: remove MSVC warnings pragmas (whisper/3090)
2025-05-07 Jared Tweed cmake : removed stdc++fs (whisper/3097)
2025-05-07 Sigbjørn Skjæret llama : deci : support ffn-free with attention (#13296)
2025-05-07 Ycros common : Add a warning when we can't match samplers...
2025-05-07 R0CKSTAR cuda : remove nrows_x in mul_mat_q_process_tile (#13325)
2025-05-07 Georgi Gerganov examples : remove infill (#13283)
2025-05-07 piDack llama : support tie embedding for chatglm models (...
2025-05-06 Johannes Gäßler CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF...
2025-05-06 Xuan-Son Nguyen clip : refactor graph builder (#13321)
2025-05-06 DocShotgun sampling : make top_n_sigma no-op at <=0 or a single...
2025-05-06 oobabooga sampling : don't consider -infinity values in top_n_sig...
2025-05-06 Diego Devesa cmake : remove arm64 msvc presets (#13342)
2025-05-06 Akarshan Biswas SYCL: Disable reorder optimize by default and stop...
2025-05-06 Xuan-Son Nguyen llama : fix build_ffn without gate (#13336)
2025-05-06 Johannes Gäßler CUDA: fix bad asserts for partial offload (#13337)
2025-05-06 Sigbjørn Skjæret convert : qwen2/3moe : set yarn metadata if present...
2025-05-06 Johannes Gäßler CUDA: fix --split-mode row for MMQ (#13323)
2025-05-06 compilade gguf-py : avoid requiring pyside6 for other scripts...
2025-05-05 Johannes Gäßler CUDA: fix logic for clearing padding with -ngl 0 (...
2025-05-05 oobabooga sampling : Integrate Top-nσ into main sampling chain...
2025-05-05 igardev server : Webui - change setText command from parent...
2025-05-05 Xuan-Son Nguyen mtmd : rename llava directory to mtmd (#13311)
2025-05-05 Xuan-Son Nguyen clip : fix confused naming ffn_up and ffn_down (#13290)
2025-05-05 Sigbjørn Skjæret convert : bailingmoe : set yarn metadata if present...
2025-05-05 Akarshan Biswas SYCL: Disable mul_mat kernels for noncontiguous tensor...
2025-05-04 Xuan-Son Nguyen mtmd : add C public API (#13184)
2025-05-04 Diego Devesa rpc : use backend registry, support dl backends (#13304)
2025-05-04 Aaron Teo ggml : activate s390x simd for Q3_K (#13301)
2025-05-04 Diego Devesa llava/mtmd : fixes to fully support dl backends (#13303)
2025-05-04 Diego Devesa llama : build windows releases with dl backends (#13220)
2025-05-04 Johannes Gäßler CUDA: fix race condition in MMQ stream-k fixup (#13299)
2025-05-04 Johannes Gäßler CUDA: fix race condition in MMQ ids_dst (#13294)
2025-05-04 Jeff Bolz vulkan: Additional type support for unary, binary,...
2025-05-03 Johannes Gäßler imatrix: fix oob writes if src1 is not contiguous ...
2025-05-03 Xuan-Son Nguyen clip : revert the change of BOI/EOI token for GLM-edge...
2025-05-03 ymcki llama : Llama-3_1-Nemotron-Ultra-253B-v1 support (...
2025-05-02 Diego Devesa llama : move end-user examples to tools directory ...
2025-05-02 Georgi Gerganov sync : ggml (#13268)
2025-05-02 Georgi Gerganov context : fix reorder logic (#13267)
2025-05-02 shalinib-ibm ggml : Enable MMA for BF16 in llamafile_sgemm (#13148)
2025-05-02 Jared Van Bortel llama-model : support Qwen2 embedding models and poolin...
2025-05-02 Jared Van Bortel convert : use correct context length for nomic-embed...
2025-05-02 Xuan-Son Nguyen convert : converting mmproj for Qwen2/2.5VL from conver...
2025-05-02 Georgi Gerganov kv-cache : separate recurrent vs non-recurrent impl...
2025-05-02 Sigbjørn Skjæret llama : orion rope type is neox (#13261)
2025-05-02 Sigbjørn Skjæret llama : plamo rope type is neox (#13260)
2025-05-02 piDack llama-chat : reset glmedge chat template (#13253)
2025-05-02 Shakil Ahmed mtmd-cli : fix out_of_range when input image path is...
2025-05-02 Georgi Gerganov server : add cache reuse card link to help (#13230)
2025-05-02 Xuan-Son Nguyen convert : explicitly disable trust_remote_code for...
2025-05-01 bandoti ci: fix cross-compile sync issues (#12804)
2025-05-01 Justin Santa... rpc : avoid uninitialized memory in serialize_tensor...
2025-05-01 Jesse Gross ggml: Don't assert fail when tensor data changes (...
2025-05-01 Diego Devesa build : fix build info on windows (#13239)
2025-05-01 Loïc Carrère clip : (minicpmv) Re-enable upscaling of images smaller...
2025-05-01 matteo llama-chat : update GLM4 chat template (#13238)