git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
pkg/ggml/sources/llama.cpp
2026-01-06  Tarek Dakhran  mtmd: mtmd_audio_streaming_istft (#18645)
2026-01-06  Johannes Gäßler  llama-params-fit: fix last devices with low VRAM (...
2026-01-06  Aadeshveer...  ggml : optimize cuda ssm_scan using warp-level reductio...
2026-01-06  Xuan-Son Nguyen  arg: use CSV escape style for multiple-value args ...
2026-01-06  Jeff Bolz  vulkan: support buffer_from_host_ptr (#18467)
2026-01-06  Aman Gupta  ggml-cuda: refactor cuda graph usage (#18637)
2026-01-06  Beinsezii  mmq.cu: tune mmq/rocblas switching for RDNA (#18537)
2026-01-06  R  server : add thinking content blocks to Anthropic Messa...
2026-01-06  Christian Kastner  gguf-py : add requests to dependencies (#18629)
2026-01-06  Adrien Gallouët  ggml : fix avx512bf16 build (#18623)
2026-01-06  Raul Torres  CANN: Make `valid_values` variable `static const` ...
2026-01-05  nwyin  ggml webgpu: add CEIL operation support (#18605)
2026-01-05  Tarek Dakhran  model : add LFM2-ColBert-350M (#18607)
2026-01-05  Johannes Gäßler  CUDA: fix FA FP16 accumulator overflow for Granite...
2026-01-05  tt  add YoutuVLForConditionalGeneration architectures ...
2026-01-05  Aman Gupta  ggml-cuda: check for srcs outside the cgraph (#18583)
2026-01-05  Vladislav Sayapin  server : fix router child env in containerized environm...
2026-01-05  Jeff Bolz  vulkan: fix topk_moe_sigmoid_norm_bias failures in...
2026-01-05  Georgi Gerganov  models : fix backend assignment for Granite/Nemotron...
2026-01-05  Jeff Bolz  vulkan: handle quantize_q8_1 overflowing the max workgr...
2026-01-05  Sigbjørn Skjæret  llama : refactor rope_freq_base/scale_swa conversion...
2026-01-05  Chenguang Li  CANN: add operator fusion support for ADD + RMS_NORM...
2026-01-05  Francisco Herrera  doc: clarify that steps also apply to linux for opencl...
2026-01-05  Ali Tariq  ci : init git lfs in every build for RISC-V (#18590)
2026-01-04  Daniel Bevenius  sampling : add support for backend sampling (#17004)
2026-01-04  Tarek Dakhran  model : mtmd : make input norm optional in LFM2-VL...
2026-01-04  Aman Gupta  CUDA: disable cuda graph when using n-cpu-moe (#18593)
2026-01-04  Aman Gupta  ggml-cuda: remove unused params in ggml_cuda_graph...
2026-01-03  Aldehir Rojas  common/grammar : replace problematic backtracking regex...
2026-01-03  Georgi Gerganov  graph : fix graph reuse logic when `n_pos_per_embd...
2026-01-03  Aman Gupta  ggml-cuda: fixes for concurrent streams (#18496)
2026-01-03  Georgi Gerganov  context : fix reserve token padding to n_seqs (#18536)
2026-01-03  Johannes Gäßler  CUDA: only allocate FA tmp buffer if needed (#18564)
2026-01-03  pl752  (Bugfix, ggml-cuda) Pool alloc count fix + small size...
2026-01-03  Shouyu  ggml-hexagon: optimize activation function (#18393)
2026-01-02  Jeff Bolz  vulkan: Optimize GGML_OP_CUMSUM (#18417)
2026-01-02  Jeff Bolz  vulkan: Implement mmvq for iq1_s/iq1_m (#18450)
2026-01-02  Prabod  model : Maincoder-1B support (#18534)
2026-01-02  Georgi Gerganov  metal : adjust extra size for FA buffer to avoid reallo...
2026-01-02  Georgi Gerganov  graph : reduce topology branching (#18548)
2026-01-02  Georgi Gerganov  vocab : reduce debug logs about non-EOG control tokens...
2026-01-02  Chris Rohlf  rpc : use unordered_map::reserve and emplace (#18513)
2026-01-01  MeeMin  cuda : fix copy of large tensors (ggml_nbytes <= INT_MA...
2026-01-01  Sigbjørn Skjæret  model : remove modern-bert iswa template (#18529)
2026-01-01  tt  model: support youtu-vl model (#18479)
2026-01-01  Piotr Wilkin...  Add conversion support for IQuestCoderForCausalLM ...
2026-01-01  o7si  model : add support for JinaBertModel with non-gated...
2026-01-01  o7si  convert : fix encoding of WPM vocab for BERT models...
2026-01-01  HelloKS  model: add Solar Open model (#18511)
2026-01-01  Anri Lombard  webui: fix code copy stripping XML/HTML tags (#18518)
2026-01-01  Aman Gupta  ggml-cuda: remove unneccesary prints on ggml_cuda_init...
2026-01-01  Jeff Bolz  vulkan: extend topk_moe to handle sigmoid w/exp_probs_b...
2026-01-01  triplenom  llama: handle short reads in direct I/O path (#18504)  upstream/0.0.7599
2025-12-31  Anri Lombard  chat: make tool description and parameters optional...
2025-12-31  Georgi Gerganov  sync : ggml
2025-12-31  Georgi Gerganov  ggml : bump version to 0.9.5 (ggml/1410)
2025-12-31  Anri Lombard  quantize: prevent input/output file collision (#18451)
2025-12-31  Sigbjørn Skjæret  convert : lint fix (#18507)
2025-12-31  Henry147147  mtmd : Adding support for Nvidia Music Flamingo Model...
2025-12-31  gatbontonpc  metal : add count_equal op (#18314)
2025-12-31  Johannes Gäßler  CUDA: fix KQ max calculation (#18487)
2025-12-31  Georgi Gerganov  metal : remove BF16 x F16 kernels (#18456)
2025-12-31  Aman Gupta  sycl: add newline at the end of CMakeLists.txt (#18503)
2025-12-31  Rahul Sathe  Work around broken IntelSYCLConfig.cmake in Intel oneAP...
2025-12-30  Sigbjørn Skjæret  docker : add CUDA 13.1 image build (#18441)
2025-12-30  Bart Louwers  docs : document that JSON Schema is not available to...
2025-12-30  Aldehir Rojas  common : default content to an empty string (#18485)
2025-12-30  Daniel Bevenius  llama : fix typo in comment in llama-kv-cache.h [no...
2025-12-30  Xuan-Son Nguyen  lora: count lora nodes in graph_max_nodes (#18469)
2025-12-30  Jay Zenith  sampling: reuse token data buffer in llama_sampler_samp...
2025-12-30  Jeff Bolz  server: fix files built redundantly (#18474)
2025-12-30  Charles Xu  kleidiai: add and integrate SVE 256-bit vector-length...
2025-12-30  Aman Gupta  CUDA: add log line when mxfp4 acceleration is used...
2025-12-30  Daniel Bevenius  model-conversion : use CONVERTED_MODEL for compare...
2025-12-29  Xuan-Son Nguyen  webui: fix prompt progress ETA calculation (#18468)
2025-12-29  Pascal  Webui/prompt processing progress (#18300)
2025-12-29  Johannes Gäßler  CUDA: fix replacment of bad archs in CMake (#18457)
2025-12-29  wbtek  server : Cmdline arg -to changes http read timeout...
2025-12-29  Xuan-Son Nguyen  contributing: tighten AI usage policy (#18388)
2025-12-29  Naco Siren  android: routine maintenance - Dec 2025 (#18338)
2025-12-29  Georgi Gerganov  server : handle closed connection for tasks (#18459)
2025-12-29  Daniel Bevenius  model-conversion : add device option to embd run orig...
2025-12-29  Héctor Estrada...  retrieval : use at most n_seq_max chunks (#18400)
2025-12-29  o7si  common: fix return value check for setpriority (#18412)
2025-12-29  Johannes Gäßler  CUDA: Blackwell features for non-native builds (#18436)
2025-12-29  Aman Gupta  cuda: fix race condition in cumsum (#18448)
2025-12-28  Tim Neumann  ci : re-enable rocm build on amd64 (#18439)
2025-12-28  uvos  HIP: Use mmq on MFMA devices for MUL_MAT_ID in cases...
2025-12-28  momonga  model : Plamo3 support (#17304)
2025-12-28  Aman Gupta  Revert "ggml-cuda: use CMAKE_CUDA_ARCHITECTURES if...
2025-12-28  o7si  rpc: fix segfault on invalid endpoint format (#18387)
2025-12-28  Johannes Gäßler  llama-fit-params: fix step size for last device (#18415)
2025-12-28  Johannes Gäßler  github: update issue templates [no ci] (#18410)
2025-12-28  Xuan-Son Nguyen  mtmd: clarify that we no longer accept AI-generated...
2025-12-28  Boian Berberov  cmake: Added more x86_64 CPU backends when building...
2025-12-28  QDelta  ggml-cuda: use CMAKE_CUDA_ARCHITECTURES if set when...
2025-12-27  lhez  opencl: allow resizing transpose buffers (#18384)
2025-12-27  Johannes Gäßler  llama-fit-params: fix overflow check (#18354)
2025-12-27  Johannes Gäßler  llama: fix magic number of 999 for GPU layers (#18266)
2025-12-27  Aman Gupta  ggml-cuda: Use same regex for GGML_NATIVE=OFF (#18407)