git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2026-01-09  Georgi Gerganov  scripts : pr2wt.sh reset to remote head (#18695)
2026-01-09  Georgi Gerganov  server : use different seeds for child completions...
2026-01-08  Xuan-Son Nguyen  common: support remote preset (#18520)
2026-01-08  Aaron Teo  llama: use host memory if device reports 0 memory ...
2026-01-08  Masashi Yoshimura  ggml-webgpu: Fix GGML_MEM_ALIGN to 8 for emscripten...
2026-01-08  Reese Levine  ggml webgpu: initial flashattention implementation...
2026-01-08  Jeff Bolz  vulkan: fix push constant size for quantize_q8_1 (...
2026-01-08  Jeff Bolz  vulkan: optimize ssm_scan (#18630)
2026-01-08  Adrien Gallouët  vendor : update cpp-httplib to 0.30.0 (#18660)
2026-01-08  Georgi Gerganov  scripts : support chaining commands in pr2wt.sh (#18671)
2026-01-08  도로로도로또  metal : add MoE kernel specialization for ne20=5 (...
2026-01-08  Johannes Gäßler  llama-fit-params: free memory target per device (#18679)
2026-01-08  Doctor Shotgun  ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (#18535)
2026-01-08  Daniel Bevenius  model-conversion : add warn about transformers mismatch...
2026-01-08  Daniel Bevenius  model-conversion : remove -st targets for converted...
2026-01-08  Julius Tischbein  llama : add `use_direct_io` flag for model loading...
2026-01-08  shaofeiqi  opencl: add FILL op support (#18682)
2026-01-07  Sigbjørn Skjæret  scripts : fix repos cloned with .git extension (#18669)
2026-01-07  Sigbjørn Skjæret  convert : more variants of rope_theta config entries...
2026-01-07  Oliver Walsh  cuda : fix build on cuda 12.8 (#18672)
2026-01-07  R  fix(docker): add missing libglvnd libraries to Vulkan...
2026-01-07  Adrien Gallouët  tools : remove llama-run (#18661)
2026-01-07  Georgi Gerganov  scripts : add pr2wt.sh (#18644)
2026-01-07  Daniel Bevenius  convert : clarify sentence-transformers-dense-modules...
2026-01-07  Sigbjørn Skjæret  ci : run cann build unconditionally [no ci] (#18659)
2026-01-07  Jeff Bolz  vulkan: reject ops when a tensor is too large to alloca...
2026-01-07  virajwad  vulkan: Warptile tuning for Intel Xe2/Xe3 (#18178)
2026-01-07  Eve  vulkan: more mul mat optimizations (#18533)
2026-01-07  Daniel Bevenius  examples : add debug utility/example (#18464)
2026-01-07  hipudding  CANN: Fix rename for get_env (#18652)
2026-01-07  Raul Torres  CANN: Rename `get_env` to `get_env_as_lowercase` (...
2026-01-07  Max Krasnyansky  Hexagon add support for f16/f32 flash attention, scale...
2026-01-06  Tarek Dakhran  mtmd: mtmd_audio_streaming_istft (#18645)
2026-01-06  Johannes Gäßler  llama-params-fit: fix last devices with low VRAM (...
2026-01-06  Aadeshveer...  ggml : optimize cuda ssm_scan using warp-level reductio...
2026-01-06  Xuan-Son Nguyen  arg: use CSV escape style for multiple-value args ...
2026-01-06  Jeff Bolz  vulkan: support buffer_from_host_ptr (#18467)
2026-01-06  Aman Gupta  ggml-cuda: refactor cuda graph usage (#18637)
2026-01-06  Beinsezii  mmq.cu: tune mmq/rocblas switching for RDNA (#18537)
2026-01-06  R  server : add thinking content blocks to Anthropic Messa...
2026-01-06  Christian Kastner  gguf-py : add requests to dependencies (#18629)
2026-01-06  Adrien Gallouët  ggml : fix avx512bf16 build (#18623)
2026-01-06  Raul Torres  CANN: Make `valid_values` variable `static const` ...
2026-01-05  nwyin  ggml webgpu: add CEIL operation support (#18605)
2026-01-05  Tarek Dakhran  model : add LFM2-ColBert-350M (#18607)
2026-01-05  Johannes Gäßler  CUDA: fix FA FP16 accumulator overflow for Granite...
2026-01-05  tt  add YoutuVLForConditionalGeneration architectures ...
2026-01-05  Aman Gupta  ggml-cuda: check for srcs outside the cgraph (#18583)
2026-01-05  Vladislav Sayapin  server : fix router child env in containerized environm...
2026-01-05  Jeff Bolz  vulkan: fix topk_moe_sigmoid_norm_bias failures in...
2026-01-05  Georgi Gerganov  models : fix backend assignment for Granite/Nemotron...
2026-01-05  Jeff Bolz  vulkan: handle quantize_q8_1 overflowing the max workgr...
2026-01-05  Sigbjørn Skjæret  llama : refactor rope_freq_base/scale_swa conversion...
2026-01-05  Chenguang Li  CANN: add operator fusion support for ADD + RMS_NORM...
2026-01-05  Francisco Herrera  doc: clarify that steps also apply to linux for opencl...
2026-01-05  Ali Tariq  ci : init git lfs in every build for RISC-V (#18590)
2026-01-04  Daniel Bevenius  sampling : add support for backend sampling (#17004)
2026-01-04  Tarek Dakhran  model : mtmd : make input norm optional in LFM2-VL...
2026-01-04  Aman Gupta  CUDA: disable cuda graph when using n-cpu-moe (#18593)
2026-01-04  Aman Gupta  ggml-cuda: remove unused params in ggml_cuda_graph...
2026-01-03  Aldehir Rojas  common/grammar : replace problematic backtracking regex...
2026-01-03  Georgi Gerganov  graph : fix graph reuse logic when `n_pos_per_embd...
2026-01-03  Aman Gupta  ggml-cuda: fixes for concurrent streams (#18496)
2026-01-03  Georgi Gerganov  context : fix reserve token padding to n_seqs (#18536)
2026-01-03  Johannes Gäßler  CUDA: only allocate FA tmp buffer if needed (#18564)
2026-01-03  pl752  (Bugfix, ggml-cuda) Pool alloc count fix + small size...
2026-01-03  Shouyu  ggml-hexagon: optimize activation function (#18393)
2026-01-02  Jeff Bolz  vulkan: Optimize GGML_OP_CUMSUM (#18417)
2026-01-02  Jeff Bolz  vulkan: Implement mmvq for iq1_s/iq1_m (#18450)
2026-01-02  Prabod  model : Maincoder-1B support (#18534)
2026-01-02  Georgi Gerganov  metal : adjust extra size for FA buffer to avoid reallo...
2026-01-02  Georgi Gerganov  graph : reduce topology branching (#18548)
2026-01-02  Georgi Gerganov  vocab : reduce debug logs about non-EOG control tokens...
2026-01-02  Chris Rohlf  rpc : use unordered_map::reserve and emplace (#18513)
2026-01-01  MeeMin  cuda : fix copy of large tensors (ggml_nbytes <= INT_MA...
2026-01-01  Sigbjørn Skjæret  model : remove modern-bert iswa template (#18529)
2026-01-01  tt  model: support youtu-vl model (#18479)
2026-01-01  Piotr Wilkin...  Add conversion support for IQuestCoderForCausalLM ...
2026-01-01  o7si  model : add support for JinaBertModel with non-gated...
2026-01-01  o7si  convert : fix encoding of WPM vocab for BERT models...
2026-01-01  HelloKS  model: add Solar Open model (#18511)
2026-01-01  Anri Lombard  webui: fix code copy stripping XML/HTML tags (#18518)
2026-01-01  Aman Gupta  ggml-cuda: remove unneccesary prints on ggml_cuda_init...
2026-01-01  Jeff Bolz  vulkan: extend topk_moe to handle sigmoid w/exp_probs_b...
2026-01-01  triplenom  llama: handle short reads in direct I/O path (#18504)  upstream/0.0.7599
2025-12-31  Anri Lombard  chat: make tool description and parameters optional...
2025-12-31  Georgi Gerganov  sync : ggml
2025-12-31  Georgi Gerganov  ggml : bump version to 0.9.5 (ggml/1410)
2025-12-31  Anri Lombard  quantize: prevent input/output file collision (#18451)
2025-12-31  Sigbjørn Skjæret  convert : lint fix (#18507)
2025-12-31  Henry147147  mtmd : Adding support for Nvidia Music Flamingo Model...
2025-12-31  gatbontonpc  metal : add count_equal op (#18314)
2025-12-31  Johannes Gäßler  CUDA: fix KQ max calculation (#18487)
2025-12-31  Georgi Gerganov  metal : remove BF16 x F16 kernels (#18456)
2025-12-31  Aman Gupta  sycl: add newline at the end of CMakeLists.txt (#18503)
2025-12-31  Rahul Sathe  Work around broken IntelSYCLConfig.cmake in Intel oneAP...
2025-12-30  Sigbjørn Skjæret  docker : add CUDA 13.1 image build (#18441)
2025-12-30  Bart Louwers  docs : document that JSON Schema is not available to...
2025-12-30  Aldehir Rojas  common : default content to an empty string (#18485)
2025-12-30  Daniel Bevenius  llama : fix typo in comment in llama-kv-cache.h [no...