git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
pkg/ggml/sources/llama.cpp
2026-01-15 Max Krasnyansky hexagon: support for OP_CPY, host buffers now optional...
2026-01-15 Oliver Simons CUDA: Factor out and re-use `block_reduce` function...
2026-01-14 Piotr Wilkin... Restore clip's cb() to its rightful glory - extract...
2026-01-14 Junwon Hwang model : clean up and fix EXAONE-MoE configuration ...
2026-01-14 Adrien Gallouët refactor : remove libcurl, use OpenSSL when available...
2026-01-14 Jeff Bolz vulkan: Check maxStorageBufferRange in supports_op...
2026-01-14 Aman Gupta llama-model: fix unfortunate typo (#18832)
2026-01-14 Daniel Bevenius CUDA : fix typo in clang pragma comment [no ci] (#18830)
2026-01-14 Ruben Ortlam vulkan: work around Intel fp16 bug in mmq (#18814)
2026-01-14 Perry Naseck ggml-metal: do not copy headers for embedded, use curre...
2026-01-14 Daniel Benjaminsson mmap: add Haiku support by skipping RLIMIT_MEMLOCK...
2026-01-14 Adrien Gallouët ci, tests : use cmake to download models and remove...
2026-01-13 ddh0 llama : print_info alignment fix (#18708)
2026-01-13 Junwon Hwang model : add EXAONE MoE (#18543)
2026-01-13 Georgi Gerganov vocab : fix attribute overrides for harmony (#18806)
2026-01-13 Ruben Ortlam llama-mmap: fix direct-io loading fallback EOF exceptio...
2026-01-13 Daniel Bevenius model-conversion : remove -c 0 from model card template...
2026-01-13 yulo HIP: add fattn-mma-f16 for RDNA4 (#18481)
2026-01-13 Johannes Gäßler doc: ban AI-generated PR descriptions [no ci] (#18765)
2026-01-13 Xuan-Son Nguyen mtmd: fix use_non_causal being reported incorrectly... upstream/0.0.7721
2026-01-13 Georgi Gerganov CUDA : fix unused argument when USE_CUDA_GRAPH=OFF...
2026-01-13 Gabe Goodhart graph : clean up t5 input builders (#18795)
2026-01-13 Ruben Ortlam llama-bench: add direct_io parameter (#18778)
2026-01-12 Adrien Gallouët ci : remove libcurl in releases (#18775)
2026-01-12 Radoslav Gerganov server : add arg for disabling prompt caching (#18776)
2026-01-12 Adrien Gallouët ci : use openssl for openEuler-latest-cmake-cann (...
2026-01-12 Adrien Gallouët vendor : update cpp-httplib to 0.30.1 (#18771)
2026-01-12 Daniel Bevenius examples : add --kv-unified to batched example (#18774)
2026-01-12 Jeff Bolz vulkan: change memory_logger to be controlled by an...
2026-01-12 Xuan-Son Nguyen server: update docs for sleeping [no ci] (#18777)
2026-01-12 Jeff Bolz vulkan: Use VK_EXT_shader_64bit_indexing to handle...
2026-01-12 Ruben Ortlam vulkan: Disable large coopmat matmul configuration...
2026-01-11 Xuan-Son Nguyen model: fix qwen3next broken due to #18683 (#18762)
2026-01-11 Ruben Ortlam Vulkan: Optimize Matmul parameters for AMD GPUs with...
2026-01-11 Xuan-Son Nguyen security: make it clear about subtopics in server ...
2026-01-11 Daniel Bevenius debug : include LLAMA_POOLING_TYPE_UNSPECIFIED in pooli...
2026-01-11 Georgi Gerganov tests : refactor test-backend-sampler (#18753)
2026-01-11 Xuan-Son Nguyen model: try to improve Qwen3 Next (#18683)
2026-01-11 thom-dev-fr readme : update UIs (#18751)
2026-01-11 Xuan-Son Nguyen security: narrow down the scope of what we consider...
2026-01-11 shaofeiqi opencl: add SOFTPLUS op support (#18726)
2026-01-10 Aman Gupta test-backend-ops: fix mxfp4 tests on blackwell (#18736)
2026-01-10 Johannes Gäßler HIP: adjust RDNA3.5 MMQ kernel selction logic (#18666)
2026-01-10 Perry Naseck cmake : update blas logic (#18205)
2026-01-10 Georgi Gerganov server : adjust unified KV cache tests (#18716)
2026-01-10 Sigbjørn Skjæret scripts : follow api redirects in pr2wt.sh (#18739)
2026-01-10 Xuan-Son Nguyen preset: allow named remote preset (#18728)
2026-01-10 Aaron Teo docs(ggml): update backend ops (#18734)
2026-01-10 Michael Wand Corrected: changed s13 = src1->nb[3] instead of nb...
2026-01-10 Adrien Gallouët common : add --license to display embedded licenses...
2026-01-09 Xuan-Son Nguyen server: fix n_cmpl not skipping processing prompt ...
2026-01-09 Simranjeet... mtmd: Add Gemma3n multimodal support with MobileNetV5...
2026-01-09 shaofeiqi opencl: add EXPM1 op (#18704)
2026-01-09 Reese Levine Updates to webgpu get_memory (#18707)
2026-01-09 Pascal Webui/file upload (#18694)
2026-01-09 Asbjørn Olling cmake: only build cli when server is enabled (#18670)
2026-01-09 Georgi Gerganov server : fix timing of prompt/generation (#18713)
2026-01-09 Georgi Gerganov scripts : pr2wt.sh reset to remote head (#18695)
2026-01-09 Georgi Gerganov server : use different seeds for child completions...
2026-01-08 Xuan-Son Nguyen common: support remote preset (#18520)
2026-01-08 Aaron Teo llama: use host memory if device reports 0 memory ...
2026-01-08 Masashi Yoshimura ggml-webgpu: Fix GGML_MEM_ALIGN to 8 for emscripten...
2026-01-08 Reese Levine ggml webgpu: initial flashattention implementation...
2026-01-08 Jeff Bolz vulkan: fix push constant size for quantize_q8_1 (...
2026-01-08 Jeff Bolz vulkan: optimize ssm_scan (#18630)
2026-01-08 Adrien Gallouët vendor : update cpp-httplib to 0.30.0 (#18660)
2026-01-08 Georgi Gerganov scripts : support chaining commands in pr2wt.sh (#18671)
2026-01-08 도로로도로또 metal : add MoE kernel specialization for ne20=5 (...
2026-01-08 Johannes Gäßler llama-fit-params: free memory target per device (#18679)
2026-01-08 Doctor Shotgun ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (#18535)
2026-01-08 Daniel Bevenius model-conversion : add warn about transformers mismatch...
2026-01-08 Daniel Bevenius model-conversion : remove -st targets for converted...
2026-01-08 Julius Tischbein llama : add `use_direct_io` flag for model loading...
2026-01-08 shaofeiqi opencl: add FILL op support (#18682)
2026-01-07 Sigbjørn Skjæret scripts : fix repos cloned with .git extension (#18669)
2026-01-07 Sigbjørn Skjæret convert : more variants of rope_theta config entries...
2026-01-07 Oliver Walsh cuda : fix build on cuda 12.8 (#18672)
2026-01-07 R fix(docker): add missing libglvnd libraries to Vulkan...
2026-01-07 Adrien Gallouët tools : remove llama-run (#18661)
2026-01-07 Georgi Gerganov scripts : add pr2wt.sh (#18644)
2026-01-07 Daniel Bevenius convert : clarify sentence-transformers-dense-modules...
2026-01-07 Sigbjørn Skjæret ci : run cann build unconditionally [no ci] (#18659)
2026-01-07 Jeff Bolz vulkan: reject ops when a tensor is too large to alloca...
2026-01-07 virajwad vulkan: Warptile tuning for Intel Xe2/Xe3 (#18178)
2026-01-07 Eve vulkan: more mul mat optimizations (#18533)
2026-01-07 Daniel Bevenius examples : add debug utility/example (#18464)
2026-01-07 hipudding CANN: Fix rename for get_env (#18652)
2026-01-07 Raul Torres CANN: Rename `get_env` to `get_env_as_lowercase` (...
2026-01-07 Max Krasnyansky Hexagon add support for f16/f32 flash attention, scale...
2026-01-06 Tarek Dakhran mtmd: mtmd_audio_streaming_istft (#18645)
2026-01-06 Johannes Gäßler llama-params-fit: fix last devices with low VRAM (...
2026-01-06 Aadeshveer... ggml : optimize cuda ssm_scan using warp-level reductio...
2026-01-06 Xuan-Son Nguyen arg: use CSV escape style for multiple-value args ...
2026-01-06 Jeff Bolz vulkan: support buffer_from_host_ptr (#18467)
2026-01-06 Aman Gupta ggml-cuda: refactor cuda graph usage (#18637)
2026-01-06 Beinsezii mmq.cu: tune mmq/rocblas switching for RDNA (#18537)
2026-01-06 R server : add thinking content blocks to Anthropic Messa...
2026-01-06 Christian Kastner gguf-py : add requests to dependencies (#18629)
2026-01-06 Adrien Gallouët ggml : fix avx512bf16 build (#18623)
2026-01-06 Raul Torres CANN: Make `valid_values` variable `static const` ...