git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2025-12-25 Aman Gupta cuda: optimize cumsum cub path (#18362)
2025-12-25 Aman Gupta ggml-cuda: fix blackwell native builds (#18361)
2025-12-25 Penglin Cai CANN: Add support for CONV_TRANSPOSE_1D when kernel...
2025-12-25 Aadeshveer... ggml : optimize cuda cumsum fallback kernel (#18343)
2025-12-24 Xuan-Son Nguyen server: (router) add stop-timeout option (#18350)
2025-12-24 Xuan-Son Nguyen model: support MiMo-V2-Flash (#18328)
2025-12-24 Aadeshveer... fit-params : fix race condition in fit-params output...
2025-12-24 Aman Gupta CUDA: experimental native mxfp4 support for blackwell...
2025-12-24 Saba Fallah model : support for LlamaBidirectionalModel architectur...
2025-12-24 Jeff Bolz vulkan: fix command buffer corruption in ggml_backend_v...
2025-12-24 Wang Weixuan CANN : refactor ACL graph cache (#17752)
2025-12-24 Jesse Ikonen docs: Fix typos in SYCL documentation (#18269)
2025-12-24 Ruben Ortlam vulkan: use fewer FA rows for small cache runs (#18280)
2025-12-24 TianHao324 CANN: Uses yarn_ramp cache in ROPE (#17725)
2025-12-24 ddh0 common: add `LLAMA_ARG_OVERRIDE_TENSOR` env var for...
2025-12-23 Xuan-Son Nguyen server: return_progress to also report 0% processing...
2025-12-23 Pascal webui: apply webui_settings on first load (#18223)
2025-12-23 Xuan-Son Nguyen server: fix crash with model not having BOS/EOS (#18321)
2025-12-23 Daniel Bevenius model-conversion : add device option to run-org-model...
2025-12-23 Chris Rohlf rpc : add check for rpc buffer type (#18242)
2025-12-23 nullname ggml-hexagon: create generalized functions for cpu...
2025-12-23 Daniel Bevenius model-conversion : add trust_remote_code for embedding...
2025-12-23 Neo Zhang [SYCL] replace llama-cli by llama-completion to rm...
2025-12-23 Alessandro98-git model : fix div-by-zero for Nemotron V2 (#18309)
2025-12-22 Ryan Mangeno model : Granite Embedding support (#15641)
2025-12-22 compilade gguf-py : do not align the data start offset (#18291)
2025-12-22 Shouyu ggml-hexagon: gelu optimization (#18151)
2025-12-22 Xuan-Son Nguyen gen-docs: automatically update markdown file (#18294)
2025-12-22 Taimur Ahmad llamafile: add rvv support for sgemm kernels (#18199)
2025-12-22 lhez opencl: unpack q4_0 for adreno in get_tensor (#18278)
2025-12-22 Jeff Bolz vulkan: Extend rope fusions to allow mrope (#18264)
2025-12-22 Xuan-Son Nguyen server: prevent data race from HTTP threads (#18263)
2025-12-22 Xuan-Son Nguyen server: fix data race in to_json_anthropic (#18283)
2025-12-22 Mattt release: update release workflow to store XCFramework...
2025-12-22 Aaron Teo convert: rework ftype heuristics (#18214)
2025-12-22 Xuan-Son Nguyen server: (docs) remove mention about extra_args (#18262)
2025-12-22 Johannes Gäßler tool/ex/tests: consistently free ctx, then model (...
2025-12-21 Jeff Bolz vulkan: Implement set_tensor_async and the event interf...
2025-12-21 Johannes Gäßler llama: fix RPC for -fit on (#18233)
2025-12-21 Xuan-Son Nguyen move copilot instructions to AGENTS.md (#18259)
2025-12-21 Jeff Bolz vulkan: fix im2col overflowing maxworkgroupcount (...
2025-12-21 Jeff Bolz vulkan/cuda: fix topk_moe with exp_probs_b (#18071)
2025-12-21 Jeff Bolz vulkan: support GGML_UNARY_OP_XIELU (#18062)
2025-12-21 Jeff Bolz vulkan: in graph_optimize, try to group ADD operations...
2025-12-21 lovedheart Vulkan: some improvement on mul_mat_iq2_xs (#18031)
2025-12-21 Daniel Bevenius docs : fix links in parsing.md (#18245)
2025-12-21 Aldehir Rojas common : reorganize includes to prioritize vendored...
2025-12-21 Xuan-Son Nguyen server: add auto-sleep after N seconds of idle (#18228)
2025-12-20 Jeff Bolz tests: Avoid floating point precision false positives...
2025-12-20 Jeff Bolz test-backend-ops: improve msvc build time (#18209)
2025-12-20 Aadeshveer... Added comments explaining thread block size selection...
2025-12-20 Oleksandr Kuvshynov server : [easy] fix per round speculative decode loggin...
2025-12-20 Xuan-Son Nguyen server: support load model on startup, support preset...
2025-12-19 Sigbjørn Skjæret ci : remove non-windows zip artifacts (#18201)
2025-12-19 Sigbjørn Skjæret ci : only save ccache on master (#18207)
2025-12-19 Alfred ggml-hexagon: Implement true Q8_0 quantization on Hexag...
2025-12-19 Pascal arg: fix order to use short form before long form ...
2025-12-19 Julius Tischbein llama : Changing off_t to size_t for Windows (#18204)
2025-12-19 Aman Gupta server: friendlier error msg when ctx < input (#18174)
2025-12-19 Xuan-Son Nguyen presets: refactor, allow cascade presets from different...
2025-12-19 Aleksander... webui: Add editing attachments in user messages (#18147)
2025-12-19 Daniel Bevenius model-conversion : add verbose flag in run-org-model...
2025-12-19 Naco Siren android: fix missing screenshots for Android.md (#18156)
2025-12-19 Jeff Bolz vulkan: Add perf logger mode with concurrency (#17944)
2025-12-18 Xuan-Son Nguyen model : add ASR support for LFM2-Audio-1.5B (conformer...
2025-12-18 Pascal webui: display prompt processing stats (#18146)
2025-12-18 Taimur Ahmad ggml-cpu: extend support for RVV floating-point kernels...
2025-12-18 Xuan-Son Nguyen arg: fix ASAN error on sampler_type_names empty (#18167)
2025-12-18 Sigbjørn Skjæret gguf-py : use copy-on-write mode for localtensor (...
2025-12-18 yulo remove i_major_dual (#18157)
2025-12-18 Aleksander... webui: Fix selecting generated output issues during...
2025-12-18 Kim S. webui: fix chat screen shadow width (#18010)
2025-12-18 Johannes Gäßler llama: offload output layer to GPU first (#18148)
2025-12-18 Sigbjørn Skjæret convert : sort and use file parts from model index...
2025-12-18 Julius Tischbein llama : Async DirectIO model loading on Linux (#18012)
2025-12-17 Shouyu ggml-hexagon: swiglu_oai operation (#18114)
2025-12-17 Sigbjørn Skjæret convert : force patch_merger tensors to f16/f32 (#18124)
2025-12-17 Pascal server: (webui) add --webui-config (#18028)
2025-12-17 Xuan-Son Nguyen server: (router) disable SSL on child process (#18141)
2025-12-17 Johannes Gäßler llama-fit-params: fix memory print (#18136)
2025-12-17 Kim S. webui: fix chat header width when sidebar is closed...
2025-12-17 Shouyu ggml-hexagon: gelu operation (#17921)
2025-12-17 Georgi Gerganov common : restore grammar-based rejection sampling ...
2025-12-17 Johannes Gäßler common: clarify instructions for bug reports (#18134)
2025-12-17 HonestQiao model: fix GLM-ASR-Nano-2512 load error (#18130) (...
2025-12-17 Xuan-Son Nguyen server: (router) allow child process to report status...
2025-12-17 Piotr Wilkin... Extend run-org-model.py, add (a) batching (b) loading...
2025-12-17 Johannes Gäßler Github: ask for -v logs for params_fit [no ci] (#18128)
2025-12-17 Alberto Cabrera... ggml-cpu: ARM64: repack version of q8_0 (dotprod and...
2025-12-17 Tarek Dakhran model: fix LFM2_MOE missing tensors (#18132)
2025-12-17 Sigbjørn Skjæret ci : clean up webui jobs (#18116)
2025-12-17 Pascal common: fix --override-kv to support comma-separated...
2025-12-17 yulo HIP: Refactor mma for RDNA and CDNA (#17990)
2025-12-17 Naco Siren llama.android : Rewrite Android binding (w/o cpu_featur... upstream/0.0.7446
2025-12-17 TrevorS arg: allow -kvu flag for llama-perplexity (#18117)
2025-12-17 Aadeshveer... ggml : use WARP_SIZE/2 for argmax reduction offset...
2025-12-17 Yuri Khrustalev gguf-py : allow converting multi-tensor models from...
2025-12-16 Johannes Gäßler llama-fit-params: force disable mlock (#18103)
2025-12-16 Johannes Gäßler llama-fit-params: lower ctx size for multi GPU (#18101)
2025-12-16 Johannes Gäßler llama-fit-params: fix underflow for dense models (...