git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2025-11-09 compilade  convert : handle compressed-tensors quant method (...
2025-11-09 Georgi Gerganov  server : handle failures to restore host cache (#17078)
2025-11-09 Georgi Gerganov  benches : add folder with benchmarks (#16931)
2025-11-09 Eric Curtin  Switch to using Ubuntu 25.10 vulkan/mesa (#16497)
2025-11-09 Ruben Ortlam  vulkan: iGPU memory reporting fix (#17110)
2025-11-09 Ruben Ortlam  vulkan: fix mmq out of bounds reads (#17108)
2025-11-09 Jeff Bolz  vulkan: fuse mul_mat_id + mul (#17095)
2025-11-09 Georgi Gerganov  metal : retain src and dst buffers during async ops...
2025-11-08 Xuan-Son Nguyen  arg: add --cache-list argument to list cached models...
2025-11-08 chansikpark  webui: fix keyboard shortcuts for new chat & edit chat...
2025-11-08 Jeff Bolz  vulkan: Use spec constants for conv2d s/d/p and kernel...
2025-11-08 Aidan  server: fix correct time_ms calculation in prompt_progr...
2025-11-08 Aman Gupta  Revert "CUDA: add expert reduce kernel (#16857)" (...
2025-11-08 Aman Gupta  CUDA: skip fusion for repeating adds in bias (#17080)
2025-11-08 SavicStefan  vulkan: Increase BK to 32; use BK/4 for non-CM mul_mm...
2025-11-08 Aleksei Nikiforov  ggml: disable vxe for cross-compilation by default...
2025-11-08 Jeff Bolz  vulkan: fuse rms_norm + mul + rope (+ view + set_rows...
2025-11-08 Jeff Bolz  vulkan: Fix test-thread-safety crashes (#17024)
2025-11-08 Johannes Gäßler  CUDA: fix MMQ stream-k fixup ne1 indices (#17089)
2025-11-08 Reese Levine  ggml webgpu: faster matrix multiplication/matrix-vector...
2025-11-07 bssrdf  CUDA: properly handle nb00=nb02 case for cpy (#17081)
2025-11-07 Acly  vulkan : refactor buffer handling in vk_op_f32 (#16840)
2025-11-07 Johannes Gäßler  CUDA: fix should_use_mmvf for ne11 == 1 (#17085)
2025-11-07 Georgi Gerganov  bench : cache the llama_context state at computed depth...
2025-11-07 Sigbjørn Skjæret  hparams : add n_embd_inp() to support extended embed...
2025-11-07 Georgi Gerganov  kv-cache : pad the cache size to 256 for performance...
2025-11-07 Adrien Gallouët  Revert "ggml-cpu: detect correct cpu flags for arm64...
2025-11-07 iron  ggml-cpu: detect correct cpu flags for arm64 (#16229...
2025-11-07 Georgi Gerganov  server : print the samplers chain for each request...
2025-11-07 Xuan-Son Nguyen  common: move download functions to download.(cpp|h...
2025-11-06 xctan  ggml-cpu : optimize RVV q2_k and q3_k kernels (#16887)
2025-11-06 Johannes Gäßler  CUDA: fix crash on uneven context without FA (#16988)
2025-11-06 Georgi Gerganov  metal : initial Metal4 tensor API support (#16634)
2025-11-06 Georgi Gerganov  server : disable checkpoints with mtmd (#17045)
2025-11-06 Xuan-Son Nguyen  clip: implement minicpm-v sinusoidal embd using GGML...
2025-11-06 YehuditE  sycl: add CONCAT operator support (#16047)
2025-11-06 Johannes Gäßler  docs: explain CUDA 11 compilation [no ci] (#16824)
2025-11-06 l3utterfly  ggml-hexagon: graceful fallback for older socs where...
2025-11-05 bssrdf  improve CUDA cpy memory bandwidth when copying transpos...
2025-11-05 Jeff Bolz  vulkan: Fix GGML_VULKAN_CHECK_RESULTS to better handle...
2025-11-05 Gabe Goodhart  examples(gguf): GGUF example outputs (#17025)
2025-11-05 Xuan-Son Nguyen  mtmd: allow QwenVL to process larger image by default...
2025-11-05 Georgi Gerganov  server : do not default to multiple slots with speculat...
2025-11-05 Xuan-Son Nguyen  mtmd: improve struct initialization (#16981)
2025-11-05 손희준  docs: Clarify the endpoint that webui uses (#17001)
2025-11-05 Li Pengzhan  model : add openPangu-Embedded (#16941)
2025-11-05 Reese Levine  ggml webgpu: minor set rows optimization (#16810)
2025-11-05 Georgi Gerganov  sync : ggml
2025-11-05 Georgi Gerganov  ggml : fix conv2d_dw SVE path (ggml/1380)
2025-11-05 mnehete32  CUDA: update ops.md (#17005)
2025-11-05 lhez  opencl: update doc (#17011)
2025-11-04 nullname  refactor: replace sprintf with snprintf for safer strin...
2025-11-04 Jeff Bolz  vulkan: remove the need for the dryrun (#16826)
2025-11-04 Georgi Gerganov  server : do context shift only while generating (#17000)
2025-11-04 Georgi Gerganov  readme : update hot topics (#17002)
2025-11-04 Acly  ggml-cpu : bicubic interpolation (#16891)
2025-11-04 Sigbjørn Skjæret  ci : apply model label to models (#16994)
2025-11-04 Sigbjørn Skjæret  chore : fix models indent after refactor (#16992)
2025-11-04 Noah  Fix garbled output with REPACK at high thread counts...
2025-11-04 Aman Gupta  CUDA: avoid mul + bias fusion when doing fusion (#16935)
2025-11-03 lhez  opencl: support imrope (#16914)
2025-11-03 Aleksander...  fix: Viewing multiple PDF attachments (#16974)
2025-11-03 Daniel Bevenius  model-conversion : pass config to from_pretrained ...
2025-11-03 Georgi Gerganov  server : add props.model_alias (#16943)
2025-11-03 theo77186  ggml: CUDA: add head size 72 for flash-attn (#16962)
2025-11-03 Xuan-Son Nguyen  mtmd: add --image-min/max-tokens (#16921)
2025-11-03 Xuan-Son Nguyen  mtmd: pad mask for qwen2.5vl (#16954)
2025-11-03 Jinyang He  ggml : LoongArch fixes (#16958)
2025-11-03 Olivier Chafik  sync: minja (glm 4.6 & minmax m2 templates) (#16949)
2025-11-03 shani-f  SYCL: optimized repeat_back kernel (3× fewer asm instru...
2025-11-02 Sascha Rogmann  feat(webui): improve LaTeX rendering with currency...
2025-11-02 Shagun Bera  test-backend-ops : fix segfault in moe-expert-reduce...
2025-11-02 Sigbjørn Skjæret  ci : disable failing riscv cross build (#16952)
2025-11-02 Zhiyong Wang  model: add Janus Pro for image understanding (#16906)
2025-11-02 Georgi Gerganov  clip : use FA (#16837)
2025-11-02 Georgi Gerganov  server : support unified cache across slots (#16736)
2025-11-02 Aldehir Rojas  common : move gpt-oss reasoning processing to init...
2025-11-02 Adrian Lundberg  docs: remove llama_sampler_accept reference in sampling...
2025-11-02 mnehete32  CUDA: add FLOOR, CEIL, ROUND, TRUNC unary ops (#16917)
2025-11-02 Aaron Teo  devops: fix failing s390x docker build (#16918)
2025-11-02 Aaron Teo  ggml: add s390x cpu-feats (#16774)
2025-11-01 Georgi Gerganov  scripts : add script to bench models (#16894)
2025-11-01 Pascal  webui: auto-refresh /props on inference start to resync...
2025-11-01 Pascal  webui: add HTML/JS preview support to MarkdownContent...
2025-11-01 Adrien Gallouët  vendor : update cpp-httplib to 0.27.0 (#16846)
2025-11-01 Xuan-Son Nguyen  mtmd: refactor preprocessing + support max/min pixels...
2025-11-01 Aleksander...  Add a setting to display message generation statistics...
2025-11-01 Jaromír Hradílek  webui: recognize AsciiDoc files as valid text files...
2025-11-01 Sigbjørn Skjæret  common : allow --system-prompt-file for diffusion-cli...
2025-11-01 Sigbjørn Skjæret  codeowners : update after refactor (#16905)
2025-11-01 Jeff Bolz  vulkan: Fix multi_add invalid descriptor usage (#16899)
2025-11-01 Jeff Bolz  vulkan: fuse mul_mat+add and mul_mat_id+add_id (#16868)
2025-11-01 Oliver Simons  CUDA: Remove unneded bias/gate dims in fused mmvq ...
2025-10-31 Piotr Wilkin...  refactor : llama-model.cpp (#16252)
2025-10-31 Piotr Wilkin...  model : Minimax M2 (#16831)
2025-10-31 Giuseppe Scrivano  model : add Granite Hybrid nano types (#16896)
2025-10-31 Johannes Gäßler  CUDA: Volta tensor core support for MMF (#16843)
2025-10-31 Georgi Gerganov  sync : ggml
2025-10-31 Aman Gupta  CUDA: add expert reduce kernel (#16857)
2025-10-31 Georgi Gerganov  batch : fix consistency checks for the input positions...