git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
pkg/ggml/sources/llama.cpp
2025-09-01  Jie Fu (傅杰)  docs : add Hunyuan to models section (#15707)
2025-09-01  Akarshan Biswas  CUDA: fix build error from ambiguous __half conversions...
2025-09-01  hipudding  CANN: Optimize MUL_MAT_ID (#15658)
2025-09-01  hipudding  CANN: fix RoPE cache issue on multi-device (#15629)
2025-08-31  Georgi Gerganov  sampling : optimize samplers by reusing bucket sort...
2025-08-31  Georgi Gerganov  server : enable /slots by default and make it secure...
2025-08-31  Georgi Gerganov  metal : fix checks for available FA kernels (#15700)
2025-08-31  Diego Devesa  llama : fix fattn reserve call n_seqs parameter (#15699)
2025-08-31  Diego Devesa  llama : separate compute buffer reserve from fattn...
2025-08-31  Sigbjørn Skjæret  ci : explicitly set fa off or on (#15692)
2025-08-31  Jeff Bolz  vulkan: handle large sizes for get_rows (#15686)
2025-08-31  Jeff Bolz  vulkan: mul_mat_id coopmat2 optimizations (#15546)
2025-08-31  Daniel Bevenius  vulkan : remove unused portability_enumeration_ext...
2025-08-31  Jeff Bolz  vulkan: Allow fallback to sysmem memory when vidmem...
2025-08-31  Jeff Bolz  vulkan: clamp matmul and FA results to the max finite...
2025-08-30  Charles Xu  ggml: update kleidiai to v1.13.0 (#15663)
2025-08-30  Diego Devesa  Update build.md to remove MSVC arm64 notes (#15684)
2025-08-30  Johannes Gäßler  llama: use FA + max. GPU layers by default (#15434)
2025-08-30  Johannes Gäßler  CUDA: use FP32 arithmetic for conv2d (#15683)
2025-08-30  Jeff Bolz  vulkan: Skip syncing for prealloc_y when it is reused...
2025-08-30  Chenguang Li  CANN: FIx compiler warnings (#15661)
2025-08-29  Sergey Alirzaev  server : removed obsolete doc (#15670)
2025-08-29  Johannes Gäßler  scripts: strip "AMD Instinct" from GPU name (#15668)
2025-08-29  ExtReMLapin  server : add documentation for `parallel_tool_calls...
2025-08-29  Aman Gupta  CUDA: fix bug in rms_norm fusion (#15660)
2025-08-29  Piotr Wilkin...  chat : Seed OSS thinking + tool call support (#15552)
2025-08-29  Aman Gupta  CUDA: fuse adds, fuse add with rms norm (#15631)
2025-08-29  Gabe Goodhart  nvidia nemotron nano v2 (nemotronh) (#15507)
2025-08-28  Gabe Goodhart  fix: Compute the full sum in llama-eval-callback, not...
2025-08-28  mnehete32  CUDA: add conv2d (#15635)
2025-08-28  Aaron Teo  ggml-cpu: fix invalid hsum build in debug s390x (#15634)
2025-08-28  compilade  ggml : fix SSM_SCAN for n_groups > 1 (#15625)
2025-08-28  Georgi Gerganov  kv-cache : fix find_slot to not search for continuous...
2025-08-28  Sigbjørn Skjæret  model : jina-embeddings-v3 support (#13693)
2025-08-28  Aman Gupta  scripts: add sqlite3 check for compare-commits.sh ...
2025-08-28  Georgi Gerganov  kv-cache : remove LLAMA_SET_ROWS checks (#15505)
2025-08-28  Aleksei Nikiforov  gguf-py: byteswapping improvements (#12851)
2025-08-28  Joshua Cogliati  cli : change log to warning to explain reason for stopp...
2025-08-28  Daniel Bevenius  model-conversion : add mmproj conversion target (#15628)
2025-08-28  matiaslin  cuda: Add cublasLt_static linking when GGML_STATIC...
2025-08-27  Johannes Gäßler  server: higher timeout for tests (#15621)
2025-08-27  Georgi Gerganov  presets : add qwen3-30B-a3b FIM (#15616)
2025-08-27  uvos  HIP: Enable support for ggml_backend_cuda_register_host...
2025-08-27  Georgi Gerganov  kv-cache : better estimate of n_kv for multi-sequence...
2025-08-27  Chenguang Li  CANN: refactor mask handling and improve performance...
2025-08-27  xctan  ggml-cpu : add basic RVV support for vector f32 ops...
2025-08-27  Daniel Bevenius  common : add -m to bash completion for --model [no...
2025-08-27  rmatif  OpenCL: add fused group_norm/norm, mul, add (#15314)
2025-08-26  Diego Devesa  tests : fix test-opt with GGML_BACKEND_DL (#15599)
2025-08-26  Akarshan Biswas  SYCL: fix rms_norm_mul_add for tensor dim not a multipl...
2025-08-26  fidoriel  mtmd : fix mtmd ios build (#15579)
2025-08-26  Eve  tests: add performance test for mul mat id (#15543)
2025-08-26  shalinib-ibm  llamafile: PowerPC Sgemm Optimization (#15558)
2025-08-26  Georgi Gerganov  graph : fix assert in memory-less build_attn (#15590)
2025-08-26  Daniel Bevenius  model-conversion : add qat-q4 quantization targets...
2025-08-26  Johannes Gäßler  CUDA: return -1 for nonexistent compiled arch (#15587)
2025-08-26  Georgi Gerganov  metal : optimize FA vec for large sequences and BS...
2025-08-26  Xuan-Son Nguyen  mtmd : support Kimi VL model (#15458)
2025-08-26  Georgi Gerganov  context : print graph stats for memory-less contexts...
2025-08-26  Georgi Gerganov  metal : improve `MUL_MAT_ID` (#15541)
2025-08-26  tc-mb  model : support MiniCPM-V 4.5 (#15575)
2025-08-26  Sigbjørn Skjæret  gguf-py : remove erroneous FFN_GATE entry (#15583)
2025-08-26  Sigbjørn Skjæret  metal : remove contiguous assertion for src0 in IM2COL...
2025-08-26  Yoshi_likes_e4  Add a warning for special devices (#15563)
2025-08-26  Jeff Bolz  vulkan: Remove splitting for mul_mat_id (#15568)
2025-08-25  Qeeweew  CUDA: Accelerate MXFP4 table lookup using `__byte_perm...
2025-08-25  lhez  opencl: fix support ops condition for `rms_norm` (...
2025-08-25  Ruben Ortlam  vulkan: fix min subgroup 16 condition for mmid subgroup...
2025-08-25  Jeff Bolz  tests: Generate unique input values for count_equal...
2025-08-25  Ihar Hrachyshka  metal: fix regression when no metal devices are present...
2025-08-25  Johannes Gäßler  CUDA: MoE helper in device code, better tile sizes...
2025-08-25  Daniel Bevenius  model-conversion : set pooling type to none in logits...
2025-08-25  Daniel Bevenius  model-conversion : add model card template for embeddin...
2025-08-25  Georgi Gerganov  batched-bench : fix unified KV cache handling + pp...
2025-08-25  Weizhao Ouyang  convert : update Ernie 4.5 dense architecture name...
2025-08-25  Georgi Gerganov  metal : add FA kernels for HS=40 (#15559)
2025-08-25  RunningLeon  convert : support interns1-mini (#15412)
2025-08-25  Chenguang Li  CANN: ROPE cache sin/cos repeat (#15501)
2025-08-24  Ruben Ortlam  vulkan: apply MUL_MAT_ID subgroup optimization to non...
2025-08-24  Georgi Gerganov  kv-cache : support layer reuse (#15504)
2025-08-24  Jeff Bolz  vulkan: Support FA with any multiple of 8 head sizes...
2025-08-24  Ruben Ortlam  vulkan: enable Conv2D for Apple after MoltenVK fixed...
2025-08-24  Jeff Bolz  vulkan: workaround MoltenVK compile failure in multi_ad...
2025-08-23  Johannes Gäßler  CUDA: fix half2 -> half conversion for HIP (#15529)
2025-08-23  Jeff Bolz  vulkan: optimize rms_norm, and allow the work to spread...
2025-08-23  Piotr Wilkin...  model : add support for Seed-OSS (#15490)
2025-08-23  Johannes Gäßler  scripts: fix compare-llama-bench.py (#15521)
2025-08-23  LaffeyNyaa  chat : fix debug build assertion in trim function ...
2025-08-23  Jeff Bolz  vulkan: Rewrite synchronization to allow some overlap...
2025-08-23  R0CKSTAR  vulkan.Dockerfile: install vulkan SDK using tarball...
2025-08-23  Acly  vulkan : support ggml_mean (#15393)
2025-08-23  Jeff Bolz  vulkan: optimize mul_mat_id loading row ids into shared...
2025-08-22  Johannes Gäßler  test-opt: allow slight inprecision (#15503)
2025-08-22  Reese Levine  ggml WebGPU: add support for quantization types (#15440)
2025-08-22  Aldehir Rojas  model : gpt-oss add response_format support (#15494)
2025-08-22  rmatif  ggml: add `conv3d` op (#15182)
2025-08-22  Yavor Ivanov  cuda : add Pad Reflect 1D support (#14659)
2025-08-22  Georgi Gerganov  llama : remove KV cache defragmentation logic (#15473)
2025-08-22  Aaron Teo  ggml-cpu: Support Q5_0 and Q5_1 on s390x (#15486)
2025-08-22  65a  server : Support multimodal completion and embeddings...