git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-12-11  Georgi Gerganov  Merge pull request #10788 from ggerganov/gg/gguf-py...
2024-12-11  Georgi Gerganov  gguf-py : bump version to 0.11.0
2024-12-11  Xuan Son Nguyen  server : (UI) add tok/s, get rid of completion.js ...
2024-12-11  qingy1337  Update README.md (#10772)
2024-12-11  Xuan Son Nguyen  ci : pin nodejs to 22.11.0 (#10779)
2024-12-11  kallewoof  bug-fix: snprintf prints NULL in place of the last...
2024-12-11  CentricStorm  docs: fix server documentation formatting (#10776)
2024-12-11  Gilad S.  ggml: load all backends from a user-provided search...
2024-12-10  Jeff Bolz  vulkan: request round-to-even for fp16 in im2col/rope_h...
2024-12-10  Eve  vulkan: dynamic subgroup size for the remaining k quant...
2024-12-10  Bartowski  imatrix : Add imatrix to --no-context-shift (#10766)
2024-12-10  Andreas Kieslinger  CUDA: rename macros to avoid conflicts with WinAPI...
2024-12-10  Yüg  server : add flag to disable the web-ui (#10762) (...
2024-12-10  Jeff Bolz  vulkan: disable spirv-opt for coopmat shaders (#10763)
2024-12-09  Johannes Gäßler  CUDA: fix shared memory access condition for mmv (...
2024-12-09  Srihari-mcw  Changes to CMakePresets.json to add ninja clang target...
2024-12-09  Jeff Bolz  vulkan: fix compile warnings (#10731)
2024-12-09  Borislav Stanimirov  cmake : simplify msvc charsets (#10672)
2024-12-08  Xuan Son Nguyen  server : fix format_infill (#10724)
2024-12-08  Xuan Son Nguyen  server : bring back info of final chunk in stream mode...
2024-12-08  stduhpf  Vulkan: fix NaN in tanh.comp with AMD proprietary drive...
2024-12-08  Diego Devesa  llama : use cmake for swift build (#10525)
2024-12-08  Jeff Bolz  vulkan: compile a test shader in cmake to check for...
2024-12-07  Robert Collins  llama : add 128k yarn context for Qwen (#10698)
2024-12-07  Xuan Son Nguyen  server : (refactor) no more json in server_task input...
2024-12-07  Georgi Gerganov  ggml : disable iq4_nl interleave size 8 (#10709)
2024-12-07  Georgi Gerganov  server : various fixes (#10704)
2024-12-07  Djip007  ggml : refactor online repacking (#10446)
2024-12-07  Georgi Gerganov  server : fix free of spec context and batch (#10651)
2024-12-07  0cc4m  Vulkan: VK_KHR_cooperative_matrix support to speed...
2024-12-07  Robert Ormandi  metal : Extend how Llama.cpp locates metal resources...
2024-12-07  Sukriti Sharma  convert : add support for Roberta embeddings (#10695)
2024-12-06  Georgi Gerganov  convert : add custom attention mapping
2024-12-06  Xuan Son Nguyen  common : bring back --no-warmup to server (#10686)
2024-12-06  Xuan Son Nguyen  server : (refactoring) do not rely on JSON internally...
2024-12-05  Plamen Minev  fix(server) : not show alert when DONE is received...
2024-12-05  Jeff Bolz  vulkan: Add VK_NV_cooperative_matrix2 support for mul_m...
2024-12-05  Riccardo Orlando  llama : add Minerva 7B model support (#10673)
2024-12-05  Georgi Gerganov  sync : ggml
2024-12-05  PAB  ggml: add `GGML_SET` Metal kernel + i32 CPU kernel...
2024-12-05  PAB  ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)
2024-12-05  Daniel Bevenius  py : update outdated copy-paste instructions [no ci...
2024-12-04  aryantandon01  Update deprecation-warning.cpp (#10619)
2024-12-04  Georgi Gerganov  server : fix speculative decoding with context shift...
2024-12-04  Diego Devesa  ggml : add predefined list of CPU backend variants...
2024-12-04  Diego Devesa  ggml-cpu : fix HWCAP2_I8MM value (#10646)
2024-12-04  ltoniazzi  Fix HF repo commit to clone lora test models (#10649)
2024-12-04  JFLFY2255  llama: Support MiniCPM-1B (with & w/o longrope) (#10559)
2024-12-04  Jeff Bolz  vulkan: Implement "fast divide" (mul+shift) for unary...
2024-12-04  Nicolò Scipione  SYCL : Move to compile time oneMKL interface backend...
2024-12-04  Wang Ran (汪然)  fix typo of README.md (#10605)
2024-12-04  Frankie Robertson  Avoid using __fp16 on ARM with old nvcc (#10616)
2024-12-04  Benson Wong  Add docs for creating a static build (#10268) (#10630)
2024-12-04  piDack  clip : add sycl support (#10574)
2024-12-03  Jeff Bolz  vulkan: optimize and reenable split_k (#10637)
2024-12-03  Xuan Son Nguyen  server : (web ui) Various improvements, now use vite...
2024-12-03  Georgi Gerganov  scripts : remove amx sync
2024-12-03  Georgi Gerganov  sync : ggml
2024-12-03  mahorozte  CUDA: remove unnecessary warp reduce in FA (ggml/1032)
2024-12-03  PAB  feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml...
2024-12-03  PAB  metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml...
2024-12-03  Xuan Son Nguyen  llama : add missing LLAMA_API for llama_chat_builtin_te...
2024-12-03  Nikolaos Pothitos  readme : add option, update default value, fix formatti...
2024-12-03  Georgi Gerganov  metal : small-batch mat-mul kernels (#10581)
2024-12-03  Georgi Gerganov  github : minify link [no ci] (revert)
2024-12-03  Georgi Gerganov  github : minify link [no ci]
2024-12-03  Georgi Gerganov  server : fix default draft model parameters (#10586)
2024-12-02  Xuan Son Nguyen  llama : add enum for built-in chat templates (#10623)
2024-12-02  Georgi Gerganov  make : deprecate (#10514)
2024-12-02  haopeng  server: Add "tokens per second" information in the...
2024-12-02  Akarshan Biswas  SYCL: Fix and switch to GGML_LOG system instead of...
2024-12-02  Georgi Gerganov  contrib : refresh (#10593)
2024-12-01  Juk Armstrong  Add `mistral-v1`, `mistral-v3`, `mistral-v3-tekken...
2024-12-01  Georgi Gerganov  grammars : add English-only grammar (#10612)
2024-12-01  Wang Qin  ci: add error handling for Python venv creation in...
2024-12-01  Diego Devesa  ggml : automatic selection of best CPU backend (#10606)
2024-12-01  alek3y  server : bind to any port when specified (#10590)
2024-12-01  Georgi Gerganov  readme : update the usage section with examples (#10596)
2024-12-01  Wang Qin  build: update Makefile comments for C++ version change...
2024-11-30  Adrien Gallouët  ggml-cpu: replace AArch64 NEON assembly with intrinsics...
2024-11-30  Georgi Gerganov  readme : remove old badge
2024-11-30  Georgi Gerganov  readme : refresh (#10587)
2024-11-30  Eve  vulkan: Dynamic subgroup size support for Q6_K mat_vec...
2024-11-29  Diego Devesa  ggml : move AMX to the CPU backend (#10570)
2024-11-29  Xuan Son Nguyen  server : add more test cases (#10569)
2024-11-29  Robert Collins  imatrix : support combine-only (#10492)
2024-11-29  Diego Devesa  cleanup UI link list (#10577)
2024-11-29  Georgi Gerganov  ggml : fix I8MM Q4_1 scaling factor conversion (#10562)
2024-11-29  Shupei Fan  ggml-cpu: fix typo in gemv/gemm iq4_nl_4_4 (#10580)
2024-11-29  Alberto Cabrera...  sycl : offload of get_rows set to 0 (#10432)
2024-11-29  Alberto Cabrera...  sycl : Reroute permuted mul_mats through oneMKL (#10408)
2024-11-29  Chenguang Li  CANN: RoPE operator optimization (#10563)
2024-11-29  Jeff Bolz  vulkan: get the first command buffer submitted sooner...
2024-11-29  Ting Lou  llava: return false instead of exit (#10546)
2024-11-28  Georgi Gerganov  ggml : remove redundant copyright notice + update authors
2024-11-28  Georgi Gerganov  llama : add missing model types
2024-11-28  Xuan Son Nguyen  server : (tests) don't use thread for capturing stdout...
2024-11-28  Johannes Gäßler  common: fix warning message when no GPU found (#10564)
2024-11-28  Random Fly  docs: fix outdated usage of llama-simple (#10565)
2024-11-28  Diego Devesa  ci : fix tag name in cuda and hip releases (#10566)