git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-02-19  Georgi Gerganov  ci : enable -Werror for CUDA builds (#5579)
2024-02-19  Georgi Gerganov  make : fix CUDA build (#5580)
2024-02-19  valiray  readme : fix typo in README-sycl.md (#5353)
2024-02-19  Abhilash Majumder  cmake : remove obsolete sycl compile flags (#5581)
2024-02-19  Georgi Gerganov  minor : fix trailing whitespace (#5538)
2024-02-19  Daniel Bevenius  llava : avoid changing the original BakLLaVA model...
2024-02-19  NawafAlansari  baby-llama : allocate graphs in ggml_context (#5573)
2024-02-19  Xuan Son Nguyen  llama : add llama_chat_apply_template() (#5538)
2024-02-19  slaren  cuda, metal : fix nans in soft_max (#5574)
2024-02-19  Mirko185  readme : update (#5572)
2024-02-19  bmwl  ggml : android and old glibc NUMA incompatibility bugfi...
2024-02-18  Jared Van Bortel  build : pass all warning flags to nvcc via -Xcompiler...
2024-02-18  Georgi Gerganov  ggml : restore vec dot stride arg names (#5453)
2024-02-18  Georgi Gerganov  ci : fix wikitext url + compile warnings (#5569)
2024-02-18  Georgi Gerganov  metal : fix unused warnings (#0)
2024-02-18  Robey Holderith  common, server : surface min_keep as its own parameter...
2024-02-18  Pierrick Hymbert  server : slots monitoring endpoint (#5550)
2024-02-18  Georgi Gerganov  sampling : do not set min_keep to n_probs (#5564)
2024-02-18  Georgi Gerganov  cmake : fix GGML_USE_SYCL typo (#5555)
2024-02-18  Pierrick Hymbert  server : enhanced health endpoint (#5548)
2024-02-18  Pierrick Hymbert  server : --n-predict option document and cap to max...
2024-02-18  Daniel Hiltgen  server : graceful server shutdown (#5244)
2024-02-18  Georgi Gerganov  common : fix ub (#5530)
2024-02-18  Herman Semenov  ggml, common, examples, tests : fixed type arguments...
2024-02-18  Daniel Bevenius  llava : update surgery script to not remove tensors...
2024-02-18  Kawrakow  1.5 bit quantization (#5453)
2024-02-18  github-actions...  flake.lock: Update
2024-02-17  Georgi Gerganov  ggml : add ALiBi support for ggml_soft_max_ext (#5488)
2024-02-17  Ananta Bastola  ci : add an option to fail on compile warning (#3952)
2024-02-17  clibdev  gitignore : update for CLion IDE (#5544)
2024-02-16  Georgi Gerganov  cmake : fix VULKAN and ROCm builds (#5525)
2024-02-16  Georgi Gerganov  scripts : add helpers script for bench comparing commit...
2024-02-16  Herman Semenov  llava : removed excess free(NULL) operation (#5531)
2024-02-16  Herman Semenov  llama : minor fixed return int value (#5529)
2024-02-16  Alexey Parfenov  server : add "samplers" param to control the samplers...
2024-02-16  Rőczey Barnabás  server : fix system prompt cli (#5516)
2024-02-16  bmwl  ggml : add numa options (#5377)
2024-02-16  Daniel Bevenius  llava : fix clip-model-is-vision flag in README.md...
2024-02-16  Georgi Gerganov  ci : fix BERT model download and convert
2024-02-15  Douglas Hanley  Use correct type of pooling for embedding models (...
2024-02-15  Georgi Gerganov  clip : fix wrong loop condition
2024-02-15  slaren  cuda : print message when initialization fails (#5512)
2024-02-15  Georgi Gerganov  scripts : add hf.sh helper script (#5501)
2024-02-15  Michaël de...  fix(gguf-py): special tokens are no longer skipped...
2024-02-15  Elbios  llava : fix memory management bug (#5491)
2024-02-15  John  llaba : hotfix for llava-1.6 image number (#5495)
2024-02-15  Neuman Vong  vulkan: Find optimal memory type but with fallback...
2024-02-14  Rune  readme : fix typo (#5490)
2024-02-14  John  llava : update README.md (#5489)
2024-02-14  Michael Podvitskiy  cmake : ARM intrinsics detection for MSVC (#5401)
2024-02-14  John  llava : support v1.6 (#5267)
2024-02-13  AT  Early return for zero size calls to get_tensor. (#5482)
2024-02-13  John  gguf : add python reader example (#5216)
2024-02-13  Jared Van Bortel  llama : add support for Nomic Embed (#5468)
2024-02-13  Aarni Koskela  llama : allow raw byte in SPM vocabs; don't crash on...
2024-02-13  Aarni Koskela  llama : make load error reporting more granular (#5477)
2024-02-13  Daniel Bevenius  finetune : rename feed-forward tensors (w1/w2/w3) ...
2024-02-13  Georgi Gerganov  tests : multi-thread the tokenizer tests (#5474)
2024-02-13  Douglas Hanley  llama : support batched embeddings (#5466)
2024-02-13  Johannes Gäßler  make: add error message for bad CUDA version (#5444)
2024-02-13  Georgi Gerganov  bert : add tests + fix quantization (#5475)
2024-02-13  Georgi Gerganov  tests : disable moe test (#5473)
2024-02-13  Kawrakow  ggml-quants : fix compiler warnings (shadow variable...
2024-02-12  Georgi Gerganov  llama : fix quantization when tensors are missing ...
2024-02-12  Georgi Gerganov  swift : package no longer use ggml dependency (#5465)
2024-02-12  Lee  py : fix persimmon `n_rot` conversion (#5460)
2024-02-12  Abhilash Majumder  ggml-sycl: Replace 3d ops with macro (#5458)
2024-02-12  Daniel Bevenius  llava : remove prog parameter from ArgumentParser ...
2024-02-12  Georgi Gerganov  sync : ggml (#5452)
2024-02-11  Johannes Gäßler  CUDA: mul_mat_vec_q tiling, refactor mul mat logic...
2024-02-11  Douglas Hanley  Add support for BERT embedding models (#5423)
2024-02-11  github-actions...  flake.lock: Update
2024-02-11  Sergio López  vulkan: only use M-sized matmul on Apple GPUs (#5412)
2024-02-11  Alexey Parfenov  common : use enums for sampler types (#5418)
2024-02-11  Alexey Parfenov  server : allow to specify tokens as strings in logit_bi...
2024-02-11  Georgi Gerganov  main : ctrl+C print timing in non-interactive mode...
2024-02-11  Georgi Gerganov  common : fix compile warning
2024-02-11  Georgi Gerganov  ggml : fix compile warnings (unused vars) (#4966)
2024-02-11  snadampal  ggml : add mmla kernels for quantized GEMM (#4966)
2024-02-11  Johannes Gäßler  lookup: add print for drafting performance (#5450)
2024-02-11  Xuan Son Nguyen  server : add llama2 chat template (#5425)
2024-02-10  Ian Bull  metal : use autoreleasepool to avoid memory leaks ...
2024-02-10  Georgi Gerganov  scripts : update sync scripts with new backends
2024-02-10  Georgi Gerganov  sync : ggml
2024-02-10  Michael Podvitskiy  ggml : add abort_callback for cpu backend (ggml/725)
2024-02-09  Neuman Vong  vulkan: Set limit for task concurrency (#5427)
2024-02-09  Daniel Bevenius  llava : add requirements.txt and update README.md ...
2024-02-09  Riley Stewart  server : fix prompt caching for repeated prompts (...
2024-02-09  Paul Tsochantaris  llama : do not cap thread count when MoE on CPU (#5419)
2024-02-09  Marko Tasic  readme : add JavaScript/Wasm repo (#5415)
2024-02-09  Michael Podvitskiy  ggml : fix `error C2078: too many initializers` for...
2024-02-09  0cc4m  Fix Vulkan crash on APUs with very little device memory...
2024-02-08  Johannes Gäßler  CUDA: more warps for mmvq on NVIDIA (#5394)
2024-02-08  slaren  llama : do not print "offloading layers" message in...
2024-02-08  Abhilash Majumder  Fix f16_sycl cpy call from Arc (#5411)
2024-02-08  Daniel Bevenius  llava : add missing .py, and fix paths in README.md...
2024-02-08  Johannes Gäßler  fix trailing whitespace (#5407)
2024-02-08  runfuture  llama : fix MiniCPM (#5392)
2024-02-08  Daniel Bevenius  llava: fix typo/formatting in README.md (#5405)
2024-02-08  Johannes Gäßler  sampling: fix top_k <= 0 (#5388)