git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-06-27 Sigbjørn Skjæret Add Qwen2MoE 57B-A14B model identifier (#8158)
2024-06-27 Johannes Gäßler CUDA: fix MMQ stream-k for --split-mode row (#8167)
2024-06-27 kustaaya Added support for Viking pre-tokenizer (#8135)
2024-06-27 Sigbjørn Skjæret llama : fix CodeLlama FIM token checks (#8144)
2024-06-27 Raj Hammeer... Fix llama-android.cpp for error - "common/common.h...
2024-06-26 Daniel Bevenius clip : suppress unused variable warnings (#8105)
2024-06-26 Georgi Gerganov scripts : fix filename sync
2024-06-26 slaren ci : publish new docker images only when the files...
2024-06-26 slaren ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CU...
2024-06-26 slaren make : fix missing -O3 (#8143)
2024-06-26 Georgi Gerganov sync : ggml
2024-06-26 Georgi Gerganov authors : regen
2024-06-26 Georgi Gerganov devops : remove clblast + LLAMA_CUDA -> GGML_CUDA ...
2024-06-26 Georgi Gerganov readme : update API notes
2024-06-26 Georgi Gerganov llama : reorganize source code + improve CMake (#8006)
2024-06-26 Isaac McFadyen Clarify default MMQ for CUDA and LLAMA_CUDA_FORCE_MMQ...
2024-06-26 Johannes Gäßler CUDA: fix misaligned shared memory read (#8123)
2024-06-26 Eddie-Wang llama : extend llm_build_ffn() to support _scale tensor...
2024-06-26 Olivier Chafik `json`: better support for "type" unions (e.g. nullable...
2024-06-26 Olivier Chafik `json`: fix additionalProperties, allow space after...
2024-06-25 jukofyork fixes #7999 (adds control vectors to all `build_XXX...
2024-06-25 fairydreaming llama : implement Unigram tokenizer needed by T5 and...
2024-06-25 Daniel Bevenius llama : return nullptr from llama_grammar_init (#8093)
2024-06-25 Olivier Chafik `json`: support integer minimum, maximum, exclusiveMini...
2024-06-25 slaren disable docker CI on pull requests (#8110)
2024-06-25 joecryptotoo Add healthchecks to llama-server containers (#8081)
2024-06-25 Brian Gguf dump start data offset via --data-offset and some...
2024-06-25 Xuan Son Nguyen cvector: better prompt handling, add "mean vector"...
2024-06-25 Xuan Son Nguyen Add chat template support for llama-cli (#8068)
2024-06-25 HanishKVC SimpleChat v3.1: Boolean chat request options in Settin...
2024-06-25 HatsuneMikuUwU33 Update control vector help (#8104)
2024-06-25 Meng, Hengyu [SYCL] Re-enabled mul_mat_batched_sycl (#8095)
2024-06-24 Johannes Gäßler CUDA: fix matrix multiplication algorithm choice (...
2024-06-24 Johannes Gäßler CUDA: fix MMQ writeback for int8 tensor cores (#8100)
2024-06-24 Johannes Gäßler CUDA: use MMQ instead of cuBLAS by default (#8075)
2024-06-24 fairydreaming gguf-py : fix tensor groups for encoder-decoder models...
2024-06-24 Johannes Gäßler CUDA: optimize MMQ int8 tensor core performance (#8062)
2024-06-24 Christian Zhou... Option to split during conversion (#6942)
2024-06-24 slaren disable publishing the full-rocm docker image (#8083)
2024-06-24 Yann Follet embedding : more cli arguments (#7458)
2024-06-24 fairydreaming gguf-py, convert-hf : model conversion support for...
2024-06-24 slaren ggml : remove ggml_task_type and GGML_PERF (#8017)
2024-06-23 Eddie-Wang llama : add support for BitnetForCausalLM (#7931)
2024-06-23 Aarni Koskela server : fix JSON-Scheme typo (#7975)
2024-06-23 Daniel Bevenius Fix typo in llama_set_embeddings comment (#8077)
2024-06-23 slaren fix CI failures (#8066)
2024-06-23 0cc4m Refactor Vulkan backend to allow multiple contexts...
2024-06-22 Clint Herron Removing extra blank lines that were breaking Lint...
2024-06-22 Xuan Son Nguyen cvector: fix CI + correct help message (#8064)
2024-06-22 HatsuneMikuUwU33 cvector-generator: Moe Moe Fixie-Fixie for Lots of...
2024-06-22 0xspringtime convert-hf : change assert to exception (#8015)
2024-06-22 ddh0 Update llama-quantize ppl/file size output from LLaMA...
2024-06-22 Clint Herron JSON Schema to GBNF integration tests (#7790)
2024-06-21 k.h.lai vulkan: detect multiple devices by deviceUUID instead...
2024-06-21 Eve ggml : AVX IQ quants (#7845)
2024-06-21 Georgi Gerganov llama : optimize long word tokenization with WPM (...
2024-06-21 Douglas Hanley llama : allow pooled embeddings on any model (#7477)
2024-06-21 Shuichi Tsutsumi swiftui : enable stream updating (#7754)
2024-06-20 Hamdoud Hakem requirements : Bump torch and numpy for python3.12...
2024-06-20 Hamdoud Hakem convert-hf : Fix the encoding in the convert-hf-to...
2024-06-20 Johannes Gäßler common: fix warning (#8036)
2024-06-20 luoyu-intel [SYCL] Fix windows build and inference (#8003)
2024-06-20 Johannes Gäßler CUDA: stream-k decomposition for MMQ (#8018)
2024-06-20 Michael de... metal : fix `ggml_metal_supports_op` for BF16 (#8021)
2024-06-19 sasha0552 server : fix smart slot selection (#8020)
2024-06-19 Michael de... un-ignore `build-info.cmake` and `build-info.sh` (...
2024-06-19 slaren ggml : synchronize threads using barriers (#7993)
2024-06-19 Georgi Gerganov codecov : remove (#8004)
2024-06-19 Meng, Hengyu [SYCL] refactor (#6408)
2024-06-18 jaime-m-p tokenizer : BPE fixes (#7530)
2024-06-18 Sigbjørn Skjæret Only use FIM middle token if it exists (#7648)
2024-06-18 jojorne Fix no gcc pragma on Windows (#7751)
2024-06-18 Ulrich Drepper Allow compiling with CUDA without CUDA runtime installe...
2024-06-18 Frank Mai chore: clean useless beam search param (#7985)
2024-06-18 Abheek Gulati readme : update UI list (#7943)
2024-06-18 Georgi Gerganov ggml : sync
2024-06-18 Georgi Gerganov whisper : use ggml_backend_sched (whisper/2239)
2024-06-17 Ștefan-Gabriel... update: support Qwen2-57B-A14B (#7835)
2024-06-17 Srihari-mcw Make updates to type cast based on compiler instead...
2024-06-17 Georgi Gerganov llama : disable FA if KV head size do not match (#7982)
2024-06-17 Bryan Honof Add Nix and Flox install instructions (#7899)
2024-06-17 slaren sched : offload_op also requires supports_op (#7977)
2024-06-17 Frank Mai fix: divide 0 exception in mamba (#7932)
2024-06-17 Markus Tavenrath Implement non-mapped async IO for CUDA on Windows....
2024-06-17 Georgi Gerganov rpc : fix load/store misaligned addresses (#7948)
2024-06-17 Brian gguf-dump.py: add --markdown dump output (#7853)
2024-06-17 Neo Zhang [SYCL] Update README-sycl.md for Chapter "Recommended...
2024-06-16 Calvin Laurenson Add support for sqrt on CUDA (#7953)
2024-06-16 Georgi Gerganov cuda : fix bounds check for src0 rows in MMVQ kernel...
2024-06-16 Hong Bo PENG ggml : fix and optimize ppc64le (ggml/849)
2024-06-16 Daniel Bevenius ggml : remove duplicate include of ggml-common.h (ggml...
2024-06-16 Georgi Gerganov flake.lock: Update (#7951)
2024-06-16 Georgi Gerganov unicode : avoid char32_t (#7957)
2024-06-16 hopkins385 readme : update UI list [no ci] (#7958)
2024-06-16 Georgi Gerganov ggml : fix handling of zero blocks in IQ quants (#7955)
2024-06-16 Georgi Gerganov github : update pr template
2024-06-16 0cc4m Vulkan Shader Refactor, Memory Debugging Option (#7947)
2024-06-15 Xuan Son Nguyen Add `cvector-generator` example (#7514)
2024-06-15 Meng, Hengyu [SYCL] remove global variables (#7710)
2024-06-14 olexiyb ci : fix macos x86 build (#7940)