git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-06-23 slaren fix CI failures (#8066)
2024-06-23 0cc4m Refactor Vulkan backend to allow multiple contexts...
2024-06-22 Clint Herron Removing extra blank lines that were breaking Lint...
2024-06-22 Xuan Son Nguyen cvector: fix CI + correct help message (#8064)
2024-06-22 HatsuneMikuUwU33 cvector-generator: Moe Moe Fixie-Fixie for Lots of...
2024-06-22 0xspringtime convert-hf : change assert to exception (#8015)
2024-06-22 ddh0 Update llama-quantize ppl/file size output from LLaMA...
2024-06-22 Clint Herron JSON Schema to GBNF integration tests (#7790)
2024-06-21 k.h.lai vulkan: detect multiple devices by deviceUUID instead...
2024-06-21 Eve ggml : AVX IQ quants (#7845)
2024-06-21 Georgi Gerganov llama : optimize long word tokenization with WPM (...
2024-06-21 Douglas Hanley llama : allow pooled embeddings on any model (#7477)
2024-06-21 Shuichi Tsutsumi swiftui : enable stream updating (#7754)
2024-06-20 Hamdoud Hakem requirements : Bump torch and numpy for python3.12...
2024-06-20 Hamdoud Hakem convert-hf : Fix the encoding in the convert-hf-to...
2024-06-20 Johannes Gäßler common: fix warning (#8036)
2024-06-20 luoyu-intel [SYCL] Fix windows build and inference (#8003)
2024-06-20 Johannes Gäßler CUDA: stream-k decomposition for MMQ (#8018)
2024-06-20 Michael de... metal : fix `ggml_metal_supports_op` for BF16 (#8021)
2024-06-19 sasha0552 server : fix smart slot selection (#8020)
2024-06-19 Michael de... un-ignore `build-info.cmake` and `build-info.sh` (...
2024-06-19 slaren ggml : synchronize threads using barriers (#7993)
2024-06-19 Georgi Gerganov codecov : remove (#8004)
2024-06-19 Meng, Hengyu [SYCL] refactor (#6408)
2024-06-18 jaime-m-p tokenizer : BPE fixes (#7530)
2024-06-18 Sigbjørn Skjæret Only use FIM middle token if it exists (#7648)
2024-06-18 jojorne Fix no gcc pragma on Windows (#7751)
2024-06-18 Ulrich Drepper Allow compiling with CUDA without CUDA runtime installe...
2024-06-18 Frank Mai chore: clean useless beam search param (#7985)
2024-06-18 Abheek Gulati readme : update UI list (#7943)
2024-06-18 Georgi Gerganov ggml : sync
2024-06-18 Georgi Gerganov whisper : use ggml_backend_sched (whisper/2239)
2024-06-17 Ștefan-Gabriel... update: support Qwen2-57B-A14B (#7835)
2024-06-17 Srihari-mcw Make updates to type cast based on compiler instead...
2024-06-17 Georgi Gerganov llama : disable FA if KV head size do not match (#7982)
2024-06-17 Bryan Honof Add Nix and Flox install instructions (#7899)
2024-06-17 slaren sched : offload_op also requires supports_op (#7977)
2024-06-17 Frank Mai fix: divide 0 exception in mamba (#7932)
2024-06-17 Markus Tavenrath Implement non-mapped async IO for CUDA on Windows....
2024-06-17 Georgi Gerganov rpc : fix load/store misaligned addresses (#7948)
2024-06-17 Brian gguf-dump.py: add --markdown dump output (#7853)
2024-06-17 Neo Zhang [SYCL] Update README-sycl.md for Chapter "Recommended...
2024-06-16 Calvin Laurenson Add support for sqrt on CUDA (#7953)
2024-06-16 Georgi Gerganov cuda : fix bounds check for src0 rows in MMVQ kernel...
2024-06-16 Hong Bo PENG ggml : fix and optimize ppc64le (ggml/849)
2024-06-16 Daniel Bevenius ggml : remove duplicate include of ggml-common.h (ggml...
2024-06-16 Georgi Gerganov flake.lock: Update (#7951)
2024-06-16 Georgi Gerganov unicode : avoid char32_t (#7957)
2024-06-16 hopkins385 readme : update UI list [no ci] (#7958)
2024-06-16 Georgi Gerganov ggml : fix handling of zero blocks in IQ quants (#7955)
2024-06-16 Georgi Gerganov github : update pr template
2024-06-16 0cc4m Vulkan Shader Refactor, Memory Debugging Option (#7947)
2024-06-15 Xuan Son Nguyen Add `cvector-generator` example (#7514)
2024-06-15 Meng, Hengyu [SYCL] remove global variables (#7710)
2024-06-14 olexiyb ci : fix macos x86 build (#7940)
2024-06-14 Johannes Gäßler CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)
2024-06-14 Georgi Gerganov metal : utilize max shared memory for mul_mat_id (...
2024-06-14 Radoslav Gerganov llama-bench : fix RPC indication (#7936)
2024-06-14 Sigbjørn Skjæret llama : more checks before assuming FIM tokens (#7644)
2024-06-14 Elaine convert : add Poro-34B-chat tokenizer support (#7713)
2024-06-13 Radoslav Gerganov rpc : fix ggml_backend_rpc_supports_buft() (#7918)
2024-06-13 Galunid readme : Remove outdated instructions from README.md...
2024-06-13 slaren move BLAS to a separate backend (#6210)
2024-06-12 Olivier Chafik `build`: rename main → llama-cli, server → llama-server...
2024-06-12 Johannes Gäßler CUDA: fix broken oob check for FA vec f32 kernel (...
2024-06-12 Georgi Gerganov tests : add non-cont unary tests (#7857)
2024-06-12 Georgi Gerganov ggml : improve ggml_is_contiguous logic (#7856)
2024-06-12 Georgi Gerganov server : restore numeric prompts (#7883)
2024-06-12 Meng, Hengyu update intel docker oneapi-basekit to 2024.1.1-devel...
2024-06-12 Patrice Ferlet Fix a typo and add Fedora 40 pacakge to install for...
2024-06-11 k.h.lai vulkan: select only one device for single gpu with...
2024-06-11 0cc4m Update Vulkan RoPE implementation (#7818)
2024-06-11 Deven Mistry fix broken link in pr template (#7880) [no ci]
2024-06-11 Brian github: move PR template to .github/ root (#7868)
2024-06-11 Johannes Gäßler llama-bench: more compact markdown tables (#7879)
2024-06-11 Georgi Gerganov tests : check the Python version (#7872)
2024-06-11 Johannes Gäßler CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K)...
2024-06-11 slaren fix CUDA CI by using a windows-2019 image (#7861)
2024-06-11 Olivier Chafik json: refine constraint for whitespace to avoid runaway...
2024-06-11 Olivier Chafik `json`: document schema conversion in GBNF readme,...
2024-06-10 Jared Van Bortel cmake : fix CMake requirement for CUDA (#7821)
2024-06-10 slaren ci : try win-2019 on server windows test (#7854)
2024-06-10 Georgi Gerganov examples : remove --instruct remnants (#7846)
2024-06-10 Georgi Gerganov server : improve "prompt" handling (#7847)
2024-06-10 Johannes Gäßler CUDA: use tensor cores for MMQ (#7676)
2024-06-10 Ben Ashbaugh use the correct SYCL context for host USM allocations...
2024-06-09 Georgi Gerganov flake.lock: Update (#7838)
2024-06-09 Georgi Gerganov imatrix : handle partial entries (#7833)
2024-06-09 Nicolás Pérez docs: Added initial PR template with directions for...
2024-06-09 mgroeber9110 server: do not remove whitespace at the start of a...
2024-06-09 Johannes Gäßler CUDA: revise q8_1 data layout for mul_mat_q (#7824)
2024-06-09 sasha0552 convert-hf : set the model name based on cli arg, if...
2024-06-09 compilade convert-hf : match model part name prefix and suffix...
2024-06-09 compilade gguf-py : decouple adding metadata from writing in...
2024-06-08 slaren Revert "[SYCL] Update rpc-server.cpp to include SYCL...
2024-06-08 Olivier Chafik url: save -mu downloads to new cache location (#7826)
2024-06-08 sasha0552 server : smart slot selection using Longest Common...
2024-06-07 slaren vulkan : reuse parent extra for views (#7806)
2024-06-07 Christian Zhou... gguf-split : change binary multi-byte units to decimal...
2024-06-07 intelmatt cmake : fix BUILD_SHARED_LIBS=ON build (#7784)