git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-03-22  Georgi Gerganov  common : default --hf-file to --model (#6234)
2024-03-22  fraxy-v  convert-llama2c-to-ggml : enable conversion of GQA...
2024-03-22  Kawrakow  quantize: options for output and token embedding tensor...
2024-03-22  Pierrick Hymbert  llama_model_loader: support multiple split/shard GGUFs...
2024-03-22  Minsoo Cheong  ci: apply concurrency limit for github workflows (...
2024-03-22  Georgi Gerganov  common : add HF arg helpers (#6234)
2024-03-22  Nexesenex  llama : correction of the attn.v.weight quantization...
2024-03-22  Olivier Chafik  tests : conditional python & node json schema tests...
2024-03-22  Olivier Chafik  json-schema-to-grammar : fix order of props + non-str...
2024-03-22  slaren  cuda : add LLAMA_CUDA_NO_PEER_COPY to workaround broken...
2024-03-22  Xiaoyi Chen  readme : add RecurseChat to the list of UIs (#6219)
2024-03-22  Jan Boon  server : fix n_keep always showing as 0 in response...
2024-03-22  Georgi Gerganov  server : enable continuous batching by default (#6231)
2024-03-22  Georgi Gerganov  metal : proper assert for mat-mat memory alignment...
2024-03-22  Vaibhav Srivastav  ci : add CURL flag for the mac builds (#6214)
2024-03-22  Georgi Gerganov  metal : pad n_ctx by 32 (#6177)
2024-03-22  Neo Zhang Jianyu  add blog link (#6222)
2024-03-22  DAN™  Fix params underscore convert to dash. (#6203)
2024-03-21  Jan Boon  server : update readme doc from `slot_id` to `id_slot...
2024-03-21  slaren  cuda : disable host register by default (#6206)
2024-03-21  semidark  Corrected typo to wrong file (#6199)
2024-03-21  Georgi Gerganov  tests : disable system() calls (#6198)
2024-03-21  slaren  cuda : fix LLAMA_CUDA_F16 build (#6197)
2024-03-21  Kawrakow  ggml : same IQ4_NL quantization for CPU/CUDA/Metal...
2024-03-21  Olivier Chafik  json-schema-to-grammar improvements (+ added to server...
2024-03-21  Vaibhav Srivastav  ci : fix indentation error (#6195)
2024-03-21  Vaibhav Srivastav  build : add mac pre-build binaries (#6182)
2024-03-21  Kawrakow  Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized...
2024-03-21  AidanBeltonS  Add nvidia and amd backends (#6157)
2024-03-21  slaren  cuda : fix conflict with std::swap (#6186)
2024-03-20  slaren  cuda : print the returned error when CUDA initializatio...
2024-03-20  Ziang Wu  llava : update MobileVLM-README.md (#6180)
2024-03-20  Ziang Wu  llava : add MobileVLM_V2 backup (#6175)
2024-03-20  slaren  cuda : refactor to remove global resources (#6170)
2024-03-20  Xuan Son Nguyen  Server: version bump for httplib and json (#6169)
2024-03-20  Georgi Gerganov  gitignore : ignore curl-related files
2024-03-20  Georgi Gerganov  server : allow to override -ngl in tests (#6170)
2024-03-20  Georgi Gerganov  Revert "llava : add a MobileVLM_V2-1.7B backup (#6152)"
2024-03-20  Ziang Wu  llava : add a MobileVLM_V2-1.7B backup (#6152)
2024-03-20  Karthick  Server: Handle n_keep parameter in the request (#6174)
2024-03-20  Jared Van Bortel  server tests : more pythonic process management; fix...
2024-03-20  Neo Zhang Jianyu  update readme sycl for new update (#6151)
2024-03-20  Abhilash Majumder  increase igpu cluster limit (#6159)
2024-03-19  DAN™  Remove undeed header file. (#6158)
2024-03-19  Pierrick Hymbert  gguf-split: split and merge gguf per batch of tensors...
2024-03-19  Georgi Gerganov  common : disable repeat penalties by default (#6127)
2024-03-19  slaren  ci : exempt some labels from being tagged as stale...
2024-03-19  DAN™  common : print usage on '-h' and '--help' (#6145)
2024-03-18  github-actions...  flake.lock: Update
2024-03-18  Jared Van Bortel  mpt : implement backwards compatiblity with duped outpu...
2024-03-18  Felix  clip : fix memory leak (#6138)
2024-03-18  slaren  backend : set max split inputs to GGML_MAX_SRC (#6137)
2024-03-18  Georgi Gerganov  ci : disable stale issue messages (#6126)
2024-03-18  Georgi Gerganov  ci : temporary disable sanitizer builds (#6128)
2024-03-18  slaren  backend : offload large batches to GPU (#6083)
2024-03-18  DAN™  common : tidy-up argument parsing (#6105)
2024-03-18  Thérence  convert : add support for CamembertModel architecture...
2024-03-18  Romain D  convert : use f32 outtype for bf16 tensors (#6106)
2024-03-17  Pierrick Hymbert  common: llama_load_model_from_url using --model-url...
2024-03-17  Georgi Gerganov  ci : close all stale issues at once (#6115)
2024-03-17  GainLee  ggml:fix finding transfer queue family index error...
2024-03-16  AmirAli Mirian  ggml : add AVX512F SIMD (#6088)
2024-03-16  Daniel Bevenius  gritlm : add initial README.md (#6086)
2024-03-16  Xuan Son Nguyen  readme : add wllama as a wasm binding (#6100)
2024-03-16  DAN™  common : refactor nested if causing error C1061 on...
2024-03-16  Pierrick Hymbert  ci : close inactive issue with workflow (#6053)
2024-03-15  slaren  llama : fix Baichuan2 13B (#6092)
2024-03-15  Theia Vogel  llama : add support for control vectors (#5970)
2024-03-15  Andrew Canis  llama : add Command-R support (#6033)
2024-03-15  Ting Lou  llava : change API to pure C style for Rust FFI bindgen...
2024-03-15  slaren  cuda : disable unused cudaLaunchHostFunc code (#6078)
2024-03-15  Neo Zhang Jianyu  fix set main gpu error (#6073)
2024-03-15  Georgi Gerganov  make : ggml-metal.o depends on ggml.h
2024-03-15  AidanBeltonS  [SYCL] Fix non-intel device selection (#6042)
2024-03-15  Ondřej Čertík  gguf : add support for I64 and F64 arrays (#6062)
2024-03-15  Xuan Son Nguyen  llama : add Orion chat template (#6066)
2024-03-15  slaren  llama-bench : use random tokens to improve accuracy...
2024-03-14  Georgi Gerganov  llama : fix integer overflow during quantization (...
2024-03-14  Steve Grubb  gguf : fix resource leaks (#6061)
2024-03-14  Ondřej Čertík  gguf-py : bump version to 0.8.0 (#6060)
2024-03-14  Michael Podvitskiy  llama : support models without vocabulary (#5798)
2024-03-14  Georgi Gerganov  embedding : add EOS token if not present (#899)
2024-03-14  Georgi Gerganov  gguf-py : fix dtype check (#6045)
2024-03-14  Jian Liao  readme : improve readme for Llava-1.6 example (#6044)
2024-03-14  Pierrick Hymbert  server: disable debug release type sanitizer, simplify...
2024-03-14  Georgi Gerganov  llama : fix typo
2024-03-14  Michael Podvitskiy  llama : optimize defrag moves + fix fragmentation calcu...
2024-03-14  Ondřej Čertík  gguf-py : add support for I8, I16 and I32 (#6045)
2024-03-14  Georgi Gerganov  ggml : designate enum vals for integer types (#6050)
2024-03-14  Georgi Gerganov  embedding : print all resulting embeddings (#899)
2024-03-14  Georgi Gerganov  metal : build metallib + fix embed path (#6015)
2024-03-14  Georgi Gerganov  embedding : print cosine similarity (#899)
2024-03-13  Linwei Wang  readme : update details about running llama in Termux...
2024-03-13  Georgi Gerganov  readme : update API changes and hot topics
2024-03-13  Clint Herron  grammar : handle missing "root" node (#6004)
2024-03-13  slaren  llama : add pipeline parallelism support (#6017)
2024-03-13  slaren  test-backend-ops : skip CPU backend by default (#6028)
2024-03-13  AidanBeltonS  Update get version (#6025)
2024-03-13  Xuan Son Nguyen  Server: Use multi-task for embeddings endpoint (#6001)
2024-03-12  slaren  ci : remove tidy-review (#6021)