git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-03-27  slaren  ggml : fix bounds checking of zero size views (#6347)
2024-03-27  Georgi Gerganov  make : whitespace
2024-03-27  howlger  embedding : show full embedding for single prompt ...
2024-03-27  AidanBeltonS  [SYCL] Fix batched impl for NVidia GPU (#6164)
2024-03-27  Kawrakow  Make IQ1_M work for QK_K = 64 (#6327)
2024-03-27  Sigbjørn Skjæret  common : change --no-penalize-nl to --penalize-nl ...
2024-03-27  Georgi Gerganov  llama2c : open file as binary (#6332)
2024-03-27  Mateusz Charytoniuk  readme : add php api bindings (#6326)
2024-03-27  Eric Zhang  server: public: use relative routes for static files...
2024-03-27  Neo Zhang Jianyu  [SYCL] fix no file in win rel (#6314)
2024-03-26  Jared Van Bortel  wpm : portable unicode tolower (#6305)
2024-03-26  compilade  llama : greatly reduce output buffer memory usage ...
2024-03-26  Kawrakow  IQ1_M: 1.75 bpw quantization (#6302)
2024-03-26  Pedro Cuenca  convert-hf : fix exception in sentencepiece with added...
2024-03-26  Kawrakow  quantize : be able to override metadata by key (#6321)
2024-03-26  Minsoo Cheong  embedding : adjust `n_ubatch` value (#6296)
2024-03-26  Jan Boon  server : add `n_discard` parameter (#6300)
2024-03-26  Joseph Stahl  nix: make `xcrun` visible in Nix sandbox for precompili...
2024-03-26  slaren  cuda : rename build flag to LLAMA_CUDA (#6299)
2024-03-25  Christian Kögler  nix: fix blas support (#6281)
2024-03-25  Kawrakow  tests : include IQ2_XXS and IQ2_XS in test-quantize...
2024-03-25  Georgi Gerganov  flake.lock: Update (#6266)
2024-03-25  slaren  cuda : fix LLAMA_CUDA_F16 build (#6298)
2024-03-25  slaren  cuda : refactor into multiple files (#6269)
2024-03-25  Xuan Son Nguyen  Server: clean up OAI params parsing function (#6284)
2024-03-25  Neo Zhang Jianyu  [SYCL] fix SYCL backend build on windows is break by...
2024-03-25  Minsoo Cheong  examples : add "retrieval" (#6193)
2024-03-25  Justine Tunney  ggml : support AVX512VNNI (#6280)
2024-03-24  Rick G  Fix heap corruption from wmode out-of-bound writes...
2024-03-24  Georgi Gerganov  imatrix : fix wname for mul_mat_id ops (#6271)
2024-03-24  Johannes Gäßler  Fixed lookup compilation issues on Windows (#6273)
2024-03-24  Pierrick Hymbert  ci : close inactive issue, increase operations per...
2024-03-24  Minsoo Cheong  sampling : deduplicated code for probability distributi...
2024-03-24  Meng, Hengyu  [SYCL] offload op (#6217)
2024-03-24  Neo Zhang Jianyu  Support build win release for SYCL (#6241)
2024-03-23  Jared Van Bortel  use _wfopen instead of fopen on Windows (#6248)
2024-03-23  Georgi Gerganov  gitignore : gguf-split
2024-03-23  Pierrick Hymbert  common: llama_load_model_from_url split support (...
2024-03-23  Pierrick Hymbert  server: docs: `--threads` and `--threads`, `--ubatch...
2024-03-23  Julius Arkenberg  llama : add grok-1 support (#6204)
2024-03-23  Pierrick Hymbert  split: add gguf-split in the make build target (#6262)
2024-03-23  Pierrick Hymbert  server: flush stdout after logging in both text and...
2024-03-23  Johannes Gäßler  lookup: complement data from context with general text...
2024-03-22  Georgi Gerganov  common : default --hf-file to --model (#6234)
2024-03-22  fraxy-v  convert-llama2c-to-ggml : enable conversion of GQA...
2024-03-22  Kawrakow  quantize: options for output and token embedding tensor...
2024-03-22  Pierrick Hymbert  llama_model_loader: support multiple split/shard GGUFs...
2024-03-22  Minsoo Cheong  ci: apply concurrency limit for github workflows (...
2024-03-22  Georgi Gerganov  common : add HF arg helpers (#6234)
2024-03-22  Nexesenex  llama : correction of the attn.v.weight quantization...
2024-03-22  Olivier Chafik  tests : conditional python & node json schema tests...
2024-03-22  Olivier Chafik  json-schema-to-grammar : fix order of props + non-str...
2024-03-22  slaren  cuda : add LLAMA_CUDA_NO_PEER_COPY to workaround broken...
2024-03-22  Xiaoyi Chen  readme : add RecurseChat to the list of UIs (#6219)
2024-03-22  Jan Boon  server : fix n_keep always showing as 0 in response...
2024-03-22  Georgi Gerganov  server : enable continuous batching by default (#6231)
2024-03-22  Georgi Gerganov  metal : proper assert for mat-mat memory alignment...
2024-03-22  Vaibhav Srivastav  ci : add CURL flag for the mac builds (#6214)
2024-03-22  Georgi Gerganov  metal : pad n_ctx by 32 (#6177)
2024-03-22  Neo Zhang Jianyu  add blog link (#6222)
2024-03-22  DAN™  Fix params underscore convert to dash. (#6203)
2024-03-21  Jan Boon  server : update readme doc from `slot_id` to `id_slot...
2024-03-21  slaren  cuda : disable host register by default (#6206)
2024-03-21  semidark  Corrected typo to wrong file (#6199)
2024-03-21  Georgi Gerganov  tests : disable system() calls (#6198)
2024-03-21  slaren  cuda : fix LLAMA_CUDA_F16 build (#6197)
2024-03-21  Kawrakow  ggml : same IQ4_NL quantization for CPU/CUDA/Metal...
2024-03-21  Olivier Chafik  json-schema-to-grammar improvements (+ added to server...
2024-03-21  Vaibhav Srivastav  ci : fix indentation error (#6195)
2024-03-21  Vaibhav Srivastav  build : add mac pre-build binaries (#6182)
2024-03-21  Kawrakow  Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized...
2024-03-21  AidanBeltonS  Add nvidia and amd backends (#6157)
2024-03-21  slaren  cuda : fix conflict with std::swap (#6186)
2024-03-20  slaren  cuda : print the returned error when CUDA initializatio...
2024-03-20  Ziang Wu  llava : update MobileVLM-README.md (#6180)
2024-03-20  Ziang Wu  llava : add MobileVLM_V2 backup (#6175)
2024-03-20  slaren  cuda : refactor to remove global resources (#6170)
2024-03-20  Xuan Son Nguyen  Server: version bump for httplib and json (#6169)
2024-03-20  Georgi Gerganov  gitignore : ignore curl-related files
2024-03-20  Georgi Gerganov  server : allow to override -ngl in tests (#6170)
2024-03-20  Georgi Gerganov  Revert "llava : add a MobileVLM_V2-1.7B backup (#6152)"
2024-03-20  Ziang Wu  llava : add a MobileVLM_V2-1.7B backup (#6152)
2024-03-20  Karthick  Server: Handle n_keep parameter in the request (#6174)
2024-03-20  Jared Van Bortel  server tests : more pythonic process management; fix...
2024-03-20  Neo Zhang Jianyu  update readme sycl for new update (#6151)
2024-03-20  Abhilash Majumder  increase igpu cluster limit (#6159)
2024-03-19  DAN™  Remove undeed header file. (#6158)
2024-03-19  Pierrick Hymbert  gguf-split: split and merge gguf per batch of tensors...
2024-03-19  Georgi Gerganov  common : disable repeat penalties by default (#6127)
2024-03-19  slaren  ci : exempt some labels from being tagged as stale...
2024-03-19  DAN™  common : print usage on '-h' and '--help' (#6145)
2024-03-18  github-actions...  flake.lock: Update
2024-03-18  Jared Van Bortel  mpt : implement backwards compatiblity with duped outpu...
2024-03-18  Felix  clip : fix memory leak (#6138)
2024-03-18  slaren  backend : set max split inputs to GGML_MAX_SRC (#6137)
2024-03-18  Georgi Gerganov  ci : disable stale issue messages (#6126)
2024-03-18  Georgi Gerganov  ci : temporary disable sanitizer builds (#6128)
2024-03-18  slaren  backend : offload large batches to GPU (#6083)
2024-03-18  DAN™  common : tidy-up argument parsing (#6105)
2024-03-18  Thérence  convert : add support for CamembertModel architecture...