git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-03-29  Daniel Bevenius  llama : remove redundant reshape in build_kv_store...
2024-03-29  Pedro Cuenca  convert : allow conversion of Mistral HF models (#6144)
2024-03-28  Georgi Gerganov  readme : add notice for UI list
2024-03-28  Ouadie EL FAROUKI  [SYCL] Revisited & updated SYCL build documentation...
2024-03-28  Jared Van Bortel  convert : refactor vocab selection logic (#6355)
2024-03-28  Ziang Wu  llava : fix MobileVLM (#6364)
2024-03-28  compilade  llama : fix command-r inference when omitting outputs...
2024-03-28  Pierrick Hymbert  ci: bench: fix master not schedule, fix commit status...
2024-03-28  Ting Sun  doc: fix outdated default value of batch size (#6336)
2024-03-28  Eric Zhang  server : stop gracefully on SIGTERM (#6348)
2024-03-28  hutli  nix: removed unnessesary indentation
2024-03-28  hutli  nix: moved blas availability check to package inputs...
2024-03-28  hutli  using blas.meta.available to check host platform
2024-03-28  hutli  only using explicit blas if hostPlatform is allowed
2024-03-28  Someone Serge  nix: .#windows: proper cross-compilation set-up
2024-03-28  Someone Serge  nix: package: don't introduce the dependency on python
2024-03-28  hutli  nix: .#widnows: init
2024-03-28  Ziang Wu  doc: fix typo in MobileVLM-README.md (#6181)
2024-03-28  Neo Zhang Jianyu  [SYCL] fix set main gpu crash (#6339)
2024-03-27  Pierrick Hymbert  server: continuous performance monitoring and PR commen...
2024-03-27  Someone Serge  nix: ci: dont test cuda and rocm (for now)
2024-03-27  slaren  ggml : fix bounds checking of zero size views (#6347)
2024-03-27  Georgi Gerganov  make : whitespace
2024-03-27  howlger  embedding : show full embedding for single prompt ...
2024-03-27  AidanBeltonS  [SYCL] Fix batched impl for NVidia GPU (#6164)
2024-03-27  Kawrakow  Make IQ1_M work for QK_K = 64 (#6327)
2024-03-27  Sigbjørn Skjæret  common : change --no-penalize-nl to --penalize-nl ...
2024-03-27  Georgi Gerganov  llama2c : open file as binary (#6332)
2024-03-27  Mateusz Charytoniuk  readme : add php api bindings (#6326)
2024-03-27  Eric Zhang  server: public: use relative routes for static files...
2024-03-27  Neo Zhang Jianyu  [SYCL] fix no file in win rel (#6314)
2024-03-26  Jared Van Bortel  wpm : portable unicode tolower (#6305)
2024-03-26  compilade  llama : greatly reduce output buffer memory usage ...
2024-03-26  Kawrakow  IQ1_M: 1.75 bpw quantization (#6302)
2024-03-26  Pedro Cuenca  convert-hf : fix exception in sentencepiece with added...
2024-03-26  Kawrakow  quantize : be able to override metadata by key (#6321)
2024-03-26  Minsoo Cheong  embedding : adjust `n_ubatch` value (#6296)
2024-03-26  Jan Boon  server : add `n_discard` parameter (#6300)
2024-03-26  Joseph Stahl  nix: make `xcrun` visible in Nix sandbox for precompili...
2024-03-26  slaren  cuda : rename build flag to LLAMA_CUDA (#6299)
2024-03-25  Christian Kögler  nix: fix blas support (#6281)
2024-03-25  Kawrakow  tests : include IQ2_XXS and IQ2_XS in test-quantize...
2024-03-25  Georgi Gerganov  flake.lock: Update (#6266)
2024-03-25  slaren  cuda : fix LLAMA_CUDA_F16 build (#6298)
2024-03-25  slaren  cuda : refactor into multiple files (#6269)
2024-03-25  Xuan Son Nguyen  Server: clean up OAI params parsing function (#6284)
2024-03-25  Neo Zhang Jianyu  [SYCL] fix SYCL backend build on windows is break by...
2024-03-25  Minsoo Cheong  examples : add "retrieval" (#6193)
2024-03-25  Justine Tunney  ggml : support AVX512VNNI (#6280)
2024-03-24  Rick G  Fix heap corruption from wmode out-of-bound writes...
2024-03-24  Georgi Gerganov  imatrix : fix wname for mul_mat_id ops (#6271)
2024-03-24  Johannes Gäßler  Fixed lookup compilation issues on Windows (#6273)
2024-03-24  Pierrick Hymbert  ci : close inactive issue, increase operations per...
2024-03-24  Minsoo Cheong  sampling : deduplicated code for probability distributi...
2024-03-24  Meng, Hengyu  [SYCL] offload op (#6217)
2024-03-24  Neo Zhang Jianyu  Support build win release for SYCL (#6241)
2024-03-23  Jared Van Bortel  use _wfopen instead of fopen on Windows (#6248)
2024-03-23  Georgi Gerganov  gitignore : gguf-split
2024-03-23  Pierrick Hymbert  common: llama_load_model_from_url split support (...
2024-03-23  Pierrick Hymbert  server: docs: `--threads` and `--threads`, `--ubatch...
2024-03-23  Julius Arkenberg  llama : add grok-1 support (#6204)
2024-03-23  Pierrick Hymbert  split: add gguf-split in the make build target (#6262)
2024-03-23  Pierrick Hymbert  server: flush stdout after logging in both text and...
2024-03-23  Johannes Gäßler  lookup: complement data from context with general text...
2024-03-22  Georgi Gerganov  common : default --hf-file to --model (#6234)
2024-03-22  fraxy-v  convert-llama2c-to-ggml : enable conversion of GQA...
2024-03-22  Kawrakow  quantize: options for output and token embedding tensor...
2024-03-22  Pierrick Hymbert  llama_model_loader: support multiple split/shard GGUFs...
2024-03-22  Minsoo Cheong  ci: apply concurrency limit for github workflows (...
2024-03-22  Georgi Gerganov  common : add HF arg helpers (#6234)
2024-03-22  Nexesenex  llama : correction of the attn.v.weight quantization...
2024-03-22  Olivier Chafik  tests : conditional python & node json schema tests...
2024-03-22  Olivier Chafik  json-schema-to-grammar : fix order of props + non-str...
2024-03-22  slaren  cuda : add LLAMA_CUDA_NO_PEER_COPY to workaround broken...
2024-03-22  Xiaoyi Chen  readme : add RecurseChat to the list of UIs (#6219)
2024-03-22  Jan Boon  server : fix n_keep always showing as 0 in response...
2024-03-22  Georgi Gerganov  server : enable continuous batching by default (#6231)
2024-03-22  Georgi Gerganov  metal : proper assert for mat-mat memory alignment...
2024-03-22  Vaibhav Srivastav  ci : add CURL flag for the mac builds (#6214)
2024-03-22  Georgi Gerganov  metal : pad n_ctx by 32 (#6177)
2024-03-22  Neo Zhang Jianyu  add blog link (#6222)
2024-03-22  DAN™  Fix params underscore convert to dash. (#6203)
2024-03-21  Jan Boon  server : update readme doc from `slot_id` to `id_slot...
2024-03-21  slaren  cuda : disable host register by default (#6206)
2024-03-21  semidark  Corrected typo to wrong file (#6199)
2024-03-21  Georgi Gerganov  tests : disable system() calls (#6198)
2024-03-21  slaren  cuda : fix LLAMA_CUDA_F16 build (#6197)
2024-03-21  Kawrakow  ggml : same IQ4_NL quantization for CPU/CUDA/Metal...
2024-03-21  Olivier Chafik  json-schema-to-grammar improvements (+ added to server...
2024-03-21  Vaibhav Srivastav  ci : fix indentation error (#6195)
2024-03-21  Vaibhav Srivastav  build : add mac pre-build binaries (#6182)
2024-03-21  Kawrakow  Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized...
2024-03-21  AidanBeltonS  Add nvidia and amd backends (#6157)
2024-03-21  slaren  cuda : fix conflict with std::swap (#6186)
2024-03-20  slaren  cuda : print the returned error when CUDA initializatio...
2024-03-20  Ziang Wu  llava : update MobileVLM-README.md (#6180)
2024-03-20  Ziang Wu  llava : add MobileVLM_V2 backup (#6175)
2024-03-20  slaren  cuda : refactor to remove global resources (#6170)
2024-03-20  Xuan Son Nguyen  Server: version bump for httplib and json (#6169)
2024-03-20  Georgi Gerganov  gitignore : ignore curl-related files