git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-03-01  Georgi Gerganov  server : remove api_like_OAI.py proxy script (#5808)
2024-03-01  ddpasa  ggml-vulkan: fix VULKAN_CHECK_RESULTS flag, which was...
2024-03-01  kunal-vaishnavi  gemma : fix bfloat16 -> float16 conversion issue (...
2024-03-01  Miwa / Ensan  common : fix flag `--logits-all` to `--all-logits`...
2024-03-01  Pierrick Hymbert  llama : cleanup unused mmq flags (#5772)
2024-03-01  Douglas Hanley  unicode : switch to multimap based nfd_map (#5799)
2024-03-01  Pierrick Hymbert  server: allow to override threads server pool with...
2024-03-01  Eve  ci : add Ubuntu 22 Vulkan CI run (#5789)
2024-03-01  Georgi Gerganov  server : fix newlines in help (#5785)
2024-03-01  AidanBeltonS  [SYCL] Use batched mul_mat pathway (#5591)
2024-02-29  Xuan Son Nguyen  Server: normalize naming (#5779)
2024-02-29  Marcus Dunn  llama : constified `llama_set_state_data`'s `src` ...
2024-02-28  Georgi Gerganov  ci : reduce 3b ppl chunks to 1 to avoid timeout (#5771)
2024-02-28  Eve  make portability_enumeration_ext apple only (#5757)
2024-02-28  Georgi Gerganov  llama : remove deprecated API (#5770)
2024-02-28  Georgi Gerganov  awq-py : remove (#5768)
2024-02-28  Georgi Gerganov  sync : ggml
2024-02-28  slaren  add google magika inference example (ggml/748)
2024-02-28  UEXTM.com  Introduce backend GUIDs (ggml/743)
2024-02-28  Xuan Son Nguyen  server : hit Ctrl+C twice to exit (#5734)
2024-02-28  compilade  llama : fix non-quantization of expert gating tensors...
2024-02-28  Douglas Hanley  llama : improve BERT tokenization (#5740)
2024-02-28  Daniel Bevenius  readme : add link to LLaVA 1.6 models (#5758)
2024-02-28  Jorge A  server : add "/chat/completions" alias for "/v1/.....
2024-02-28  Kawrakow  ggml : make i-quants work with super-blocks of 64 ...
2024-02-27  Kawrakow  Attempt to fix android build (#5752)
2024-02-27  Kawrakow  IQ4_XS: a 4.25 bpw quantization (#5747)
2024-02-27  Engininja2  cuda : replace remaining shfl_xor with calls to warp_re...
2024-02-27  Engininja2  ggml-quants : fix avx2 iq1_s vec_dot when compiled...
2024-02-27  Georgi Gerganov  llama : fix defrag bugs + add parameter (#5735)
2024-02-27  le.chang  Makefile: use variables for cublas (#5689)
2024-02-26  Xuan Son Nguyen  fix server hangs on empty prompt (#5733)
2024-02-26  Kawrakow  Adding IQ2_S and IQ2_M to complete coverage of the...
2024-02-26  Johannes Gäßler  CUDA: fix DEBUG_CUDA_MALLOC (#5729)
2024-02-26  Artem  readme : update ui list (#5731)
2024-02-26  AidanBeltonS  [SYCL] Add support for soft_max ALiBi (#5639)
2024-02-26  Georgi Gerganov  unicode : reuse iterator (#5726)
2024-02-26  Pierrick Hymbert  server: CI fix trailing space (#5728)
2024-02-26  Pierrick Hymbert  server: CI tests reduce build matrix (#5725)
2024-02-26  Georgi Gerganov  llama : fix Gemma rope type (#5691)
2024-02-25  github-actions...  flake.lock: Update
2024-02-25  Pierrick Hymbert  server: tests - slow inference causes timeout on the...
2024-02-25  Pierrick Hymbert  server: docs - refresh and tease a little bit more...
2024-02-25  Georgi Gerganov  llama : refactor k-shift implementation + KV defragment...
2024-02-25  compilade  server : fix crash when system prompt is bigger than...
2024-02-25  Radosław Gryta  ggml-quants : provide ggml_vqtbl1q_u8 for 64bit compati...
2024-02-25  kwin1412  make : fix nvcc version is empty (#5713)
2024-02-25  Ashok Gelal  readme : add Msty to UI list (#5618)
2024-02-25  Pierrick Hymbert  server: logs - unified format and --log-format option...
2024-02-25  Pierrick Hymbert  server: concurrency fix + monitoring - add /metrics...
2024-02-25  Radosław Gryta  cmake : fix compilation for Android armeabi-v7a (#5702)
2024-02-25  Georgi Gerganov  code : normalize enum names (#5697)
2024-02-25  Anas Ahouzi  py : fix StableLM conversion after config.json changes...
2024-02-24  Pierrick Hymbert  server: continue to update other slots on embedding...
2024-02-24  Kawrakow  IQ3_S: a much better alternative to Q3_K (#5676)
2024-02-24  Pierrick Hymbert  server: init functional tests (#5566)
2024-02-23  AlpinDale  server : add KV cache quantization options (#5684)
2024-02-23  Jared Van Bortel  convert : fix missing ftype for gemma (#5690)
2024-02-22  Jared Van Bortel  mpt : do not duplicate token_embd.weight on disk (...
2024-02-22  Georgi Gerganov  gemma : use more bits for the token_embd.weight tensor...
2024-02-22  Georgi Gerganov  py : add Gemma conversion from HF models (#5647)
2024-02-22  Georgi Gerganov  ggml : always define ggml_fp16_t as uint16_t (#5666)
2024-02-22  Georgi Gerganov  sync : ggml
2024-02-22  Georgi Gerganov  ggml : 32-bit arm compat (whisper/1891)
2024-02-22  Someone  nix: init singularity and docker images (#5056)
2024-02-22  Georgi Gerganov  py : minor fixes (#5668)
2024-02-22  Xuan Son Nguyen  Add Gemma chat template (#5665)
2024-02-22  Someone  workflows: nix: hardcode cachix ids, build unconditiona...
2024-02-22  Georgi Gerganov  minor : fix trailing whitespace (#5638)
2024-02-22  Georgi Gerganov  readme : update hot topics
2024-02-22  Xuan Son Nguyen  server : fallback to chatml, add AlphaMonarch chat...
2024-02-22  Alexey Parfenov  server : clarify some params in the docs (#5640)
2024-02-22  Dat Quoc Nguyen  mpt : add optional bias tensors (#5638)
2024-02-21  slaren  llama : fix loading models with shared tok_embd and...
2024-02-21  Xuan Son Nguyen  Add docs for llama_chat_apply_template (#5645)
2024-02-21  slaren  llama : fix session save/load with quantized KV (#5649)
2024-02-21  slaren  gemma : allow offloading the output tensor (#5646)
2024-02-21  Jared Van Bortel  examples : do not assume BOS when shifting context...
2024-02-21  Georgi Gerganov  sync : ggml
2024-02-21  Pierrick Hymbert  server: health: fix race condition on slots data using...
2024-02-21  Ettore Di Giacinto  readme : add LocalAI to the availables UI (#5629)
2024-02-21  Georgi Gerganov  sync : ggml (#5633)
2024-02-21  Georgi Gerganov  readme : update hot topics
2024-02-21  Daniel Bevenius  llava : add --skip-unknown to 1.6 convert.py (#5632)
2024-02-21  postmasters  llama : add `gemma` model (#5631)
2024-02-21  Meng, Hengyu  [SYCL] conext add name (#5624)
2024-02-21  Kawrakow  IQ4_NL: 4-bit non-linear quants with blocks of 32 ...
2024-02-20  CJ Pais  server : support llava 1.6 (#5553)
2024-02-20  slaren  make : fix debug build with CUDA (#5616)
2024-02-20  Daniel Bevenius  llava : add explicit instructions for llava-1.6 (#5611)
2024-02-20  Xuan Son Nguyen  Server: use llama_chat_apply_template (#5593)
2024-02-20  Dane Madsen  readme : update UI list (#5605)
2024-02-20  Haoxiang Fei  metal : add build system support for embedded metal...
2024-02-20  Pierrick Hymbert  server : health endpoint configurable failure on no...
2024-02-20  AidanBeltonS  Update ggml_sycl_op_mul_mat_vec_q (#5502)
2024-02-19  Mathijs de...  nix: now that we can do so, allow MacOS to build Vulkan...
2024-02-19  0cc4m  Enable Vulkan MacOS CI
2024-02-19  0cc4m  Refactor validation and enumeration platform checks...
2024-02-19  0cc4m  Add check for VK_KHR_portability_enumeration for Molten...
2024-02-19  Mathijs de...  Add preprocessor checks for Apple devices.