2024-02-07 | Ben Williams | readme : update ui list (#5354)
2024-02-07 | runfuture | llama : add MiniCPM support (#5346)
2024-02-07 | Justin Parker | server : update `/props` with "total_slots" value ...
2024-02-07 | Sang-Kil Park | convert : fix TypeError on GPT-2 vocab.json (#5288)
2024-02-06 | Alexey Parfenov | server : remove model.json endpoint (#5371)
2024-02-06 | Johannes Gäßler | CUDA: mul_mat_vec_q max. batch size 8 -> 4 (#5370)
2024-02-06 | Kawrakow | Update README.md (#5366)
2024-02-06 | Kawrakow | Slight quantization improvement for Q4_K and Q5_K ...
2024-02-06 | BarfingLemurs | readme : add phi, orion 14b, internlm2, and yi-VL to...
2024-02-06 | Johannes Gäßler | CUDA: mul_mat_vec_q for batch sizes > 1 (#5351)
2024-02-06 | Justin Parker | server : include total "num_slots" in props endpoint...
2024-02-06 | Michael Coppola | server : add `dynatemp_range` and `dynatemp_exponent...
2024-02-06 | Niall Coates | server : various fixes for the prompt field in /complet...
2024-02-06 | Georgi Gerganov | py : handle byte tokens in `get_token_type` (#5341)
2024-02-05 | Johannes Gäßler | make: Use ccache for faster compilation (#5318)
2024-02-05 | Johannes Gäßler | README: updated introduction (#5343)
2024-02-05 | Kawrakow | ggml : make use of ggml-quants.h possible in C++ code...
2024-02-05 | Dr. Tom Murphy... | ggml : avoid duplicating function calls using MIN/MAX...
2024-02-05 | Kawrakow | iq3_xxs: quards for the no-imatrix situation (#5334)
2024-02-05 | Guoteng | py : fix internlm2-hf convert to gguf (#5305)
2024-02-05 | Kawrakow | iq2_xxs: tune quantization (#5320)
2024-02-05 | Alexey Parfenov | server : allow to get default generation settings for...
2024-02-05 | l3utterfly | common : add dynamic temperature parameters to main...
2024-02-05 | Georgi Gerganov | scripts : fix typos, cleanup (#5303)
2024-02-05 | Нияз Гарифзянов | scripts : add non-interactive server-llm.sh (#5303)
2024-02-05 | chiranko | readme : add CodeShell models to the supported models...
2024-02-05 | AidanBeltonS | [SYCL] Fix cpy with dims of 3 (#5289)
2024-02-04 | github-actions... | flake.lock: Update
2024-02-04 | Kawrakow | Adding some imatrix tools (#5302)
2024-02-04 | Welby Seely | cmake : use set() for LLAMA_WIN_VER (#5298)
2024-02-03 | Johannes Gäßler | make: add nvcc info print (#5310)
2024-02-03 | Johannes Gäßler | make: fix nvcc optimization flags for host code (#5309)
2024-02-03 | Martin Schwaighofer | add Vulkan support to Nix flake
2024-02-03 | 0cc4m | Vulkan Intel Fixes, Optimizations and Debugging Flags...
2024-02-03 | Michael Klimenko | refactor : switch to emplace_back to avoid extra object...
2024-02-03 | Jared Van Bortel | YaRN : store rope scaling type as int32_t in memory...
2024-02-03 | BADR | readme : add tenere in the ui tools list (#5284)
2024-02-03 | AidanBeltonS | Fix im2col with 32fp (#5286)
2024-02-02 | kalomaze | perplexity : fix KL divergence calculations on Windows...
2024-02-02 | Georgi Gerganov | scripts : parse wtype in server-llm.sh (#5167)
2024-02-02 | Mirror Azure | py : add check for '.attn.masked_bias' layers to GPT2mo...
2024-02-02 | AidanBeltonS | Tidy ggml-sycl (#5261)
2024-02-02 | Xuan Son Nguyen | docker : add build for SYCL, Vulkan + update readme...
2024-02-02 | Meng, Hengyu | [SYCL] get MAX_MEM_ALLOC from device property (#5270)
2024-02-02 | Neo Zhang Jianyu | [SYCL] update guide of SYCL backend (#5254)
2024-02-02 | Ian Bull | llama : fix memory leak in llama_batch_free (#5252)
2024-02-01 | Neo Zhang Jianyu | add --no-mmap in llama-bench (#5257)
2024-02-01 | 0cc4m | Vulkan Phi Fix for AMD Proprietary Drivers (#5260)
2024-02-01 | slaren | cuda : fix LLAMA_CUDA_F16 (#5262)
2024-02-01 | Ali Nehzat | make : generate .a library for static linking (#5205)
2024-02-01 | Guoteng | llama : support InternLM2 (#5184)
2024-01-31 | Eve | Fix broken Vulkan Cmake (properly) (#5230)
2024-01-31 | Georgi Gerganov | llama : reorder build_orion() at correct place (#5118)
2024-01-31 | Georgi Gerganov | llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU...
2024-01-31 | Georgi Gerganov | metal : add im2col F32 dst support (#5132)
2024-01-31 | JidongZhang-THU | llava : add MobileVLM support (#5132)
2024-01-31 | Neo Zhang Jianyu | format license text, restore apache license by legal...
2024-01-31 | slaren | ggml : limit n_threads to the max n_tasks (#5238)
2024-01-31 | 0cc4m | Vulkan Fixes (#5223)
2024-01-31 | Yiming Cui | Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231)
2024-01-31 | Neo Zhang Jianyu | support SYCL backend windows build (#5208)
2024-01-31 | Jared Van Bortel | kompute : llama-bench support and ggml_cpu_has_kompute...
2024-01-30 | Georgi Gerganov | Revert "server : change deps.sh xxd files to string...
2024-01-30 | Georgi Gerganov | server : fix context shift (#5195)
2024-01-30 | JohnnyB | server : change deps.sh xxd files to string literals...
2024-01-30 | Kawrakow | ggml : fix IQ3_XXS on Metal (#5219)
2024-01-30 | Georgi Gerganov | sync : ggml (#0)
2024-01-30 | Georgi Gerganov | gguf : fix comparison (ggml/715)
2024-01-30 | John Balis | `ggml_cuda_cpy` support for 4d tensors and float16...
2024-01-30 | Georgi Gerganov | gguf : add input validation, prevent integer overflows...
2024-01-30 | Georgi Gerganov | ci : fix yolo URLs + fix metal capture (ggml/712)
2024-01-30 | Jack Mousseau | metal : add debug capture backend function (ggml/694)
2024-01-30 | Kawrakow | Faster AVX2 dot product for IQ2_XS (#5187)
2024-01-30 | Kawrakow | SOTA 3-bit quants (#5196)
2024-01-30 | 0cc4m | Vulkan Windows APU Memory Handling (#5199)
2024-01-30 | Vladimir Malyutin | quantize : fix typo (#5211)
2024-01-30 | divinity76 | main : allow empty --prompt-cache file (#5176)
2024-01-30 | Romain Neutron | readme : minor (#5204)
2024-01-30 | Georgi Gerganov | readme : update hot topics
2024-01-30 | Wu Jian Ping | server : improve README (#5209)
2024-01-29 | Paul Tsochantaris | ggml alloc: Fix for null dereference on alloc failure...
2024-01-29 | Jared Van Bortel | kompute : fix fallback to CPU (#5201)
2024-01-29 | Jared Van Bortel | Nomic Vulkan backend (#4456)
2024-01-29 | divinity76 | fix typo "RLIMIT_MLOCK" (#5175)
2024-01-29 | Wu Jian Ping | server : embeddings compatibility for OpenAI (#5190)
2024-01-29 | Georgi Gerganov | py : fix except (#5194)
2024-01-29 | Sang-Kil Park | py : improve BPE tokenizer support (#5189)
2024-01-29 | slaren | ggml : add max buffer sizes to opencl and metal backend...
2024-01-29 | Eve | cmake : fix Vulkan build (#5182)
2024-01-28 | Paul Tsochantaris | metal : free metal objects (#5161)
2024-01-28 | Georgi Gerganov | sync : ggml
2024-01-28 | Georgi Gerganov | ggml : minor type fix (int64_t -> size_t)
2024-01-28 | 0cc4m | ggml : add Vulkan backend (#2059)
2024-01-28 | Abhilash Majumder | ggml : add unified SYCL backend for Intel GPUs (#2690)
2024-01-28 | Georgi Gerganov | flake.lock: Update (#5162)
2024-01-28 | Johannes Gäßler | Apply min_p to unsorted tokens (#5115)
2024-01-28 | Johannes Gäßler | Tests for min_p, sampling queue (#5147)
2024-01-28 | Marcus Dunn | readme : add link to rust bindings (#5148)
2024-01-28 | sharpHL | llama : add support for Orion-14B (#5118)
2024-01-28 | Kyle Mistele | docker : add server-first container images (#5157)