2024-02-11 | Georgi Gerganov | ggml : fix compile warnings (unused vars) (#4966)
2024-02-11 | snadampal | ggml : add mmla kernels for quantized GEMM (#4966)
2024-02-11 | Johannes Gäßler | lookup: add print for drafting performance (#5450)
2024-02-11 | Xuan Son Nguyen | server : add llama2 chat template (#5425)
2024-02-10 | Ian Bull | metal : use autoreleasepool to avoid memory leaks ...
2024-02-10 | Georgi Gerganov | scripts : update sync scripts with new backends
2024-02-10 | Georgi Gerganov | sync : ggml
2024-02-10 | Michael Podvitskiy | ggml : add abort_callback for cpu backend (ggml/725)
2024-02-09 | Neuman Vong | vulkan: Set limit for task concurrency (#5427)
2024-02-09 | Daniel Bevenius | llava : add requirements.txt and update README.md ...
2024-02-09 | Riley Stewart | server : fix prompt caching for repeated prompts (...
2024-02-09 | Paul Tsochantaris | llama : do not cap thread count when MoE on CPU (#5419)
2024-02-09 | Marko Tasic | readme : add JavaScript/Wasm repo (#5415)
2024-02-09 | Michael Podvitskiy | ggml : fix `error C2078: too many initializers` for...
2024-02-09 | 0cc4m | Fix Vulkan crash on APUs with very little device memory...
2024-02-08 | Johannes Gäßler | CUDA: more warps for mmvq on NVIDIA (#5394)
2024-02-08 | slaren | llama : do not print "offloading layers" message in...
2024-02-08 | Abhilash Majumder | Fix f16_sycl cpy call from Arc (#5411)
2024-02-08 | Daniel Bevenius | llava : add missing .py, and fix paths in README.md...
2024-02-08 | Johannes Gäßler | fix trailing whitespace (#5407)
2024-02-08 | runfuture | llama : fix MiniCPM (#5392)
2024-02-08 | Daniel Bevenius | llava: fix typo/formatting in README.md (#5405)
2024-02-08 | Johannes Gäßler | sampling: fix top_k <= 0 (#5388)
2024-02-08 | Georgi Gerganov | tests : .gitignore obj files
2024-02-07 | Michael Podvitskiy | CMAKE_OSX_ARCHITECTURES for MacOS cross compilation...
2024-02-07 | Ebey Abraham | fix typo in readme (#5399)
2024-02-07 | Kamil Tomšík | Add Ava in the list of llama.cpp UIs (#4362)
2024-02-07 | Johannes Gäßler | CUDA: fixed mmvq kernel for bs 2,3,4 and -sm row (...
2024-02-07 | Neo Zhang Jianyu | [SYCL] update install make by w64devkit (#5297)
2024-02-07 | Xiao-Yong Jin | llava-cli : always tokenize special tokens (#5382)
2024-02-07 | 0cc4m | Basic Vulkan Multi-GPU implementation (#5321)
2024-02-07 | Eve | readme : modernize (#5379)
2024-02-07 | Ben Williams | readme : update ui list (#5354)
2024-02-07 | runfuture | llama : add MiniCPM support (#5346)
2024-02-07 | Justin Parker | server : update `/props` with "total_slots" value ...
2024-02-07 | Sang-Kil Park | convert : fix TypeError on GPT-2 vocab.json (#5288)
2024-02-06 | Alexey Parfenov | server : remove model.json endpoint (#5371)
2024-02-06 | Johannes Gäßler | CUDA: mul_mat_vec_q max. batch size 8 -> 4 (#5370)
2024-02-06 | Kawrakow | Update README.md (#5366)
2024-02-06 | Kawrakow | Slight quantization improvement for Q4_K and Q5_K ...
2024-02-06 | BarfingLemurs | readme : add phi, orion 14b, internlm2, and yi-VL to...
2024-02-06 | Johannes Gäßler | CUDA: mul_mat_vec_q for batch sizes > 1 (#5351)
2024-02-06 | Justin Parker | server : include total "num_slots" in props endpoint...
2024-02-06 | Michael Coppola | server : add `dynatemp_range` and `dynatemp_exponent...
2024-02-06 | Niall Coates | server : various fixes for the prompt field in /complet...
2024-02-06 | Georgi Gerganov | py : handle byte tokens in `get_token_type` (#5341)
2024-02-05 | Johannes Gäßler | make: Use ccache for faster compilation (#5318)
2024-02-05 | Johannes Gäßler | README: updated introduction (#5343)
2024-02-05 | Kawrakow | ggml : make use of ggml-quants.h possible in C++ code...
2024-02-05 | Dr. Tom Murphy... | ggml : avoid duplicating function calls using MIN/MAX...
2024-02-05 | Kawrakow | iq3_xxs: quards for the no-imatrix situation (#5334)
2024-02-05 | Guoteng | py : fix internlm2-hf convert to gguf (#5305)
2024-02-05 | Kawrakow | iq2_xxs: tune quantization (#5320)
2024-02-05 | Alexey Parfenov | server : allow to get default generation settings for...
2024-02-05 | l3utterfly | common : add dynamic temperature parameters to main...
2024-02-05 | Georgi Gerganov | scripts : fix typos, cleanup (#5303)
2024-02-05 | Нияз Гарифзянов | scripts : add non-interactive server-llm.sh (#5303)
2024-02-05 | chiranko | readme : add CodeShell models to the supported models...
2024-02-05 | AidanBeltonS | [SYCL] Fix cpy with dims of 3 (#5289)
2024-02-04 | github-actions... | flake.lock: Update
2024-02-04 | Kawrakow | Adding some imatrix tools (#5302)
2024-02-04 | Welby Seely | cmake : use set() for LLAMA_WIN_VER (#5298)
2024-02-03 | Johannes Gäßler | make: add nvcc info print (#5310)
2024-02-03 | Johannes Gäßler | make: fix nvcc optimization flags for host code (#5309)
2024-02-03 | Martin Schwaighofer | add Vulkan support to Nix flake
2024-02-03 | 0cc4m | Vulkan Intel Fixes, Optimizations and Debugging Flags...
2024-02-03 | Michael Klimenko | refactor : switch to emplace_back to avoid extra object...
2024-02-03 | Jared Van Bortel | YaRN : store rope scaling type as int32_t in memory...
2024-02-03 | BADR | readme : add tenere in the ui tools list (#5284)
2024-02-03 | AidanBeltonS | Fix im2col with 32fp (#5286)
2024-02-02 | kalomaze | perplexity : fix KL divergence calculations on Windows...
2024-02-02 | Georgi Gerganov | scripts : parse wtype in server-llm.sh (#5167)
2024-02-02 | Mirror Azure | py : add check for '.attn.masked_bias' layers to GPT2mo...
2024-02-02 | AidanBeltonS | Tidy ggml-sycl (#5261)
2024-02-02 | Xuan Son Nguyen | docker : add build for SYCL, Vulkan + update readme...
2024-02-02 | Meng, Hengyu | [SYCL] get MAX_MEM_ALLOC from device property (#5270)
2024-02-02 | Neo Zhang Jianyu | [SYCL] update guide of SYCL backend (#5254)
2024-02-02 | Ian Bull | llama : fix memory leak in llama_batch_free (#5252)
2024-02-01 | Neo Zhang Jianyu | add --no-mmap in llama-bench (#5257)
2024-02-01 | 0cc4m | Vulkan Phi Fix for AMD Proprietary Drivers (#5260)
2024-02-01 | slaren | cuda : fix LLAMA_CUDA_F16 (#5262)
2024-02-01 | Ali Nehzat | make : generate .a library for static linking (#5205)
2024-02-01 | Guoteng | llama : support InternLM2 (#5184)
2024-01-31 | Eve | Fix broken Vulkan Cmake (properly) (#5230)
2024-01-31 | Georgi Gerganov | llama : reorder build_orion() at correct place (#5118)
2024-01-31 | Georgi Gerganov | llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU...
2024-01-31 | Georgi Gerganov | metal : add im2col F32 dst support (#5132)
2024-01-31 | JidongZhang-THU | llava : add MobileVLM support (#5132)
2024-01-31 | Neo Zhang Jianyu | format license text, restore apache license by legal...
2024-01-31 | slaren | ggml : limit n_threads to the max n_tasks (#5238)
2024-01-31 | 0cc4m | Vulkan Fixes (#5223)
2024-01-31 | Yiming Cui | Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231)
2024-01-31 | Neo Zhang Jianyu | support SYCL backend windows build (#5208)
2024-01-31 | Jared Van Bortel | kompute : llama-bench support and ggml_cpu_has_kompute...
2024-01-30 | Georgi Gerganov | Revert "server : change deps.sh xxd files to string...
2024-01-30 | Georgi Gerganov | server : fix context shift (#5195)
2024-01-30 | JohnnyB | server : change deps.sh xxd files to string literals...
2024-01-30 | Kawrakow | ggml : fix IQ3_XXS on Metal (#5219)
2024-01-30 | Georgi Gerganov | sync : ggml (#0)
2024-01-30 | Georgi Gerganov | gguf : fix comparison (ggml/715)