2024-06-14 | Radoslav Gerganov | llama-bench : fix RPC indication (#7936)
2024-06-14 | Sigbjørn Skjæret | llama : more checks before assuming FIM tokens (#7644)
2024-06-14 | Elaine | convert : add Poro-34B-chat tokenizer support (#7713)
2024-06-13 | Radoslav Gerganov | rpc : fix ggml_backend_rpc_supports_buft() (#7918)
2024-06-13 | Galunid | readme : Remove outdated instructions from README.md...
2024-06-13 | slaren | move BLAS to a separate backend (#6210)
2024-06-12 | Olivier Chafik | `build`: rename main → llama-cli, server → llama-server...
2024-06-12 | Johannes Gäßler | CUDA: fix broken oob check for FA vec f32 kernel (...
2024-06-12 | Georgi Gerganov | tests : add non-cont unary tests (#7857)
2024-06-12 | Georgi Gerganov | ggml : improve ggml_is_contiguous logic (#7856)
2024-06-12 | Georgi Gerganov | server : restore numeric prompts (#7883)
2024-06-12 | Meng, Hengyu | update intel docker oneapi-basekit to 2024.1.1-devel...
2024-06-12 | Patrice Ferlet | Fix a typo and add Fedora 40 pacakge to install for...
2024-06-11 | k.h.lai | vulkan: select only one device for single gpu with...
2024-06-11 | 0cc4m | Update Vulkan RoPE implementation (#7818)
2024-06-11 | Deven Mistry | fix broken link in pr template (#7880) [no ci]
2024-06-11 | Brian | github: move PR template to .github/ root (#7868)
2024-06-11 | Johannes Gäßler | llama-bench: more compact markdown tables (#7879)
2024-06-11 | Georgi Gerganov | tests : check the Python version (#7872)
2024-06-11 | Johannes Gäßler | CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K)...
2024-06-11 | slaren | fix CUDA CI by using a windows-2019 image (#7861)
2024-06-11 | Olivier Chafik | json: refine constraint for whitespace to avoid runaway...
2024-06-11 | Olivier Chafik | `json`: document schema conversion in GBNF readme,...
2024-06-10 | Jared Van Bortel | cmake : fix CMake requirement for CUDA (#7821)
2024-06-10 | slaren | ci : try win-2019 on server windows test (#7854)
2024-06-10 | Georgi Gerganov | examples : remove --instruct remnants (#7846)
2024-06-10 | Georgi Gerganov | server : improve "prompt" handling (#7847)
2024-06-10 | Johannes Gäßler | CUDA: use tensor cores for MMQ (#7676)
2024-06-10 | Ben Ashbaugh | use the correct SYCL context for host USM allocations...
2024-06-09 | Georgi Gerganov | flake.lock: Update (#7838)
2024-06-09 | Georgi Gerganov | imatrix : handle partial entries (#7833)
2024-06-09 | Nicolás Pérez | docs: Added initial PR template with directions for...
2024-06-09 | mgroeber9110 | server: do not remove whitespace at the start of a...
2024-06-09 | Johannes Gäßler | CUDA: revise q8_1 data layout for mul_mat_q (#7824)
2024-06-09 | sasha0552 | convert-hf : set the model name based on cli arg, if...
2024-06-09 | compilade | convert-hf : match model part name prefix and suffix...
2024-06-09 | compilade | gguf-py : decouple adding metadata from writing in...
2024-06-08 | slaren | Revert "[SYCL] Update rpc-server.cpp to include SYCL...
2024-06-08 | Olivier Chafik | url: save -mu downloads to new cache location (#7826)
2024-06-08 | sasha0552 | server : smart slot selection using Longest Common...
2024-06-07 | slaren | vulkan : reuse parent extra for views (#7806)
2024-06-07 | Christian Zhou... | gguf-split : change binary multi-byte units to decimal...
2024-06-07 | intelmatt | cmake : fix BUILD_SHARED_LIBS=ON build (#7784)
2024-06-07 | Johannes Gäßler | server: update cache_prompt documentation [no ci] ...
2024-06-07 | woodx | server : do not get prompt in infill mode (#7286)
2024-06-07 | pengxin99 | [SYCL] fix softmax r2r result wrong issue (#7811)
2024-06-07 | slaren | check for nans in imatrix and quantize (#7807)
2024-06-06 | Georgi Gerganov | server : fix --threads-http arg (#7801)
2024-06-06 | Georgi Gerganov | imatrix : migrate to gpt_params (#7771)
2024-06-06 | Clint Herron | Added support for . (any character) token in grammar...
2024-06-06 | Mattheus Chediak | README minor fixes (#7798) [no ci]
2024-06-06 | Olivier Chafik | grammars: x{min,max} repetition operator (#6640)
2024-06-06 | Joan Fontanals | llama : add jina v2 base code (#7596)
2024-06-06 | slaren | docker : build only main and server in their images...
2024-06-06 | slaren | docker : add openmp lib (#7780)
2024-06-05 | Galunid | Fix encoding in python scripts (#7733)
2024-06-05 | Johannes Gäßler | CUDA: refactor mmq, dmmv, mmvq (#7716)
2024-06-05 | Georgi Gerganov | ggml : refactor rope norm/neox (#7634)
2024-06-05 | arch-btw | readme : remove -ins (#7759)
2024-06-04 | jaime-m-p | Fix per token atrributes bits (#7749)
2024-06-04 | agray3 | Allow number of nodes in CUDA graph to change (#7738)
2024-06-04 | Georgi Gerganov | common : refactor cli arg parsing (#7675)
2024-06-04 | Georgi Gerganov | ggml : remove OpenCL (#7735)
2024-06-04 | Georgi Gerganov | llama : remove beam search (#7736)
2024-06-04 | Georgi Gerganov | readme : remove obsolete Zig instructions (#7471)
2024-06-04 | slaren | llama-bench : allow using a different printer for stder...
2024-06-04 | Daniele | Improve hipBLAS support in CMake (#7696)
2024-06-04 | zhouwg | refine .gitignore (#7688)
2024-06-04 | jaime-m-p | Per token attributes (#7685)
2024-06-04 | Georgi Gerganov | ggml : prevent builds with -ffinite-math-only (#7726)
2024-06-03 | Radoslav Gerganov | llama : offload to RPC in addition to other backends...
2024-06-03 | Masaya, Kato | ggml : use OpenMP as a thread pool (#7606)
2024-06-03 | Johannes Gäßler | make: fix debug options not being applied to NVCC ...
2024-06-03 | 0cc4m | Vulkan Mixture of Experts (MoE) support (#7628)
2024-06-03 | Andy Tai | cmake : add pkg-config spec file for llama.cpp (#7702)
2024-06-03 | zhangkaihuo | llama : MiniCPM support tied embeddings (#7664)
2024-06-03 | Georgi Gerganov | llama : avoid double token-to-piece cache (#7654)
2024-06-03 | woachk | kompute : implement op_getrows_f32 (#6403)
2024-06-02 | Dave Airlie | fix bug introduced in using calloc (#7701)
2024-06-02 | Georgi Gerganov | flake.lock: Update (#7686)
2024-06-02 | Austin | chore : add ignore rule for generated server themes...
2024-06-02 | nickp27 | [SYCL] Update rpc-server.cpp to include SYCL backend...
2024-06-01 | Johannes Gäßler | Fix FlashAttention debug test, FP32 assert (#7684)
2024-06-01 | Yazan Agha... | server : new UI (#7633)
2024-06-01 | HanishKVC | SimpleChat: Simple histogram/repeatMatching driven...
2024-06-01 | Johannes Gäßler | CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8...
2024-06-01 | Johannes Gäßler | CUDA: quantized KV support for FA vec (#7527)
2024-05-31 | Georgi Gerganov | server : update js (#7670)
2024-05-31 | Galunid | convert-hf : Handle NotImplementedError in convert...
2024-05-31 | Johannes Gäßler | scripts: update compare_llama_bench.py [no ci] (#7673)
2024-05-31 | Daniele | Improve HIP compatibility (#7672)
2024-05-31 | Georgi Gerganov | readme : link homebrew discussion
2024-05-31 | Georgi Gerganov | ggml : fix loongson compile warnings (#7537)
2024-05-31 | Galunid | Somehow '**' got lost (#7663)
2024-05-31 | Galunid | Add convert.py removal to hot topics (#7662)
2024-05-30 | Sertaç Özercan | [no ci] docs: add aikit to readme (#7650)
2024-05-30 | JohnnyB | Fixed painfully slow single process builds. (#7326)
2024-05-30 | Georgi Gerganov | llama : cache llama_token_to_piece (#7587)
2024-05-30 | Martin Delille | Fix conan badge display [no ci] (#7645)
2024-05-30 | Manuel | Add brew installation instruction to README [no ci...