2024-07-02 |
slaren | cuda : update supports_op for matrix multiplication... |
2024-07-02 |
luoyu-intel | [SYCL] Fix win build conflict of math library (#8230) |
2024-07-02 |
luoyu-intel | [SYCL] Fix the sub group size of Intel (#8106) |
2024-07-01 |
Xuan Son Nguyen | Fix gemma2 tokenizer convert (#8244) |
2024-07-01 |
Johannes Gäßler | CUDA: refactor and optimize IQ MMVQ (#8215) |
2024-07-01 |
Mateusz Charytoniuk | readme: add Paddler to the list of projects (#8239) |
2024-07-01 |
Xuan Son Nguyen | gemma2: add sliding window mask (#8227) |
2024-07-01 |
Roni | readme : update tool list (#8209) |
2024-07-01 |
Michael Francis | nix : enable curl (#8043) |
2024-07-01 |
Georgi Gerganov | nix : remove OpenCL remnants (#8235) |
2024-07-01 |
iacore | Document BERT support. (#8205) |
2024-07-01 |
zhentaoyu | [SYCL] Update SYCL-Rope op and Refactor (#8157) |
2024-06-30 |
Georgi Gerganov | flake.lock: Update (#8218) |
2024-06-30 |
Xuan Son Nguyen | Fix new line issue with chat template, disable template... |
2024-06-30 |
Andrei | llama: Add attention and final logit soft-capping,... |
2024-06-28 |
Xuan Son Nguyen | fix code typo in llama-cli (#8198) |
2024-06-28 |
Olivier Chafik | json: attempt to skip slow tests when running under... |
2024-06-28 |
Xuan Son Nguyen | Add MiniCPM, Deepseek V2 chat template + clean up ... |
2024-06-28 |
Sigbjørn Skjæret | Add SPM infill support (#8016) |
2024-06-28 |
slaren | cmake : allow user to override default options (#8178) |
2024-06-28 |
Olivier Chafik | `json`: restore default additionalProperties to false... |
2024-06-28 |
pculliton | llama: Add support for Gemma2ForCausalLM (#8156) |
2024-06-28 |
Xuan Son Nguyen | Add missing items in makefile (#8177) |
2024-06-27 |
Olivier Chafik | `json`: update grammars/README w/ examples & note about... |
2024-06-27 |
loonerin | CI: fix release build (Ubuntu+Mac) (#8170) |
2024-06-27 |
slaren | cmake : fix deprecated option names not working (#8171) |
2024-06-27 |
Xuan Son Nguyen | Add chatml fallback for cpp `llama_chat_apply_template... |
2024-06-27 |
Georgi Gerganov | flake.lock: Update (#8071) |
2024-06-27 |
jukofyork | Control vector loading fixes (#8137) |
2024-06-27 |
Raj Hammeer... | Delete examples/llama.android/llama/CMakeLists.txt... |
2024-06-27 |
Sigbjørn Skjæret | Add Qwen2MoE 57B-A14B model identifier (#8158) |
2024-06-27 |
Johannes Gäßler | CUDA: fix MMQ stream-k for --split-mode row (#8167) |
2024-06-27 |
kustaaya | Added support for Viking pre-tokenizer (#8135) |
2024-06-27 |
Sigbjørn Skjæret | llama : fix CodeLlama FIM token checks (#8144) |
2024-06-27 |
Raj Hammeer... | Fix llama-android.cpp for error - "common/common.h... |
2024-06-26 |
Daniel Bevenius | clip : suppress unused variable warnings (#8105) |
2024-06-26 |
Georgi Gerganov | scripts : fix filename sync |
2024-06-26 |
slaren | ci : publish new docker images only when the files... |
2024-06-26 |
slaren | ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CU... |
2024-06-26 |
slaren | make : fix missing -O3 (#8143) |
2024-06-26 |
Georgi Gerganov | sync : ggml |
2024-06-26 |
Georgi Gerganov | authors : regen |
2024-06-26 |
Georgi Gerganov | devops : remove clblast + LLAMA_CUDA -> GGML_CUDA ... |
2024-06-26 |
Georgi Gerganov | readme : update API notes |
2024-06-26 |
Georgi Gerganov | llama : reorganize source code + improve CMake (#8006) |
2024-06-26 |
Isaac McFadyen | Clarify default MMQ for CUDA and LLAMA_CUDA_FORCE_MMQ... |
2024-06-26 |
Johannes Gäßler | CUDA: fix misaligned shared memory read (#8123) |
2024-06-26 |
Eddie-Wang | llama : extend llm_build_ffn() to support _scale tensor... |
2024-06-26 |
Olivier Chafik | `json`: better support for "type" unions (e.g. nullable... |
2024-06-26 |
Olivier Chafik | `json`: fix additionalProperties, allow space after... |
2024-06-25 |
jukofyork | fixes #7999 (adds control vectors to all `build_XXX... |
2024-06-25 |
fairydreaming | llama : implement Unigram tokenizer needed by T5 and... |
2024-06-25 |
Daniel Bevenius | llama : return nullptr from llama_grammar_init (#8093) |
2024-06-25 |
Olivier Chafik | `json`: support integer minimum, maximum, exclusiveMini... |
2024-06-25 |
slaren | disable docker CI on pull requests (#8110) |
2024-06-25 |
joecryptotoo | Add healthchecks to llama-server containers (#8081) |
2024-06-25 |
Brian | Gguf dump start data offset via --data-offset and some... |
2024-06-25 |
Xuan Son Nguyen | cvector: better prompt handling, add "mean vector"... |
2024-06-25 |
Xuan Son Nguyen | Add chat template support for llama-cli (#8068) |
2024-06-25 |
HanishKVC | SimpleChat v3.1: Boolean chat request options in Settin... |
2024-06-25 |
HatsuneMikuUwU33 | Update control vector help (#8104) |
2024-06-25 |
Meng, Hengyu | [SYCL] Re-enabled mul_mat_batched_sycl (#8095) |
2024-06-24 |
Johannes Gäßler | CUDA: fix matrix multiplication algorithm choice (... |
2024-06-24 |
Johannes Gäßler | CUDA: fix MMQ writeback for int8 tensor cores (#8100) |
2024-06-24 |
Johannes Gäßler | CUDA: use MMQ instead of cuBLAS by default (#8075) |
2024-06-24 |
fairydreaming | gguf-py : fix tensor groups for encoder-decoder models... |
2024-06-24 |
Johannes Gäßler | CUDA: optimize MMQ int8 tensor core performance (#8062) |
2024-06-24 |
Christian Zhou... | Option to split during conversion (#6942) |
2024-06-24 |
slaren | disable publishing the full-rocm docker image (#8083) |
2024-06-24 |
Yann Follet | embedding : more cli arguments (#7458) |
2024-06-24 |
fairydreaming | gguf-py, convert-hf : model conversion support for... |
2024-06-24 |
slaren | ggml : remove ggml_task_type and GGML_PERF (#8017) |
2024-06-23 |
Eddie-Wang | llama : add support for BitnetForCausalLM (#7931) |
2024-06-23 |
Aarni Koskela | server : fix JSON-Scheme typo (#7975) |
2024-06-23 |
Daniel Bevenius | Fix typo in llama_set_embeddings comment (#8077) |
2024-06-23 |
slaren | fix CI failures (#8066) |
2024-06-23 |
0cc4m | Refactor Vulkan backend to allow multiple contexts... |
2024-06-22 |
Clint Herron | Removing extra blank lines that were breaking Lint... |
2024-06-22 |
Xuan Son Nguyen | cvector: fix CI + correct help message (#8064) |
2024-06-22 |
HatsuneMikuUwU33 | cvector-generator: Moe Moe Fixie-Fixie for Lots of... |
2024-06-22 |
0xspringtime | convert-hf : change assert to exception (#8015) |
2024-06-22 |
ddh0 | Update llama-quantize ppl/file size output from LLaMA... |
2024-06-22 |
Clint Herron | JSON Schema to GBNF integration tests (#7790) |
2024-06-21 |
k.h.lai | vulkan: detect multiple devices by deviceUUID instead... |
2024-06-21 |
Eve | ggml : AVX IQ quants (#7845) |
2024-06-21 |
Georgi Gerganov | llama : optimize long word tokenization with WPM (... |
2024-06-21 |
Douglas Hanley | llama : allow pooled embeddings on any model (#7477) |
2024-06-21 |
Shuichi Tsutsumi | swiftui : enable stream updating (#7754) |
2024-06-20 |
Hamdoud Hakem | requirements : Bump torch and numpy for python3.12... |
2024-06-20 |
Hamdoud Hakem | convert-hf : Fix the encoding in the convert-hf-to... |
2024-06-20 |
Johannes Gäßler | common: fix warning (#8036) |
2024-06-20 |
luoyu-intel | [SYCL] Fix windows build and inference (#8003) |
2024-06-20 |
Johannes Gäßler | CUDA: stream-k decomposition for MMQ (#8018) |
2024-06-20 |
Michael de... | metal : fix `ggml_metal_supports_op` for BF16 (#8021) |
2024-06-19 |
sasha0552 | server : fix smart slot selection (#8020) |
2024-06-19 |
Michael de... | un-ignore `build-info.cmake` and `build-info.sh` (... |
2024-06-19 |
slaren | ggml : synchronize threads using barriers (#7993) |
2024-06-19 |
Georgi Gerganov | codecov : remove (#8004) |
2024-06-19 |
Meng, Hengyu | [SYCL] refactor (#6408) |
2024-06-18 |
jaime-m-p | tokenizer : BPE fixes (#7530) |