git.djapps.eu Git - pkg/ggml/sources/llama.cpp - shortlog
2024-01-15  Kawrakow  CUDA: faster dequantize kernels for Q4_0 and Q4_1 ...
2024-01-14  David Pflug  llama : fix missing quotes (#4937)
2024-01-14  Kawrakow  Add ability to use importance matrix for all k-quants...
2024-01-14  Georgi Gerganov  llama : check LLAMA_TRACE env for extra logging (#4929)
2024-01-14  Georgi Gerganov  scripts : sync-ggml-am.sh option to skip commits
2024-01-14  Georgi Gerganov  llama : use LLAMA_LOG_ macros for logging
2024-01-14  Kawrakow  Fix ffn_down quantization mix for MoE models (#4927)
2024-01-14  Alex Azarov  metal : correctly set SIMD support flags on iOS (#4923)
2024-01-14  Karthik Kumar...  llama : support WinXP build with MinGW 8.1.0 (#3419)
2024-01-14  Kawrakow  2-bit quantizations (#4897)
2024-01-14  Kawrakow  Make Q3_K_S be the same as olf Q3_K_L for Mixtral-8x7B...
2024-01-13  Georgi Gerganov  sync : ggml
2024-01-13  Johannes Gäßler  ggml: cache sin/cos for RoPE (#4908)
2024-01-13  Georgi Gerganov  metal : remove old API (#4919)
2024-01-13  Georgi Gerganov  server : fix prompt caching with system prompt (#4914)
2024-01-13  Georgi Gerganov  llama : fix detokenization of non-special added-tokens...
2024-01-13  Georgi Gerganov  metal : disable log for loaded kernels (#4794)
2024-01-13  David Friehs  llama : minimize size used for state save/load (#4820)
2024-01-13  Someone  workflows: unbreak nix-build-aarch64, and split it...
2024-01-13  Yann Follet  main : add parameter --no-display-prompt (#4541)
2024-01-13  texmex76  gguf : fix potential infinite for-loop (#4600)
2024-01-13  Georgi Gerganov  metal : refactor kernel loading code (#4794)
2024-01-13  Johannes Gäßler  compare-llama-bench: tweak output format (#4910)
2024-01-13  Ziad Ben Hadj...  server : fix deadlock that occurs in multi-prompt scena...
2024-01-13  makomk  server : fix crash with multimodal models without BOS...
2024-01-13  Georgi Gerganov  convert : update phi-2 to latest HF repo (#4903)
2024-01-12  Georgi Gerganov  sync : ggml
2024-01-12  Georgi Gerganov  ggml : fix 32-bit ARM compat for IQ2_XS (whisper/1758)
2024-01-12  slaren  backend_sched : fix assignments
2024-01-12  Maximilian...  examples : add pydantic models to GBNF grammar generato...
2024-01-12  Johannes Gäßler  CUDA: faster q8_0 -> f16 dequantization (#4895)
2024-01-12  slaren  llama : ggml-backend integration (#4766)
2024-01-12  Georgi Gerganov  llama : remove redundant assert for StableLM (#4901)
2024-01-12  Daniel Bevenius  export-lora : use LLAMA_FILE_MAGIC_GGLA (#4894)
2024-01-12  Zay  llama.swiftui : update models layout (#4826)
2024-01-12  Georgi Gerganov  gitignore : imatrix
2024-01-12  Johannes Gäßler  CUDA: fix softmax compile for old CUDA versions (#4862)
2024-01-12  Georgi Gerganov  llama : fix typo "imp_embd" -> "inp_embd"
2024-01-12  howlger  common : streamline the formatting of help (#4890)
2024-01-12  Georgi Gerganov  py : fix lint (#4889)
2024-01-12  Georgi Gerganov  llama : fix llm_build_k_shift to use correct n_rot...
2024-01-12  Kawrakow  Importance Matrix calculation (#4861)
2024-01-11  Georgi Gerganov  server : fix infill when prompt is empty (#4833)
2024-01-11  Georgi Gerganov  main : better name for variable n_print (#4874)
2024-01-11  Georgi Gerganov  main : disable token count by default (#4874)
2024-01-11  Georgi Gerganov  swift : track ggml release branch (#4867)
2024-01-11  Kawrakow  llama : restore intended k-quants mixes for MoE models...
2024-01-11  Kawrakow  ggml : SOTA 2-bit quants (add IQ2_XS) (#4856)
2024-01-11  Georgi Gerganov  swift : pin ggml commit + remove ggml.h from spm-header...
2024-01-11  Laura  server : implement credentialed CORS (#4514)
2024-01-11  Michael Coppola  server : support for multiple api keys (#4864)
2024-01-11  Behnam M  server : add `LOG_INFO` when model is successfully...
2024-01-11  Someone  ci: nix-flake-update: new token with pr permissions...
2024-01-11  pudepiedj  main : print total token count and tokens consumed...
2024-01-11  Isaac McFadyen  server : fix typo in model name (#4876)
2024-01-11  Paul Tsochantaris  metal : put encoder debug group behind a define (#4873)
2024-01-11  Georgi Gerganov  sync : ggml
2024-01-11  Georgi Gerganov  metal : fix deprecation warning (ggml/690)
2024-01-11  Timothy Cronin  ggml : remove ggml_cpy_inplace and ggml_cont_inplace...
2024-01-11  Jack Mousseau  metal : wrap each operation in debug group (ggml/690)
2024-01-11  leejet  ggml : change GGML_MAX_NAME at compile time (ggml/682)
2024-01-11  Halalaluyafail3  Fix execlp call (ggml/689)
2024-01-11  Erik Scholz  fix : cuda order of synchronization when setting a...
2024-01-11  Behnam M  server : update readme to document the new `/health...
2024-01-11  Georgi Gerganov  server : fix build + rename enums (#4870)
2024-01-10  Behnam M  server : add a `/health` endpoint (#4860)
2024-01-10  Brian  llama : add additional suffixes for model params (...
2024-01-10  Austin  llama : recognize 1B phi models (#4847)
2024-01-10  John  clip : support more quantization types (#4846)
2024-01-10  Johannes Gäßler  Python script to compare commits with llama-bench ...
2024-01-09  Austin  convert.py : fix vanilla LLaMA model conversion (#4818)
2024-01-09  Justine Tunney  llava-cli : don't crash if --image flag is invalid...
2024-01-09  Georgi Gerganov  metal : improve dequantize precision to match CPU ...
2024-01-09  Georgi Gerganov  scripts : improve get-pg.sh (#4838)
2024-01-09  iohub  readme : add 3rd party collama reference to UI list...
2024-01-09  Georgi Gerganov  scripts : script to get Paul Graham essays in txt forma...
2024-01-09  Behnam M  server : update readme about token probs (#4777)
2024-01-09  Zsapi  server : add api-key flag to documentation (#4832)
2024-01-09  Georgi Gerganov  ggml : fix vld1q_s8_x4 32-bit compat (#4828)
2024-01-09  Johannes Gäßler  CUDA: faster softmax via shared memory + fp16 math...
2024-01-08  howlger  common : fix the short form of `--grp-attn-w`, not...
2024-01-08  Georgi Gerganov  readme : add link to SOTA models
2024-01-08  Kawrakow  SOTA 2-bit quants (#4773)
2024-01-08  Georgi Gerganov  swift : exclude ggml-metal.metal from the package ...
2024-01-08  Georgi Gerganov  llama.swiftui : update readme
2024-01-08  Georgi Gerganov  main : add self-extend support (#4815)
2024-01-08  Georgi Gerganov  examples : add passkey test (#3856)
2024-01-07  Lars Grammel  readme : add lgrammel/modelfusion JS/TS client for...
2024-01-07  slaren  llama-bench : add no-kv-offload parameter (#4812)
2024-01-07  Johannes Gäßler  CUDA: fixed redundant value dequantization (#4809)
2024-01-07  Georgi Gerganov  llama : remove unused vars (#4796)
2024-01-07  Georgi Gerganov  llama : remove redundant GQA check (#4796)
2024-01-07  Alex Azarov  llama.swiftui : use llama.cpp as SPM package (#4804)
2024-01-07  Georgi Gerganov  llama : print tensor meta for debugging
2024-01-07  Alex Azarov  llama.swiftui : add visionOS target (#4805)
2024-01-07  Konstantin...  ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11...
2024-01-07  Georgi Gerganov  server : fix n_predict check (#4798)
2024-01-06  Daniel Illescas...  llama.swiftui : use correct pointer for llama_token_eos...
2024-01-06  Georgi Gerganov  examples : improve base-translate.sh script (#4783)
2024-01-05  a-n-n-a-l-e-e  cmake : check for openblas64 (#4134)