git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
pkg/ggml/sources/llama.cpp
2024-01-17  Georgi Gerganov     backend : add eval callback (#4935)
2024-01-17  Georgi Gerganov     metal : create autorelease pool during library build...
2024-01-17  Georgi Gerganov     py : fix whitespace
2024-01-17  Georgi Gerganov     py : fix missing added_tokens_dict for SPM and BPE...
2024-01-17  Kawrakow            llama : use Q4_K for attn_v for Q2_K_S when n_gqa ...
2024-01-17  Paul Tsochantaris   metal : remove unnecessary nil check (#4986)
2024-01-17  David Renshaw       llama : fix copy/paste error in llama_sampling_params...
2024-01-16  Georgi Gerganov     py : remove unnecessary hasattr (#4903)
2024-01-16  Philip Taron        nix: remove nixConfig from flake.nix (#4984)
2024-01-16  Daniel Bevenius     finetune : add training data file to log message (...
2024-01-16  Kawrakow            ggml : importance matrix support for legacy quants...
2024-01-16  Maximilian...       examples : add complete parallel function calling examp...
2024-01-16  Georgi Gerganov     perplexity : fix kv cache handling for hellaswag (...
2024-01-16  Georgi Gerganov     flake.lock: update flake-parts, flake-parts/nixpkgs...
2024-01-16  Paul Tsochantaris   metal : localized logic in `ggml_metal_graph_compute...
2024-01-16  Neuman Vong         android : introduce starter project example (#4926)
2024-01-16  Alex Azarov         metal : replace loop of dispatch_async with dispatch_ap...
2024-01-16  Alex Azarov         metal : log `recommendedMaxWorkingSetSize` on iOS 16...
2024-01-16  Maximilian...       examples : fix and improve docs for the grammar generato...
2024-01-16  Justine Tunney      ggml : introduce GGML_CALL function annotation (#4850)
2024-01-16  Daniel Bevenius     finetune : use LLAMA_FILE_MAGIC_GGLA (#4961)
2024-01-16  stduhpf             speculative : threading options (#4959)
2024-01-15  ngc92               pass cpu-architecture arguments only to host code ...
2024-01-15  David Friehs        llama : apply classifier-free guidance to logits direct...
2024-01-15  Victor Z. Peng      awq-py : fix typo in awq-py/README.md (#4947)
2024-01-15  Georgi Gerganov     cuda : fix dequantize kernel names (#4938)
2024-01-15  Kawrakow            llama : check for 256 divisibility for IQ2_XS, IQ2_XXS...
2024-01-15  Kawrakow            CUDA: faster dequantize kernels for Q4_0 and Q4_1 ...
2024-01-14  David Pflug         llama : fix missing quotes (#4937)
2024-01-14  Kawrakow            Add ability to use importance matrix for all k-quants...
2024-01-14  Georgi Gerganov     llama : check LLAMA_TRACE env for extra logging (#4929)
2024-01-14  Georgi Gerganov     scripts : sync-ggml-am.sh option to skip commits
2024-01-14  Georgi Gerganov     llama : use LLAMA_LOG_ macros for logging
2024-01-14  Kawrakow            Fix ffn_down quantization mix for MoE models (#4927)
2024-01-14  Alex Azarov         metal : correctly set SIMD support flags on iOS (#4923)
2024-01-14  Karthik Kumar...    llama : support WinXP build with MinGW 8.1.0 (#3419)
2024-01-14  Kawrakow            2-bit quantizations (#4897)
2024-01-14  Kawrakow            Make Q3_K_S be the same as old Q3_K_L for Mixtral-8x7B...
2024-01-13  Georgi Gerganov     sync : ggml
2024-01-13  Johannes Gäßler     ggml: cache sin/cos for RoPE (#4908)
2024-01-13  Georgi Gerganov     metal : remove old API (#4919)
2024-01-13  Georgi Gerganov     server : fix prompt caching with system prompt (#4914)
2024-01-13  Georgi Gerganov     llama : fix detokenization of non-special added-tokens...
2024-01-13  Georgi Gerganov     metal : disable log for loaded kernels (#4794)
2024-01-13  David Friehs        llama : minimize size used for state save/load (#4820)
2024-01-13  Someone             workflows: unbreak nix-build-aarch64, and split it...
2024-01-13  Yann Follet         main : add parameter --no-display-prompt (#4541)
2024-01-13  texmex76            gguf : fix potential infinite for-loop (#4600)
2024-01-13  Georgi Gerganov     metal : refactor kernel loading code (#4794)
2024-01-13  Johannes Gäßler     compare-llama-bench: tweak output format (#4910)
2024-01-13  Ziad Ben Hadj...    server : fix deadlock that occurs in multi-prompt scena...
2024-01-13  makomk              server : fix crash with multimodal models without BOS...
2024-01-13  Georgi Gerganov     convert : update phi-2 to latest HF repo (#4903)
2024-01-12  Georgi Gerganov     sync : ggml
2024-01-12  Georgi Gerganov     ggml : fix 32-bit ARM compat for IQ2_XS (whisper/1758)
2024-01-12  slaren              backend_sched : fix assignments
2024-01-12  Maximilian...       examples : add pydantic models to GBNF grammar generato...
2024-01-12  Johannes Gäßler     CUDA: faster q8_0 -> f16 dequantization (#4895)
2024-01-12  slaren              llama : ggml-backend integration (#4766)
2024-01-12  Georgi Gerganov     llama : remove redundant assert for StableLM (#4901)
2024-01-12  Daniel Bevenius     export-lora : use LLAMA_FILE_MAGIC_GGLA (#4894)
2024-01-12  Zay                 llama.swiftui : update models layout (#4826)
2024-01-12  Georgi Gerganov     gitignore : imatrix
2024-01-12  Johannes Gäßler     CUDA: fix softmax compile for old CUDA versions (#4862)
2024-01-12  Georgi Gerganov     llama : fix typo "imp_embd" -> "inp_embd"
2024-01-12  howlger             common : streamline the formatting of help (#4890)
2024-01-12  Georgi Gerganov     py : fix lint (#4889)
2024-01-12  Georgi Gerganov     llama : fix llm_build_k_shift to use correct n_rot...
2024-01-12  Kawrakow            Importance Matrix calculation (#4861)
2024-01-11  Georgi Gerganov     server : fix infill when prompt is empty (#4833)
2024-01-11  Georgi Gerganov     main : better name for variable n_print (#4874)
2024-01-11  Georgi Gerganov     main : disable token count by default (#4874)
2024-01-11  Georgi Gerganov     swift : track ggml release branch (#4867)
2024-01-11  Kawrakow            llama : restore intended k-quants mixes for MoE models...
2024-01-11  Kawrakow            ggml : SOTA 2-bit quants (add IQ2_XS) (#4856)
2024-01-11  Georgi Gerganov     swift : pin ggml commit + remove ggml.h from spm-header...
2024-01-11  Laura               server : implement credentialed CORS (#4514)
2024-01-11  Michael Coppola     server : support for multiple api keys (#4864)
2024-01-11  Behnam M            server : add `LOG_INFO` when model is successfully...
2024-01-11  Someone             ci: nix-flake-update: new token with pr permissions...
2024-01-11  pudepiedj           main : print total token count and tokens consumed...
2024-01-11  Isaac McFadyen      server : fix typo in model name (#4876)
2024-01-11  Paul Tsochantaris   metal : put encoder debug group behind a define (#4873)
2024-01-11  Georgi Gerganov     sync : ggml
2024-01-11  Georgi Gerganov     metal : fix deprecation warning (ggml/690)
2024-01-11  Timothy Cronin      ggml : remove ggml_cpy_inplace and ggml_cont_inplace...
2024-01-11  Jack Mousseau       metal : wrap each operation in debug group (ggml/690)
2024-01-11  leejet              ggml : change GGML_MAX_NAME at compile time (ggml/682)
2024-01-11  Halalaluyafail3     Fix execlp call (ggml/689)
2024-01-11  Erik Scholz         fix : cuda order of synchronization when setting a...
2024-01-11  Behnam M            server : update readme to document the new `/health...
2024-01-11  Georgi Gerganov     server : fix build + rename enums (#4870)
2024-01-10  Behnam M            server : add a `/health` endpoint (#4860)
2024-01-10  Brian               llama : add additional suffixes for model params (...
2024-01-10  Austin              llama : recognize 1B phi models (#4847)
2024-01-10  John                clip : support more quantization types (#4846)
2024-01-10  Johannes Gäßler     Python script to compare commits with llama-bench ...
2024-01-09  Austin              convert.py : fix vanilla LLaMA model conversion (#4818)
2024-01-09  Justine Tunney      llava-cli : don't crash if --image flag is invalid...
2024-01-09  Georgi Gerganov     metal : improve dequantize precision to match CPU ...