git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
pkg/ggml/sources/llama.cpp
2024-01-20  slaren  llama : run all KQV ops on the CPU with no KV offload...
2024-01-20  Herman Semenov  cmake : add support for ccache (#5002)
2024-01-20  adel boussaken  Add a dart/flutter binding to README.md (#4882)
2024-01-20  Kylin  cuda : fix compile error in jetson platform (#4975)
2024-01-19  Uzo Nweke  finetune : fix ggml_allocr lifetimes (tmp workaround...
2024-01-19  Georgi Gerganov  imatrix : add README.md
2024-01-19  Shijie  llama : support upcoming Qwen2 (#5037)
2024-01-19  Georgi Gerganov  py : fix flake8 lint
2024-01-19  Kawrakow  winogrande: evaluate log-probs in parallel (#5036)
2024-01-19  chiranko  llama : add CodeShell support (#5016)
2024-01-19  Kawrakow  perplexity: avoid unnecessary alloocations and logit...
2024-01-19  Georgi Gerganov  perplexity : faster Winogrande via batching (#5024)
2024-01-18  John  llama : fix falcon arch for tied output embeddings...
2024-01-18  Georgi Gerganov  cmake : add ggml public headers (#5011)
2024-01-18  Xuan Son Nguyen  server : defer tasks when "slot unavailable" (#5018)
2024-01-18  slaren  llama : fix mlock with no-mmap with Metal (#5025)
2024-01-18  Georgi Gerganov  imatrix : fix assert for src0 non-cont check
2024-01-18  Georgi Gerganov  perplexity : fix winogrande N tasks option
2024-01-18  Georgi Gerganov  scripts : add get-winogrande.sh
2024-01-18  David Sommers  convert.py : fix llama/llama2 conversion due to vocab_s...
2024-01-18  Kawrakow  HellaSwag: speed up by parallelizing log-prob evaluatio...
2024-01-18  Georgi Gerganov  perplexity : faster HellaSwag via batching (#5017)
2024-01-18  Kawrakow  Add Winogrande evaluation (#5015)
2024-01-18  Georgi Gerganov  scritps : add helper script to get hellaswag data in...
2024-01-18  Paul Tsochantaris  metal : fix memory leak, dangling pointer and unused...
2024-01-17  Georgi Gerganov  sync : ggml
2024-01-17  Georgi Gerganov  ggml : add IQ2 to test-backend-ops + refactoring (...
2024-01-17  Georgi Gerganov  imatrix : offload to GPU support (#4957)
2024-01-17  Georgi Gerganov  backend : add eval callback (#4935)
2024-01-17  Georgi Gerganov  metal : create autorelease pool during library build...
2024-01-17  Georgi Gerganov  py : fix whitespace
2024-01-17  Georgi Gerganov  py : fix missing added_tokens_dict for SPM and BPE...
2024-01-17  Kawrakow  llama : use Q4_K for attn_v for Q2_K_S when n_gqa ...
2024-01-17  Paul Tsochantaris  metal : remove unnecessary nil check (#4986)
2024-01-17  David Renshaw  llama : fix copy/paste error in llama_sampling_params...
2024-01-16  Georgi Gerganov  py : remove unnecessary hasattr (#4903)
2024-01-16  Philip Taron  nix: remove nixConfig from flake.nix (#4984)
2024-01-16  Daniel Bevenius  finetune : add training data file to log message (...
2024-01-16  Kawrakow  ggml : importance matrix support for legacy quants...
2024-01-16  Maximilian...  examples : add complete parallel function calling examp...
2024-01-16  Georgi Gerganov  perplexity : fix kv cache handling for hellaswag (...
2024-01-16  Georgi Gerganov  flake.lock: update flake-parts, flake-parts/nixpkgs...
2024-01-16  Paul Tsochantaris  metal : localized logic in `ggml_metal_graph_compute...
2024-01-16  Neuman Vong  android : introduce starter project example (#4926)
2024-01-16  Alex Azarov  metal : replace loop of dispatch_async with dispatch_ap...
2024-01-16  Alex Azarov  metal : log `recommendedMaxWorkingSetSize` on iOS 16...
2024-01-16  Maximilian...  examples : fix and improv docs for the grammar generato...
2024-01-16  Justine Tunney  ggml : introduce GGML_CALL function annotation (#4850)
2024-01-16  Daniel Bevenius  finetune : use LLAMA_FILE_MAGIC_GGLA (#4961)
2024-01-16  stduhpf  speculative : threading options (#4959)
2024-01-15  ngc92  pass cpu-architecture arguments only to host code ...
2024-01-15  David Friehs  llama : apply classifier-free guidance to logits direct...
2024-01-15  Victor Z. Peng  awq-py : fix typo in awq-py/README.md (#4947)
2024-01-15  Georgi Gerganov  cuda : fix dequantize kernel names (#4938)
2024-01-15  Kawrakow  llama : check for 256 divisibility for IQ2_XS, IQ2_XXS...
2024-01-15  Kawrakow  CUDA: faster dequantize kernels for Q4_0 and Q4_1 ...
2024-01-14  David Pflug  llama : fix missing quotes (#4937)
2024-01-14  Kawrakow  Add ability to use importance matrix for all k-quants...
2024-01-14  Georgi Gerganov  llama : check LLAMA_TRACE env for extra logging (#4929)
2024-01-14  Georgi Gerganov  scripts : sync-ggml-am.sh option to skip commits
2024-01-14  Georgi Gerganov  llama : use LLAMA_LOG_ macros for logging
2024-01-14  Kawrakow  Fix ffn_down quantization mix for MoE models (#4927)
2024-01-14  Alex Azarov  metal : correctly set SIMD support flags on iOS (#4923)
2024-01-14  Karthik Kumar...  llama : support WinXP build with MinGW 8.1.0 (#3419)
2024-01-14  Kawrakow  2-bit quantizations (#4897)
2024-01-14  Kawrakow  Make Q3_K_S be the same as olf Q3_K_L for Mixtral-8x7B...
2024-01-13  Georgi Gerganov  sync : ggml
2024-01-13  Johannes Gäßler  ggml: cache sin/cos for RoPE (#4908)
2024-01-13  Georgi Gerganov  metal : remove old API (#4919)
2024-01-13  Georgi Gerganov  server : fix prompt caching with system prompt (#4914)
2024-01-13  Georgi Gerganov  llama : fix detokenization of non-special added-tokens...
2024-01-13  Georgi Gerganov  metal : disable log for loaded kernels (#4794)
2024-01-13  David Friehs  llama : minimize size used for state save/load (#4820)
2024-01-13  Someone  workflows: unbreak nix-build-aarch64, and split it...
2024-01-13  Yann Follet  main : add parameter --no-display-prompt (#4541)
2024-01-13  texmex76  gguf : fix potential infinite for-loop (#4600)
2024-01-13  Georgi Gerganov  metal : refactor kernel loading code (#4794)
2024-01-13  Johannes Gäßler  compare-llama-bench: tweak output format (#4910)
2024-01-13  Ziad Ben Hadj...  server : fix deadlock that occurs in multi-prompt scena...
2024-01-13  makomk  server : fix crash with multimodal models without BOS...
2024-01-13  Georgi Gerganov  convert : update phi-2 to latest HF repo (#4903)
2024-01-12  Georgi Gerganov  sync : ggml
2024-01-12  Georgi Gerganov  ggml : fix 32-bit ARM compat for IQ2_XS (whisper/1758)
2024-01-12  slaren  backend_sched : fix assignments
2024-01-12  Maximilian...  examples : add pydantic models to GBNF grammar generato...
2024-01-12  Johannes Gäßler  CUDA: faster q8_0 -> f16 dequantization (#4895)
2024-01-12  slaren  llama : ggml-backend integration (#4766)
2024-01-12  Georgi Gerganov  llama : remove redundant assert for StableLM (#4901)
2024-01-12  Daniel Bevenius  export-lora : use LLAMA_FILE_MAGIC_GGLA (#4894)
2024-01-12  Zay  llama.swiftui : update models layout (#4826)
2024-01-12  Georgi Gerganov  gitignore : imatrix
2024-01-12  Johannes Gäßler  CUDA: fix softmax compile for old CUDA versions (#4862)
2024-01-12  Georgi Gerganov  llama : fix typo "imp_embd" -> "inp_embd"
2024-01-12  howlger  common : streamline the formatting of help (#4890)
2024-01-12  Georgi Gerganov  py : fix lint (#4889)
2024-01-12  Georgi Gerganov  llama : fix llm_build_k_shift to use correct n_rot...
2024-01-12  Kawrakow  Importance Matrix calculation (#4861)
2024-01-11  Georgi Gerganov  server : fix infill when prompt is empty (#4833)
2024-01-11  Georgi Gerganov  main : better name for variable n_print (#4874)
2024-01-11  Georgi Gerganov  main : disable token count by default (#4874)