git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
pkg/ggml/sources/llama.cpp
2023-08-11  Equim  server: fixed wrong variable name in timing json (...
2023-08-10  DannyDaemonic  Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracef...
2023-08-10  Christian Demsar  Add --n-predict -2 for stopping generation on full...
2023-08-10  Martin Krasser  Fix grammar-based sampling issue in server (#2566)
2023-08-09  Sam Spilsbury  ggml-alloc: Don't try to re-use buffers of external...
2023-08-09  grahameth  add log_callback to llama_context_params for custom...
2023-08-09  Johannes Gäßler  CUDA: tuned mul_mat_q kernels (#2546)
2023-08-08  Martin Krasser  Allow passing grammar to completion endpoint (#2532)
2023-08-08  Johannes Gäßler  CUDA: tighter VRAM scratch size for 65b/70b (#2551)
2023-08-08  chaihahaha  llm.vim : multiline autocompletion, get rid of "^@...
2023-08-08  Georgi Gerganov  vim : bring back simple llm.vim example
2023-08-08  AustinMroz  vim : streaming and more (#2495)
2023-08-07  klosax  Add --rope-scale parameter (#2544)
2023-08-07  Georgi Gerganov  ggml : mul mat tweaks (#2372)
2023-08-07  Georgi Gerganov  ggml : pad result of ggml_nbytes()
2023-08-07  Georgi Gerganov  ggml : change params pointer (style change) (#2539)
2023-08-07  Georgi Gerganov  ggml : sync (custom ops) (#2537)
2023-08-07  Johannes Gäßler  Fixed mmap prefetch for GPU offloading (#2529)
2023-08-07  Georgi Gerganov  metal : fix out-of-bounds access + inc concurrency...
2023-08-07  GiviMAD  [Makefile] Move ARM CFLAGS before compilation (#2536)
2023-08-07  Henri Vasserman  [Zig] Rewrite build for Zig 0.11 (#2514)
2023-08-06  DannyDaemonic  console : fix issue related to Windows 11 PowerShell...
2023-08-06  Keiichi Tabata  convert.py : add missing abstract methods for quantized...
2023-08-05  Johannes Gäßler  CUDA: faster k-quant mul_mat_q kernels (#2525)
2023-08-04  Jonas Wunderlich  fix firefox autoscroll (#2519)
2023-08-04  Cebtenzzre  server: regenerate completion.js.hpp (#2515)
2023-08-04  Cebtenzzre  CUDA: use min compute capability of GPUs actually used...
2023-08-04  Cebtenzzre  CUDA: check if event is NULL before cudaStreamWaitEvent...
2023-08-04  DannyDaemonic  Add --simple-io option for subprocesses and break out...
2023-08-04  Stephen Nichols  Fixing race condition in server and partial stream...
2023-08-04  l3utterfly  Stream save llama context data to file instead of alloc...
2023-08-04  Borislav Stanimirov  build : fix several cast and printf warnings (#2499)
2023-08-03  Evan Jones  examples : generate JSON according to schema (#1887)
2023-08-02  Johannes Gäßler  CUDA: faster non k-quant mul_mat_q kernels (#2483)
2023-08-02  Johannes Gäßler  CUDA: Fix models with output size != 32000 (#2480)
2023-08-02  ldwang  readme : add Aquila-7B model series to supported models...
2023-08-02  Eve  tests : Fix compilation warnings (Linux/GCC) (#2451)
2023-08-02  Yiming Cui  readme : Add Chinese LLaMA-2 / Alpaca-2 to supported...
2023-08-01  Bono Lv  fix a typo in examples/server/README.md (#2478)
2023-08-01  ebraminio  server : Support dark mode (#2414)
2023-08-01  Matteo Boschini  metal : add gqa8 kernel to allow llama-2-70B on metal...
2023-07-31  Johannes Gäßler  CUDA: fixed LLAMA_FAST compilation option (#2473)
2023-07-31  Johannes Gäßler  CUDA: fixed cmake F16 option (#2471)
2023-07-31  Johannes Gäßler  CUDA: mmq CLI option, fixed mmq build issues (#2453)
2023-07-31  Johannes Gäßler  CUDA: Implemented row flattening for non-glm RoPE ...
2023-07-31  Johannes Gäßler  CUDA: fewer memory bank conflicts for mul_mat_q (#2458)
2023-07-31  slaren  Fix Metal backend broken from the allocator changes...
2023-07-30  slaren  ggml : add graph tensor allocator (#2411)
2023-07-29  Johannes Gäßler  CUDA: Quantized matrix matrix multiplication (#2160)
2023-07-29  Johannes Gäßler  CUDA: faster multi GPU synchronization (#2448)
2023-07-28  klosax  perplexity : add Hellaswag calculation (#2389)
2023-07-28  Lee  ggml : workaround for missing _mm256_setr_m128i in...
2023-07-28  eric8607242  llama : support more diverse tokenizers? (#2420)
2023-07-28  Georgi Gerganov  examples : fix whitespace
2023-07-28  nhamanasu  examples : server chat mode with llama2 (#2400)
2023-07-28  Weird Constructor  readme : fix the description of the Tail free sampling...
2023-07-28  Rand Xie  llama : use n_embd_gqa instead of n_embd to handle...
2023-07-28  niansa/tuxifan  Obtaining LLaMA 2 instructions (#2308)
2023-07-27  mj-shifu  convert.py : Update to support 70B HF format model...
2023-07-27  Georgi Gerganov  metal : disable graph concurrency optimization due...
2023-07-26  slaren  ggml : fix assert in ggml_set_unary_op (#2410)
2023-07-26  Cebtenzzre  make : build with -Wmissing-prototypes (#2394)
2023-07-26  slaren  ggml : allocate graphs in a context (#2392)
2023-07-25  Kawrakow  Add LLAMA_DEFAULT_RMS_EPS so we can change the default...
2023-07-25  slaren  ggml : fix ggml_flash_attn to use op_params (#2387)
2023-07-25  ldwang  convert.py : support bpe tokenizer (#2228)
2023-07-25  Jiahao Li  ggml : relax contiguous constraints in activation funct...
2023-07-25  slaren  ggml : improve graph build time via hash table lookup...
2023-07-25  Hesen Peng  build : fix line breaking error in build-info.sh (...
2023-07-25  Xiao-Yong Jin  main : add `--in-prefix-bos` to prefix BOS to user...
2023-07-25  Eve  ci : add non-AVX scalar build/test (#2356)
2023-07-25  katsu560  k_quants : add AVX support to dot functions with QK_K...
2023-07-25  Shouzheng Liu  metal : concurrently dispatch commands (#2358)
2023-07-25  Kawrakow  Another speed gain for Q4_0 and Q4_1 on Metal (#2375)
2023-07-25  Kawrakow  Fix Q4_K and Q5_K for QK_K = 64 on CUDA (#2359)
2023-07-25  slaren  server: add rms_norm_eps parameter (#2380)
2023-07-25  Henri Vasserman  [Server] Escape HTML in webchat (#2368)
2023-07-24  slaren  make rms_norm_eps a parameter (#2374)
2023-07-24  Aarni Koskela  Chat UI extras (#2366)
2023-07-24  Georgi Gerganov  ggml : sync (unary ops refactor, static-correctness...
2023-07-24  Kawrakow  Fix scalar version of Q5_K when QK_K = 64 (#2362)
2023-07-24  Evan Jones  llama : add grammar-based sampling (#1773)
2023-07-23  Kawrakow  Some more Q4_K and Q5_K speedup on CUDA (#2346)
2023-07-23  IgnacioFDM  Add gqa parameter support to the server (#2351)
2023-07-23  Johannes Gäßler  Fix __dp4a documentation (#2348)
2023-07-23  wzy  common : n_threads == -1 uses std::thread::hardware_con...
2023-07-23  slaren  fix n_tasks (#2342)
2023-07-23  slaren  ggml: move op parameters from tensors to ggml_tensor...
2023-07-23  Georgi Gerganov  llama : grouped-query attention + LLaMAv2 70B support...
2023-07-23  maddes8cht  llama : print help to stdout (#2338)
2023-07-23  wzy  flake : support `nix build '.#opencl'` (#2337)
2023-07-23  Christian Demsar  llama : print max tensor size to stderr (#2336)
2023-07-23  Jose Maldonado  make : fix CLBLAST compile support in FreeBSD (#2331)
2023-07-23  AustinMroz  examples : simplify vim plugin (#2327)
2023-07-23  Jiahao Li  metal : support bcast add & dup & cont op (#2323)
2023-07-23  Kawrakow  Speed up Q4_K (#2322)
2023-07-22  Johannes Gäßler  CUDA: Fixed 7b q3_K_S with mul_mat_vec_q (#2313)
2023-07-22  Georgi Gerganov  llama : optimize memory buffers (#2325)
2023-07-22  klosax  Perplexity: Compute scores correlated to HellaSwag...
2023-07-22  whoreson  examples : basic VIM plugin