git.djapps.eu Git - pkg/ggml/sources/llama.cpp/shortlog
2024-08-30  tc-mb  llava : the function "clip" should be int (#9237)
2024-08-29  Faisal Zaghloul  Threadpool: take 2 (#8672)
2024-08-29  Jan Boon  server : fix crash when error handler dumps invalid...
2024-08-29  Georgi Gerganov  flake.lock: Update (#9162)
2024-08-28  slaren  docker : build images only once (#9225)
2024-08-28  slaren  docker : update CUDA images (#9213)
2024-08-27  Georgi Gerganov  vulkan : fix build (#0)
2024-08-27  Georgi Gerganov  sync : ggml
2024-08-27  Xie Yanbo  Fix minicpm example directory (#9111)
2024-08-27  compilade  llama : fix qs.n_attention_wv for DeepSeek-V2 (#9156)
2024-08-27  Xuan Son Nguyen  server : add some missing env variables (#9116)
2024-08-27  CausalLM  llama : fix ChatGLM4 wrong shape (#9194)
2024-08-27  Carsten Kragelund...  llama : fix llama3.1 rope_freqs not respecting custom...
2024-08-27  arch-btw  common : Update stb_image.h to latest version (#9161)
2024-08-26  slaren  ggml : do not crash when quantizing q4_x_x with an...
2024-08-26  Georgi Gerganov  metal : separate scale and mask from QKT in FA kernel...
2024-08-26  Georgi Gerganov  ggml : add SSM Metal kernels (#8546)
2024-08-26  Georgi Gerganov  tests : fix compile warnings for unreachable code ...
2024-08-26  Georgi Gerganov  ci : add VULKAN support to ggml-ci (#9055)
2024-08-26  Georgi Gerganov  server : update deps (#9183)
2024-08-26  slaren  metal : gemma2 flash attention support (#9159)
2024-08-26  slaren  ggml-ci : try to improve build time (#9160)
2024-08-26  Justine Tunney  llama : fix time complexity of string replacement ...
2024-08-25  Herman Semenov  common: fixed not working find argument --n-gpu-layers...
2024-08-25  Johannes Gäßler  CUDA: fix Gemma 2 numerical issues for FA (#9166)
2024-08-24  Johannes Gäßler  CPU/CUDA: Gemma 2 FlashAttention support (#8542)
2024-08-24  João Dinis...  quantize : fix typo in usage help of `quantize.cpp...
2024-08-23  Xuan Son Nguyen  lora : fix llama conversion script with ROPE_FREQS...
2024-08-23  piDack  llama : use F32 precision in GLM4 attention and no...
2024-08-22  Akarshan Biswas  [SYCL] Add a space to supress a cmake warning (#9133)
2024-08-22  luoyu-intel  [SYCL] Add oneDNN primitive support (#9091)
2024-08-21  compilade  llama : simplify Mamba with advanced batch splits ...
2024-08-21  Xuan Son Nguyen  server : support reading arguments from environment...
2024-08-21  Younes Belkada  llama : support for `falcon-mamba` architecture (#9074)
2024-08-21  fairydreaming  llava : zero-initialize clip_ctx structure fields with...
2024-08-21  Daniel Bevenius  llama : std::move llm_bigram_bpe from work_queue (...
2024-08-20  Changyeon Kim  llava: Add ACC OP for GPU acceleration to the Vulkan...
2024-08-20  Meng, Hengyu  [SYCL] fallback mmvq (#9088)
2024-08-20  zhentaoyu  [SYCL] Fix SYCL `im2col` and `convert` Overflow with...
2024-08-20  fairydreaming  tests : add missing comma in grammar integration tests...
2024-08-19  wangshuai09  cann: add doc for cann backend (#8867)
2024-08-19  Radoslav Gerganov  rpc : print error message when failed to connect endpoi...
2024-08-19  Radoslav Gerganov  rpc : prevent crashes on invalid input (#9040)
2024-08-18  Georgi Gerganov  flake.lock: Update (#9068)
2024-08-18  ltoniazzi  tests : add integration test for lora adapters (#8957)
2024-08-17  Yoshi Suhara  Fix incorrect use of ctx_split for bias tensors (#9063)
2024-08-16  Xuan Son Nguyen  server : refactor middleware and /health endpoint ...
2024-08-16  tc-mb  llava : support MiniCPM-V-2.6 (#8967)
2024-08-16  Farbod Bijary  py : fix wrong input type for raw_dtype in ggml to...
2024-08-16  Aisuko  Fix inference example lacks required parameters (#9035)
2024-08-16  compilade  gguf-py : bump version from 0.9.1 to 0.10.0 (#9051)
2024-08-16  Minsoo Cheong  llama : add EXAONE model support (#9025)
2024-08-16  Liu Jia  common : add support for cpu_get_num_physical_cores...
2024-08-16  Yoshi Suhara  Add Nemotron/Minitron GGUF Conversion & Inference Suppo...
2024-08-16  Nico Bosshard  ggml : dynamic ggml_sched_max_splits based on graph_siz...
2024-08-15  gtygo  retrieval : fix memory leak in retrieval query handling...
2024-08-15  Riceball LEE  server : fix duplicated n_predict key in the generation...
2024-08-15  Zhenwei Jin  common : remove duplicate function llama_should_add_bos...
2024-08-15  Esko Toivonen  llama : add pre-tokenizer regexes for BLOOM and gpt3...
2024-08-15  Georgi Gerganov  ci : disable bench workflow (#9010)
2024-08-15  Jiří Podivín  server : init stop and error fields of the result struc...
2024-08-14  0cc4m  Vulkan Optimizations and Fixes (#8959)
2024-08-14  compilade  server : fix segfault on long system prompt (#8987)
2024-08-14  Georgi Gerganov  cmake : remove unused option GGML_CURL (#9011)
2024-08-13  Daniel Bevenius  ggml : move rope type enum to ggml.h (#8949)
2024-08-13  Xuan Son Nguyen  export-lora : throw error if lora is quantized (#9002)
2024-08-12  Diogo Teles...  ci : fix github workflow vulnerable to script injection...
2024-08-12  Radoslav Gerganov  ci : enable RPC in all of the released builds (#9006)
2024-08-12  Nico Bosshard  llama : model-based max number of graph nodes calculati...
2024-08-12  Frank Mai  docs: introduce gpustack and gguf-parser (#8873)
2024-08-12  DavidKorczynski  grammar-parser : fix possible null-deref (#9004)
2024-08-12  DavidKorczynski  ggml: fix div-by-zero (#9003)
2024-08-12  Liu Jia  Fix a spelling mistake (#9001)
2024-08-12  Georgi Gerganov  py : fix requirements check '==' -> '~=' (#8982)
2024-08-12  Georgi Gerganov  server : handle models with missing EOS token (#8997)
2024-08-11  compilade  gguf-py : Numpy dequantization for most types (#8939)
2024-08-11  Georgi Gerganov  flake.lock: Update (#8979)
2024-08-11  Neo Zhang  update guide (#8909)
2024-08-11  fairydreaming  llama : check all graph nodes when searching for result...
2024-08-11  Markus Tavenrath  Optimize Vulkan backend for better CPU performance...
2024-08-10  slaren  metal : fix uninitialized abort_callback (#8968)
2024-08-10  Xuan Son Nguyen  llama : default n_swa for phi-3 (#8931)
2024-08-10  fairydreaming  Add support for encoder-only T5 models (#8900)
2024-08-10  Matteo Mortari  gguf-py : fix double call to add_architecture() (#8952)
2024-08-09  Georgi Gerganov  Merge commit from fork
2024-08-09  fairydreaming  llama : add support for lora adapters in T5 model ...
2024-08-09  Georgi Gerganov  make : fix llava obj file race (#8946)
2024-08-09  Georgi Gerganov  llama : better replace_all (cont) (#8926)
2024-08-09  tc-mb  llava : support MiniCPM-V-2.5 (#7599)
2024-08-09  Georgi Gerganov  sync : ggml
2024-08-09  Matt Stephenson  whisper : use vulkan as gpu backend when available...
2024-08-09  Daniel Bevenius  embedding : add --pooling option to README.md [no ci...
2024-08-09  Daniel Bevenius  llama : fix typo in llama_tensor_get_type comment ...
2024-08-09  Mathieu Geli  server : add one level list nesting for embeddings...
2024-08-09  compilade  llama : reduce useless copies when saving session ...
2024-08-08  compilade  gguf-py : simplify support for quant types (#8838)
2024-08-08  Georgi Gerganov  scripts : sync cann files (#0)
2024-08-08  Georgi Gerganov  scripts : fix sync filenames (#0)
2024-08-08  Georgi Gerganov  sync : ggml
2024-08-08  Borislav Stanimirov  ggml : ignore more msvc warnings (ggml/906)