git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2024-05-28  Masaya, Kato         ggml: aarch64: SVE kernels for q8_0_q8_0, q4_0_q8_0...
2024-05-28  Georgi Gerganov      ggml : silence UB sanitizer error during iq2_xxs quanti...
2024-05-28  Georgi Gerganov      ggml : remove ggml_flash_attn and ggml_flash_ff (llama...
2024-05-28  Georgi Gerganov      ggml : drop support for QK_K=64 (llama/7473)
2024-05-28  0cc4m                Update vulkan rope implementation to support frequency...
2024-05-28  Johannes Gäßler      CUDA: fix FA out-of-bounds reads (llama/7479)
2024-05-28  Johannes Gäßler      CUDA: fix FA out-of-bounds writes (llama/7465)
2024-05-28  Georgi Gerganov      cuda : fix compile warning (llama/7454)
2024-05-28  Johannes Gäßler      CUDA: remove incorrect precision check (llama/7454)
2024-05-28  Georgi Gerganov      cuda : fix rope + add tests (llama/7452)
2024-05-28  liuwei-git           llama : add phi3 128K model support (llama/7225)
2024-05-28  Georgi Gerganov      metal : handle F16 inf values, fix FA partial offload...
2024-05-28  Johannes Gäßler      CUDA: fix unused warning in mmq.cu (llama/7442)
2024-05-28  Johannes Gäßler      CUDA: deduplicate mmq code (llama/7397)
2024-05-28  Radoslav Gerganov    rpc : track allocated buffers (llama/7411)
2024-05-28  AidanBeltonS         Update SYCL upscale operation (llama/7321)
2024-05-28  Herman Semenov       ggml-opencl, llama: using reserve() if count already...
2024-05-28  junchao-loongson     ggml : add loongarch lsx and lasx support (llama/6454)
2024-05-28  Srihari-mcw          Add provisions for windows support for BF16 code includ...
2024-05-28  0cc4m                Vulkan Embedding Fix (llama/7360)
2024-05-28  slaren               ggml : fix another case of quants nans (llama/7387)
2024-05-28  Johannes Gäßler      ggml: implement quantized KV cache for FA (llama/7372)
2024-05-28  slaren               cuda : clear error after buffer allocation failure...
2024-05-28  fraxy-v              Capture CUDA logging output (llama/7298)
2024-05-28  Georgi Gerganov      android : use "ci-android" branch for CI (llama/7341)
2024-05-28  Johannes Gäßler      CUDA: deduplicate FlashAttention code (llama/7352)
2024-05-28  Engininja2           cuda : add half2 __shfl_xor() for ROCm 5.5 (llama/7263)
2024-05-28  0cc4m                Update and fix Vulkan soft_max and argsort implementati...
2024-05-28  slaren               ggml : fix quants nans when all the group weights are...
2024-05-28  Johannes Gäßler      CUDA: faster large batch FA without tensor cores (llama...
2024-05-28  Radoslav Gerganov    rpc : set SO_REUSEADDR for the server socket (llama...
2024-05-28  Herman Semenov       ggml-quants, llama : removed excess checks (llama/7274)
2024-05-28  Justine Tunney       ggml : rewrite silu and softmax for cpu (llama/7154)
2024-05-28  Radoslav Gerganov    rpc : add command line arg for specifying backend memory
2024-05-28  Max Krasnyansky      Add support for properly optimized Windows ARM64 builds...
2024-05-28  kunnis               ggml : use dynamic thread scheduling for matrix multipl...
2024-05-28  agray3               Avoid unnecessarily disabling CUDA graphs (llama/7302)
2024-05-28  slaren               ggml : tag ggml_tensor::backend as deprecated (llama...
2024-05-28  AidanBeltonS         Add missing " (llama/7303)
2024-05-25  Andrei               cmake : add Vulkan build (#730)
2024-05-24  compilade            gguf : use Qn_K for k-quants instead of KQn (#837)
2024-05-19  Brian                gguf.md: add sharding to naming convention (#826)
2024-05-17  Andrei               Add ggml rpc to cmake (#827)
2024-05-17  Brian                gguf.md: Add GGUF Naming Convention Section (#822)
2024-05-15  John Balis           ggml : add `ggml_upscale_ext` (#814)
2024-05-15  Georgi Gerganov      sync : whisper.cpp
2024-05-15  Georgi Gerganov      whisper : use flash attention (whisper/2152)
2024-05-14  Georgi Gerganov      sync : llama.cpp
2024-05-14  Georgi Gerganov      metal : support FA without mask + add asserts (llama...
2024-05-14  Radoslav Gerganov    ggml : add RPC backend (llama/6829)
2024-05-14  Neo Zhang            rm wait() (llama/7233)
2024-05-14  Johannes Gäßler      CUDA: add FP32 FlashAttention vector kernel (llama...
2024-05-14  Georgi Gerganov      scripts : sync ggml-rpc
2024-05-14  Georgi Gerganov      sync : whisper.cpp
2024-05-14  thewh1teagle         whisper : fix model path encoding in windows (whisper...
2024-05-14  Daniel Ziegenberg    main : dont print timings with --no-prints (whisper...
2024-05-14  Daniel Ziegenberg    main : add options for temperature control (whisper...
2024-05-14  Georgi Gerganov      whisper : switch back to F32 mask (whisper/0)
2024-05-14  mashizora            main : fix double quote escaping in csv output (whisper...
2024-05-14  Georgi Gerganov      metal : tune soft_max number of threads (whisper/0)
2024-05-14  Georgi Gerganov      whisper : remove old flash attn code (whisper/0)
2024-05-14  Georgi Gerganov      ggml : try fix ppc64 (whisper/0)
2024-05-14  Przemysław...        ggml : expose SSE3 and SSSE3 for MSVC when AVX is avail...
2024-05-14  goldwaving           Remove unnecessary memory reallocation in fft (whisper...
2024-05-14  Georgi Gerganov      whisper : more prominent log message for sub-1s audio...
2024-05-14  Georgi Gerganov      main : pass nullptr when regex is empty (whisper/2070)
2024-05-14  Ikko Eltociear...    whisper : update grammar-parser.cpp (whisper/2058)
2024-05-12  Hong Bo PENG         ggml : optimize for ppc64le using VSX intrinsics (...
2024-05-11  Georgi Gerganov      cuda : remove old alibi sources (#0)
2024-05-11  Georgi Gerganov      metal : fix indent (#0)
2024-05-11  Georgi Gerganov      ggml : restore sigmoid decl order (#0)
2024-05-11  Georgi Gerganov      tests : restore unary tests (#0)
2024-05-11  Georgi Gerganov      mnist : clean whitespace
2024-05-11  Georgi Gerganov      ggml : resolve merge (#0)
2024-05-11  Georgi Gerganov      sync : llama.cpp
2024-05-11  Georgi Gerganov      ggml : full ALiBi support (llama/7192)
2024-05-11  Georgi Gerganov      metal : fix flash attention kernel requirements (llama...
2024-05-11  Ouadie EL FAROUKI    Minor arithmetic improvement to mmvq wrapper kernel...
2024-05-11  0cc4m                Vulkan Bugfixes and Improvements (llama/7084)
2024-05-11  Johannes Gäßler      CUDA: generalize FP16 fattn vec kernel (llama/7061)
2024-05-11  Albert Jin           opencl : alignment size converted from bits to bytes...
2024-05-11  agray3               Introduction of CUDA Graphs to LLama.cpp (llama/6766)
2024-05-11  Gilad S              metal : use `vm_allocate` instead of `posix_memalign...
2024-05-11  Justine Tunney       ggml : introduce bfloat16 support (llama/6412)
2024-05-11  Georgi Gerganov      metal : fix unused warning
2024-05-11  William Tambellini   Add an option to build without CUDA VMM (llama/7067)
2024-05-11  Xuan Son Nguyen      gguf-split: add --no-tensor-first-split (llama/7072)
2024-05-11  Johannes Gäßler      CUDA: CUDART < 11.7 workaround for __hmax, __hmax2...
2024-05-11  Kevin Gibbons        switch to using localizedDescription (llama/7010)
2024-05-11  Georgi Gerganov      metal : remove deprecated error code (llama/7008)
2024-05-11  Kevin Gibbons        metal : log more info on error (llama/6987)
2024-05-11  Georgi Gerganov      ggml : add Flash Attention (llama/5021)
2024-05-11  Georgi Gerganov      ggml : fix __MSC_VER -> _MSC_VER (llama/6977)
2024-05-11  DAN™                 Fix more int overflow during quant (PPL/CUDA). (llama...
2024-05-11  Xuan Son Nguyen      gguf : enforce that tensor names are unique (llama...
2024-05-11  Neo Zhang            add device version in device list (llama/6959)
2024-05-11  agray3               Reset schedule earlier to allow overlap with ggml graph...
2024-05-11  slaren               add basic tensor data validation function (llama/6884)
2024-05-11  slaren               gguf : fix mismatch between alloc and free functions...
2024-05-11  Georgi Gerganov      Merge pull request from GHSA-p5mv-gjc5-mwqv