git.djapps.eu Git - pkg/ggml/sources/ggml - shortlog
2024-06-15  Radoslav Gerganov  rpc : fix ggml_backend_rpc_supports_buft() (llama/7918)
2024-06-15  slaren  move BLAS to a separate backend (llama/6210)
2024-06-15  Johannes Gäßler  CUDA: fix broken oob check for FA vec f32 kernel (llama...
2024-06-15  Georgi Gerganov  tests : add non-cont unary tests (llama/7857)
2024-06-15  Georgi Gerganov  ggml : improve ggml_is_contiguous logic (llama/7856)
2024-06-15  k.h.lai  vulkan: select only one device for single gpu with...
2024-06-15  0cc4m  Update Vulkan RoPE implementation (llama/7818)
2024-06-15  Johannes Gäßler  CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K)...
2024-06-15  Johannes Gäßler  CUDA: use tensor cores for MMQ (llama/7676)
2024-06-15  Ben Ashbaugh  use the correct SYCL context for host USM allocations...
2024-06-15  Johannes Gäßler  CUDA: revise q8_1 data layout for mul_mat_q (llama...
2024-06-15  slaren  vulkan : reuse parent extra for views (llama/7806)
2024-06-15  pengxin99  fix softmax r2r result wrong issue (llama/7811)
2024-06-15  Johannes Gäßler  CUDA: refactor mmq, dmmv, mmvq (llama/7716)
2024-06-15  Georgi Gerganov  ggml : refactor rope norm/neox (llama/7634)
2024-06-15  agray3  Allow number of nodes in CUDA graph to change (llama...
2024-06-15  Georgi Gerganov  ggml : remove OpenCL (llama/7735)
2024-06-15  Georgi Gerganov  ggml : prevent builds with -ffinite-math-only (llama...
2024-06-15  Radoslav Gerganov  llama : offload to RPC in addition to other backends...
2024-06-15  Masaya, Kato  ggml : use OpenMP as a thread pool (llama/7606)
2024-06-15  0cc4m  Vulkan Mixture of Experts (MoE) support (llama/7628)
2024-06-15  woachk  kompute : implement op_getrows_f32 (llama/6403)
2024-06-15  Dave Airlie  fix bug introduced in using calloc (llama/7701)
2024-06-15  Johannes Gäßler  Fix FlashAttention debug test, FP32 assert (llama/7684)
2024-06-15  Johannes Gäßler  CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8...
2024-06-15  Johannes Gäßler  CUDA: quantized KV support for FA vec (llama/7527)
2024-06-15  Georgi Gerganov  ggml : fix loongson compile warnings (llama/7537)
2024-06-15  Chris Elrod  faster avx512 exp implementation (llama/7551)
2024-06-15  junchao-loongson  ggml : fix loongarch build (O2 issue) (llama/7636)
2024-06-15  Georgi Gerganov  metal : remove invalid asserts (llama/7617)
2024-06-15  Georgi Gerganov  metal : add missing asserts (llama/7617)
2024-06-15  Georgi Gerganov  ggml : fix YARN + add tests + add asserts (llama/7617)
2024-06-15  Georgi Gerganov  cuda : non-cont concat support (llama/7610)
2024-06-15  Radoslav Gerganov  llama-bench : add support for the RPC backend (llama...
2024-06-15  slaren  ggml : use atomic_flag for critical section (llama...
2024-06-05  Daniele  cmake : update HIPBLAS (#847)
2024-06-05  Emmanuel Durand  zig : fix build (#840)
2024-05-29  Georgi Gerganov  sync : llama.cpp
2024-05-29  Georgi Gerganov  examples : adapt to new ggml_concat (#0)
2024-05-29  zhouwg  ggml : fix typo in ggml.c (llama/7603)
2024-05-29  Meng, Hengyu  Align GEMM dispatch (llama/7566)
2024-05-29  Georgi Gerganov  sycl : fix assert (llama/7563)
2024-05-29  k.h.lai  vulkan: properly initialize vulkan devices for LLAMA_SP...
2024-05-29  Radoslav Gerganov  rpc : resource management rework (llama/7562)
2024-05-29  Neo Zhang  fix ggml_sycl_mul_mat_id() to match the change of api...
2024-05-29  Georgi Gerganov  ggml : generalize GGML_OP_CONCAT (llama/7563)
2024-05-29  Djip007  update HIP_UMA #7399 (llama/7414)
2024-05-29  agray3  Allow multiple copy function pointers for CUDA graph...
2024-05-29  AidanBeltonS  Fix q_xxs using mul_mat_q (llama/7459)
2024-05-29  AidanBeltonS  Add freq factors (llama/7495)
2024-05-29  Georgi Gerganov  metal : add GGML_OP_REPEAT kernels (llama/7557)
2024-05-29  Georgi Gerganov  metal : disable FA kernel for HS=256 (llama/7556)
2024-05-28  Georgi Gerganov  ggml : restore ggml_rope_xpos_inplace (#0)
2024-05-28  Georgi Gerganov  sync : llama.cpp
2024-05-28  Masaya, Kato  ggml: aarch64: SVE kernels for q8_0_q8_0, q4_0_q8_0...
2024-05-28  Georgi Gerganov  ggml : silence UB sanitizer error during iq2_xxs quanti...
2024-05-28  Georgi Gerganov  ggml : remove ggml_flash_attn and ggml_flash_ff (llama...
2024-05-28  Georgi Gerganov  ggml : drop support for QK_K=64 (llama/7473)
2024-05-28  0cc4m  Update vulkan rope implementation to support frequency...
2024-05-28  Johannes Gäßler  CUDA: fix FA out-of-bounds reads (llama/7479)
2024-05-28  Johannes Gäßler  CUDA: fix FA out-of-bounds writes (llama/7465)
2024-05-28  Georgi Gerganov  cuda : fix compile warning (llama/7454)
2024-05-28  Johannes Gäßler  CUDA: remove incorrect precision check (llama/7454)
2024-05-28  Georgi Gerganov  cuda : fix rope + add tests (llama/7452)
2024-05-28  liuwei-git  llama : add phi3 128K model support (llama/7225)
2024-05-28  Georgi Gerganov  metal : handle F16 inf values, fix FA partial offload...
2024-05-28  Johannes Gäßler  CUDA: fix unused warning in mmq.cu (llama/7442)
2024-05-28  Johannes Gäßler  CUDA: deduplicate mmq code (llama/7397)
2024-05-28  Radoslav Gerganov  rpc : track allocated buffers (llama/7411)
2024-05-28  AidanBeltonS  Update SYCL upscale operation (llama/7321)
2024-05-28  Herman Semenov  ggml-opencl, llama: using reserve() if count already...
2024-05-28  junchao-loongson  ggml : add loongarch lsx and lasx support (llama/6454)
2024-05-28  Srihari-mcw  Add provisions for windows support for BF16 code includ...
2024-05-28  0cc4m  Vulkan Embedding Fix (llama/7360)
2024-05-28  slaren  ggml : fix another case of quants nans (llama/7387)
2024-05-28  Johannes Gäßler  ggml: implement quantized KV cache for FA (llama/7372)
2024-05-28  slaren  cuda : clear error after buffer allocation failure...
2024-05-28  fraxy-v  Capture CUDA logging output (llama/7298)
2024-05-28  Georgi Gerganov  android : use "ci-android" branch for CI (llama/7341)
2024-05-28  Johannes Gäßler  CUDA: deduplicate FlashAttention code (llama/7352)
2024-05-28  Engininja2  cuda : add half2 __shfl_xor() for ROCm 5.5 (llama/7263)
2024-05-28  0cc4m  Update and fix Vulkan soft_max and argsort implementati...
2024-05-28  slaren  ggml : fix quants nans when all the group weights are...
2024-05-28  Johannes Gäßler  CUDA: faster large batch FA without tensor cores (llama...
2024-05-28  Radoslav Gerganov  rpc : set SO_REUSEADDR for the server socket (llama...
2024-05-28  Herman Semenov  ggml-quants, llama : removed excess checks (llama/7274)
2024-05-28  Justine Tunney  ggml : rewrite silu and softmax for cpu (llama/7154)
2024-05-28  Radoslav Gerganov  rpc : add command line arg for specifying backend memory
2024-05-28  Max Krasnyansky  Add support for properly optimized Windows ARM64 builds...
2024-05-28  kunnis  ggml : use dynamic thread scheduling for matrix multipl...
2024-05-28  agray3  Avoid unnecessarily disabling CUDA graphs (llama/7302)
2024-05-28  slaren  ggml : tag ggml_tensor::backend as deprecated (llama...
2024-05-28  AidanBeltonS  Add missing " (llama/7303)
2024-05-25  Andrei  cmake : add Vulkan build (#730)
2024-05-24  compilade  gguf : use Qn_K for k-quants instead of KQn (#837)
2024-05-19  Brian  gguf.md: add sharding to naming convention (#826)
2024-05-17  Andrei  Add ggml rpc to cmake (#827)
2024-05-17  Brian  gguf.md: Add GGUF Naming Convention Section (#822)
2024-05-15  John Balis  ggml : add `ggml_upscale_ext` (#814)
2024-05-15  Georgi Gerganov  sync : whisper.cpp