git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2024-12-03  Georgi Gerganov  metal : fix group_norm support condition (llama/0)
2024-12-03  Jeff Bolz  vulkan: define all quant data structures in types.comp...
2024-12-03  Jeff Bolz  vulkan: Handle GPUs with less shared memory (llama...
2024-12-03  Jeff Bolz  vulkan: further optimize q5_k mul_mat_vec (llama/10479)
2024-12-03  Jeff Bolz  vulkan: skip integer div/mod in get_offsets for batch_i...
2024-12-03  Jeff Bolz  vulkan: optimize Q2_K and Q3_K mul_mat_vec (llama/10459)
2024-12-03  R0CKSTAR  mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update...
2024-12-03  Jeff Bolz  vulkan: fix group_norm (llama/10496)
2024-12-03  Georgi Gerganov  cmake : enable warnings in llama (llama/10474)
2024-12-03  Charles Xu  ggml-cpu: cmake add arm64 cpu feature check for macos...
2024-12-03  Shanshan Shen  CANN: Improve the Inferencing Performance for Ascend...
2024-12-03  Chenguang Li  CANN: RoPE and CANCAT operator optimization (llama...
2024-12-03  Junil Kim  vulkan: Fix a vulkan-shaders-gen arugment parsing error...
2024-12-03  Georgi Gerganov  metal : enable mat-vec kernels for bs <= 4 (llama/10491)
2024-12-03  Diego Devesa  llama : accept a list of devices to use to offload...
2024-12-03  Diego Devesa  ggml : add support for dynamic loading of backends...
2024-12-03  Georgi Gerganov  tests : fix compile warning
2024-12-03  Georgi Gerganov  metal : minor code formatting
2024-12-03  Diego Devesa  ggml : do not use ARM features not included in the...
2024-12-03  leo-pony  CANN: Support Ascend310P to accelerate F32 and F16...
2024-12-03  Diego Devesa  cuda : optimize argmax (llama/10441)
2024-12-03  Jeff Bolz  vulkan: predicate max operation in soft_max shaders...
2024-12-03  Jeff Bolz  vulkan: copy iq4_nl LUT into shared memory (llama/10409)
2024-12-03  Jeff Bolz  vulkan: further optimize mul_mat_vec using larger loads...
2024-12-03  haopeng  add cmake rvv support (llama/10411)
2024-12-03  mahorozte  CUDA: remove unnecessary warp reduce in FA (#1032)
2024-12-02  PAB  feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (#1019)
2024-11-28  PAB  metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (#1026)
2024-11-27  Johannes Gäßler  examples: link to HuggingFace mirror of MNIST data...
2024-11-26  Tristan Druyen  Fix build docs for hip (#1029)
2024-11-26  Frankie Robertson  Do not include arm_neon.h when compiling CUDA code...
2024-11-20  M Refi D.A  Create .gitmodules for the kompute backend (#1024)
2024-11-20  Georgi Gerganov  sync : whisper.cpp
2024-11-20  slaren  ggml/sched : do not skip views in pre-assignments
2024-11-20  Johannes Gäßler  ggml-opt: fix data corruption (#1022)
2024-11-19  Georgi Gerganov  sync : llama.cpp
2024-11-19  bandoti  Add required ggml-base and backend libs to cmake pkg...
2024-11-19  Georgi Gerganov  sync : llama.cpp
2024-11-19  Diego Devesa  cuda : fix CUDA_FLAGS not being applied (llama/10403)
2024-11-19  Georgi Gerganov  sync : llama.cpp
2024-11-19  Romain Biessy  sycl : Add option to set the SYCL architecture for...
2024-11-19  Jeff Bolz  vulkan: Optimize soft_max (llama/10301)
2024-11-19  Alberto Cabrera...  sycl: Revert MUL_MAT_OP support changes (llama/10385)
2024-11-19  Diego Devesa  cuda : only use native when supported by cmake (llama...
2024-11-19  Jeff Bolz  vulkan: remove use of null initializer (llama/10372)
2024-11-18  Plamen Minev  metal : fox offset integer overflows in im2col (#1015)
2024-11-18  Georgi Gerganov  sync : llama.cpp
2024-11-18  0cc4m  Vulkan: Fix device info output format specifiers (llama...
2024-11-18  PAB  metal : add `GGML_UNARY_OP_ELU` kernel (#1018)
2024-11-18  Georgi Gerganov  sync : llama.cpp
2024-11-18  Johannes Gäßler  CUDA: fix MMV kernel being used for FP16 src1 (llama...
2024-11-18  Georgi Gerganov  sync : llama.cpp
2024-11-18  Johannes Gäßler  CMake: fix typo in comment [no ci] (llama/10360)
2024-11-18  Diego Devesa  llama : only use default buffer types for the KV cache...
2024-11-18  Georgi Gerganov  metal : refactor kernel args into structs (llama/10238)
2024-11-18  FirstTimeEZ  ggml : fix undefined reference to 'getcpu' (llama/10354)
2024-11-18  Johannes Gäßler  CUDA: remove DMMV, consolidate F16 mult mat vec (llama...
2024-11-18  Johannes Gäßler  CMake: default to -arch=native for CUDA build (llama...
2024-11-18  Diego Devesa  ggml : fix possible buffer use after free in sched...
2024-11-18  Georgi Gerganov  ggml : inttypes.h -> cinttypes (llama/0)
2024-11-18  Georgi Gerganov  ggml : adapt AMX to tensor->grad removal (llama/0)
2024-11-18  Georgi Gerganov  ggml : fix compile warnings (llama/0)
2024-11-18  Georgi Gerganov  llamafile : fix include path (llama/0)
2024-11-18  Jeff Bolz  vulkan: Optimize some mat-vec mul quant shaders (llama...
2024-11-18  Dan Johansson  ggml : optimize Q4_0 into Q4_0_X_Y repack (llama/10324)
2024-11-18  Srihari-mcw  Make updates to fix issues with clang-cl builds while...
2024-11-16  Johannes Gäßler  ggml: new optimization interface (#988)
2024-11-15  Georgi Gerganov  ggml : remove duplicated sources from the last sync...
2024-11-15  Georgi Gerganov  sync : llama.cpp
2024-11-15  slaren  ggml : fix some build issues
2024-11-15  Georgi Gerganov  sync : leftovers (#0)
2024-11-15  Georgi Gerganov  test-dup : minor fix
2024-11-15  Georgi Gerganov  cmake : restore CMakeLists.txt (llama/10256)
2024-11-15  Eve  AVX BF16 and single scale quant optimizations (llama...
2024-11-15  Romain Biessy  sycl: Use syclcompat::dp4a (llama/10267)
2024-11-15  Charles Xu  backend cpu: add online flow for aarch64 Q4_0 GEMV...
2024-11-15  Diego Devesa  ggml : build backends as libraries (llama/10256)
2024-11-15  Georgi Gerganov  scripts : update sync llama.cpp
2024-11-15  Georgi Gerganov  sync : whisper.cpp
2024-11-15  Georgi Gerganov  cmake : fix ppc64 check (whisper/0)
2024-11-15  thewh1teagle  ggml : vulkan logs (whisper/2547)
2024-11-13  Georgi Gerganov  sync : llama.cpp
2024-11-13  Alberto Cabrera...  sycl : Fixes to broken builds and test-backend-ops...
2024-11-13  Jeff Bolz  vulkan: Optimize contiguous copies (llama/10254)
2024-11-13  Jeff Bolz  vulkan: Throttle the number of shader compiles during...
2024-11-13  Georgi Gerganov  metal : more precise Q*K in FA vec kernel (llama/10247)
2024-11-13  Jeff Bolz  vulkan: Fix newly added tests for permuted mul_mat...
2024-11-13  Georgi Gerganov  metal : reorder write loop in mul mat kernel + style...
2024-11-13  Georgi Gerganov  metal : fix build and some more comments (llama/10229)
2024-11-13  Georgi Gerganov  metal : fix F32 accumulation in FA vec kernel (llama...
2024-11-13  Georgi Gerganov  metal : hide debug messages from normal log
2024-11-13  SXX  ggml: fix zero division in ‘dne’ calculation in CUDA...
2024-11-13  amritahs-ibm  ggml : optimize llamafile cpu matrix multiplication...
2024-11-13  Georgi Gerganov  metal : opt-in compile flag for BF16 (llama/10218)
2024-11-13  Georgi Gerganov  metal : improve clarity (minor) (llama/10171)
2024-11-13  Georgi Gerganov  metal : optimize FA kernels (llama/10171)
2024-11-08  Georgi Gerganov  sync : llama.cpp
2024-11-08  Diego Devesa  ggml : add ggml-cpu.h to the public headers (llama...
2024-11-08  snadampal  fix q4_0_8_8 format for corrupted tokens issue (llama...
2024-11-08  Zhiyuan Li  Optimize RWKV6 Operator Naming and Implement Multi...