git.djapps.eu Git - pkg/ggml/sources/ggml/shortlog
2025-05-01  SXX                ggml: move fp16/bf16 conversion optimizations to CPU...
2025-05-01  Xuan-Son Nguyen    clip : fix pixtral on some GPU backends (llama/13097)
2025-05-01  Neo Zhang Jianyu   change the reorder tensor from init to execute OP ...
2025-05-01  Radoslav Gerganov  rpc : do not wait for response when sending RPC_CMD_SET...
2025-04-30  Diego Devesa       ggml : fix ggml_gallocr_ptr type (#1205)
2025-04-30  Georgi Gerganov    media : rm logos (#1203)
2025-04-29  Georgi Gerganov    sync : whisper.cpp
2025-04-29  Georgi Gerganov    cuda : fix unused variable compile warning (whisper/0)
2025-04-24  Georgi Gerganov    opencl : remove obsolete files (skip) (#1200)
2025-04-24  Georgi Gerganov    sync : llama.cpp upstream/0.0.1982
2025-04-24  Georgi Gerganov    metal : add memory pool for temp allocs (llama/12850)
2025-04-24  lhez               opencl: split ggml-opencl.cl into multiple files and...
2025-04-24  Georgi Gerganov    ggml : fix trailing whitespaces (llama/0)
2025-04-24  Johannes Gäßler    CUDA: use switch statements in constexpr functions...
2025-04-24  Georgi Gerganov    metal : fix floating-point range of attention scores...
2025-04-24  Eve                vulkan: matmul gcn tuning (llama/13016)
2025-04-24  Johannes Gäßler    CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (llama...
2025-04-24  Diego Devesa       ggml : add SSE 4.2 and x64 base variant for CPUs withou...
2025-04-24  Akarshan Biswas    SYCL: Add non-contiguous support in ROPE (llama/12993)
2025-04-24  Jeff Bolz          vulkan: support noncontiguous rms_norm (llama/13031)
2025-04-24  Jeffrey Morgan     metal: add neg operator (llama/13029)
2025-04-24  Akarshan Biswas    SYCL: Refactor and enable FP16 in binary broadcast...
2025-04-24  Radoslav Gerganov  rpc : add RPC_CMD_HELLO (llama/12955)
2025-04-24  Georgi Gerganov    graph : make FA compatible with MLA + add initial Metal...
2025-04-24  Alan Gray          ggml: Re-enable CUDA graphs in presence of CONT and...
2025-04-24  hipudding          CANN: Add support for async operator submission (llama...
2025-04-24  kimminsu           opencl: fix incorrect local_size index in profiling...
2025-04-24  Jeff Bolz          vulkan: enable coopmat2 FA gqa and split_k optimization...
2025-04-24  Chenguang Li       CANN: Add 310P operator support check (llama/12962)
2025-04-24  Georgi Gerganov    metal : add FA-vec kernels for head size 96 (llama...
2025-04-24  hipudding          CANN: Add x86 build ci (llama/12950)
2025-04-24  David Huang        CUDA/HIP: Share the same unified memory allocation...
2025-04-24  Akarshan Biswas    SYCL: Add ROPE vision kernel (llama/12887)
2025-04-24  Srihari-mcw        ggml : Add AVX512 implementation of GEMM - Q4_Kx8 ...
2025-04-24  Chenguang Li       CANN: Opt ROPE optimization (llama/12865)
2025-04-24  Xinpeng Dou        CANN: Optimize CANN buffer pool memory management ...
2025-04-24  Akarshan Biswas    SYCL: Fix im2col (llama/12910)
2025-04-24  Radoslav Gerganov  rpc : use ggml_context_ptr (llama/12938)
2025-04-24  Georgi Gerganov    scripts : update sync-llama-am.sh
2025-04-19  Leonard Mosescu    tests : Fix a few small Windows / MSVC build issues...
2025-04-17  Acly               ggml : Depthwise 2D convolution (#1152)
2025-04-14  Georgi Gerganov    sync : llama.cpp
2025-04-14  SXX                ggml: use _mm[512/256]_dpbusd[_avx]_epi32 to directly...
2025-04-14  Alan Gray          ggml: disable CUDA graphs for unsupported DUP and CONT...
2025-04-14  Jeff Bolz          vulkan: use aligned loads for flash attention mask...
2025-04-14  Ewan Crawford      sycl: Support sycl_ext_oneapi_limited_graph (llama...
2025-04-14  Akarshan Biswas    SYCL: Add fp16 type support to unary op kernels (llama...
2025-04-14  Aaron Teo          ggml: fix compilation error s390x (llama/12848)
2025-04-14  Georgi Gerganov    tests : fix init order (llama/0)
2025-04-11  cmdr2              cpu: fix cpu backend's supports-op for GET_ROWS_BACK...
2025-04-10  Georgi Gerganov    sync : fix (skip) (#0)
2025-04-10  Georgi Gerganov    sync : llama.cpp
2025-04-10  Chenguang Li       CANN: Support more ops (llama/12841)
2025-04-10  Prajwal B Mehendarkar  Fixes #12823 (llama/12830)
2025-04-10  Piotr Kubaj        ggml-cpu-impl.h: do not redefine bool on POWER9 (llama...
2025-04-10  Piotr Kubaj        ggml-impl.h: fix build on POWER9 (llama/12855)
2025-04-10  Chenguang Li       CANN: Support Opt CONV_TRANSPOSE_1D and ELU (llama...
2025-04-10  Jeff Bolz          vulkan: In coopmat2 mmq, load q4_k/q5_k scales through...
2025-04-10  Jeff Bolz          vulkan: Use fp16 for the flash attention P*V multiplica...
2025-04-10  Sigbjørn Skjæret   cuda : add f32 to bf16 copy op (llama/12806)
2025-04-10  Georgi Gerganov    llama : fix FA when KV cache is not used (i.e. embeddin...
2025-04-10  cmdr2              ggml: don't include arm_neon.h when using CUDA 12 with...
2025-04-09  Diego Devesa       ggml : add bilinear upscale support (#1185)
2025-04-09  Diego Devesa       ggml : add more generic custom op, remove deprecated...
2025-04-08  Georgi Gerganov    sync : llama.cpp
2025-04-08  Neo Zhang Jianyu   Revert "sycl:remove redundant memcopy in function ggml_...
2025-04-08  lhez               opencl: better identify Adreno GPU (llama/12760)
2025-04-08  Georgi Gerganov    cuda : fix HIP and MUSA BF16 (llama/0)
2025-04-08  zhouwg             sycl: remove redundant memcopy in function ggml_backend...
2025-04-08  zhouwg             CANN: fix typo in ggml-cann (llama/12733)
2025-04-08  hipudding          CANN: Refactor to reduce duplicate code (llama/12731)
2025-04-08  R0CKSTAR           musa: fix compilation warnings in mp_22/31 (llama/12780)
2025-04-08  Jeff Bolz          vulkan: fix NaN issue in flash attention shader (llama...
2025-04-08  Jeff Bolz          vulkan: Use unclamped loads for flash attention mask...
2025-04-08  0cc4m              Vulkan: Tune Vulkan mmq int dot shader for performance...
2025-04-08  Nicolò Scipione    sycl: allow ggml-sycl configuration and compilation...
2025-04-08  Ronny Brendel      cmake: fix ggml-shaders-gen compiler paths containing...
2025-04-08  Jeff Bolz          vulkan: Hybrid waitForFences/getFenceStatus to reduce...
2025-04-08  Jeff Bolz          vulkan: set cmake minimum and project name in vulkan...
2025-04-08  Gaurav Garg        CUDA: Prefer vector flash decoding kernel for Gemma...
2025-04-08  Jeff Bolz          vulkan: Fix missing cmake logic for dot product extensi...
2025-04-08  a3sh               fix MUSA compiler warning (llama/12704)
2025-04-08  Chenguang Li       CANN: Support operator SIN COS ARGMAX (llama/12709)
2025-04-08  Alan Gray          Simplify and improve CUDA graphs through use of indirec...
2025-04-08  hipudding          CANN: Fix failed test cases (llama/12708)
2025-04-08  lhez               opencl: use `max_alloc_size` in backend ctx instead...
2025-04-08  Jeff Bolz          vulkan: Implement split_k for coopmat2 flash attention...
2025-04-08  bandoti            cmake: remove caching from vulkan coopmat checks (llama...
2025-04-08  Jeff Bolz          vulkan: Implement grouped query attention in the coopma...
2025-04-08  0cc4m              Vulkan: Fix mmq int dot float cache size (llama/12722)
2025-04-08  Diego Devesa       llama : add option to override model tensor buffers...
2025-04-07  Georgi Gerganov    ggml : simplify Arm fp16 CPU logic (#1177)
2025-04-04  Sigbjørn Skjæret   CUDA: don't convert BF16 weights to FP32 (#1174)
2025-04-03  Georgi Gerganov    sync : whisper.cpp upstream/0.0.1898
2025-04-02  cmdr2              cpu: move all the operators into a separate c++ file...
2025-04-02  Georgi Gerganov    sync : llama.cpp
2025-04-02  Chenguang Li       get_rows and dup optimization (llama/12671)
2025-04-02  Junil Kim          opencl : fix memory allocation size (llama/12649)
2025-04-02  Georgi Gerganov    metal : use F32 prec in FA kernels (llama/12688)
2025-04-02  R0CKSTAR           Fix clang warning in gguf_check_reserved_keys (llama...