2024-08-08 |
Nicholai Tukanov | ggml : add NVPL BLAS support (ggml/8329) (llama/8425) |
2024-08-08 |
Daniel Bevenius | cuda : suppress 'noreturn' warn in no_device_code ... |
2024-08-08 |
Johannes Gäßler | CUDA: optimize and refactor MMQ (llama/8416) |
2024-08-08 |
AidanBeltonS | Use multi_ptr to clean up deprecated warnings (llama... |
2024-08-08 |
Georgi Gerganov | ggml : move sgemm sources to llamafile subfolder (llama... |
2024-08-08 |
Dibakar Gope | ggml : add AArch64 optimized GEMV and GEMM Q4 kernels... |
2024-08-08 |
Alberto Cabrera... | sycl : Reenabled mmvq path for the SYCL Nvidia Backend... |
2024-08-08 |
Alberto Cabrera... | sycl : fix powf call in device code (llama/8368) |
2024-08-08 |
Mahesh Madhav | ggml : loop tiling optimizations for scalar path (ggml... |
2024-08-08 |
Ivan Filipov | ggml: add support for float16 input tensors in pooling... |
2024-08-08 |
Tony Wasserka | vulkan : initialize vk_buffer_struct members to VK_NULL... |
2024-08-08 |
Borislav Stanimirov | cmake : only enable GGML_NATIVE and x86 flags if not... |
2024-08-08 |
Georgi Gerganov | scripts : sync new files (#0) |
2024-08-05 |
Daven Sanassy | cmake : fix compile in xcode (#2311) |
2024-07-27 |
Georgi Gerganov | whisper : handle empty mel (#2324) |
2024-07-16 |
Matt Stephenson | whisper : use vulkan as gpu backend when available... |
2024-07-15 |
arizhih | whisper : fix DTW assert (#2299) |
2024-07-09 |
Georgi Gerganov | cmake : use WHISPER_EXTRA_FLAGS (#2294) |
2024-07-09 |
Borislav Stanimirov | cmake : allow external ggml |
2024-07-08 |
Georgi Gerganov | cmake : try to fix openvino build (#2281) |
2024-07-08 |
Georgi Gerganov | cmake : remove install of llama convert script [no... |
2024-07-08 |
Georgi Gerganov | make : remove llama prints [no ci] (#2265) |
2024-07-08 |
Georgi Gerganov | talk-llama : sync llama.cpp |
2024-07-08 |
Georgi Gerganov | examples : fix compile warnings [no ci] (#0) |
2024-07-08 |
Georgi Gerganov | sync : ggml |
2024-07-08 |
Georgi Gerganov | ggml : sync sycl (skip) (#0) |
2024-07-08 |
Georgi Gerganov | scripts : fix sync scripts |
2024-07-08 |
Daniel Bevenius | ggml : remove unnecessary UNUSED macro call (ggml/880) |
2024-07-08 |
Natsu | cmake : add GGML_BUILD and GGML_SHARED macro definition... |
2024-07-08 |
Ouadie EL FAROUKI | Enabled more data types for oneMKL gemm_batch (llama... |
2024-07-08 |
Johannes Gäßler | CUDA: MMQ support for iq4_nl, iq4_xs (llama/8278) |
2024-07-08 |
Daniele | CUDA: revert part of the RDNA1 optimizations (llama... |
2024-07-08 |
Johannes Gäßler | CUDA: fix MMQ stream-k rounding if ne00 % 128 != 0... |
2024-07-08 |
luoyu-intel | Fix WARP_SIZE=16 bug of Intel GPU (llama/8266) |
2024-07-08 |
Neo Zhang Jianyu | rm get_work_group_size() by local cache for performance... |
2024-07-08 |
Daniele | Define and optimize RDNA1 (llama/8085) |
2024-07-08 |
Judd | fix typo (llama/8267) |
2024-07-08 |
Clint Herron | Removes multiple newlines at the end of files that... |
2024-07-08 |
slaren | cuda : update supports_op for matrix multiplication... |
2024-07-08 |
luoyu-intel | Fix win build conflict of math library (llama/8230) |
2024-07-08 |
luoyu-intel | Fix the sub group size of Intel (llama/8106) |
2024-07-08 |
Johannes Gäßler | CUDA: refactor and optimize IQ MMVQ (llama/8215) |
2024-07-08 |
zhentaoyu | Update SYCL-Rope op and Refactor (llama/8157) |
2024-07-08 |
Johannes Gäßler | CUDA: fix MMQ stream-k for --split-mode row (llama... |
2024-07-08 |
John Balis | feat: cuda implementation for `ggml_conv_transpose_1d... |
2024-07-08 |
Georgi Gerganov | ci : disable java build |
2024-07-08 |
Emmanuel Schmidbauer | server : add inference path to make OAI API compatible... |
2024-06-26 |
Georgi Gerganov | sync : ggml + fix sync script |
2024-06-26 |
Georgi Gerganov | make : disable CUDA graphs |
2024-06-26 |
slaren | ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CU... |
2024-06-26 |
Georgi Gerganov | make : disable CUDA mel build |
2024-06-26 |
Georgi Gerganov | cmake : minor fixes |
2024-06-26 |
Georgi Gerganov | make : fix missing -O3 |
2024-06-26 |
Georgi Gerganov | whisper : disable CUDA mel + fix FFMPEG |
2024-06-26 |
Georgi Gerganov | sync : ggml |
2024-06-26 |
Georgi Gerganov | whisper : reorganize source code + improve CMake (... |
2024-06-18 |
mky_coder | whisper : optimize fft() function (#2242) |
2024-06-18 |
Georgi Gerganov | talk-llama : sync llama.cpp |
2024-06-18 |
Georgi Gerganov | whisper : use ggml_backend_sched (#2239) |
2024-06-18 |
Georgi Gerganov | fix : remove extra files |
2024-06-18 |
Georgi Gerganov | scripts : sync ggml-blas |
2024-06-18 |
Georgi Gerganov | build : update make / cmake |
2024-06-18 |
Georgi Gerganov | sync : ggml |
2024-06-18 |
slaren | move BLAS to a separate backend (cont) (llama/6210) |
2024-06-18 |
0cc4m | Vulkan Shader Refactor, Memory Debugging Option (llama... |
2024-06-18 |
Georgi Gerganov | scripts : stop sync whisper example from ggml |
2024-06-16 |
Georgi Gerganov | cmake : fix sycl build (#0) |
2024-06-16 |
Georgi Gerganov | ggml : remove OpenCL (#0) |
2024-06-16 |
Georgi Gerganov | sycl : sync (#0) |
2024-06-16 |
Georgi Gerganov | cuda : enable CUDA graphs (#0) |
2024-06-16 |
Georgi Gerganov | talk-llama : sync llama.cpp |
2024-06-16 |
Georgi Gerganov | cmake : fix CUDA build (#0) |
2024-06-16 |
Georgi Gerganov | sync : ggml |
2024-06-16 |
Hong Bo PENG | ggml : fix and optimize ppc64le (ggml/849) |
2024-06-16 |
Daniel Bevenius | ggml : remove duplicate include of ggml-common.h (ggml... |
2024-06-16 |
Meng, Hengyu | remove global variables (llama/7710) |
2024-06-16 |
Johannes Gäßler | CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (llama... |
2024-06-16 |
Georgi Gerganov | metal : utilize max shared memory for mul_mat_id (llama... |
2024-06-16 |
Radoslav Gerganov | rpc : fix ggml_backend_rpc_supports_buft() (llama/7918) |
2024-06-16 |
slaren | move BLAS to a separate backend (llama/6210) |
2024-06-16 |
Johannes Gäßler | CUDA: fix broken oob check for FA vec f32 kernel (llama... |
2024-06-16 |
Georgi Gerganov | tests : add non-cont unary tests (llama/7857) |
2024-06-16 |
Georgi Gerganov | ggml : improve ggml_is_contiguous logic (llama/7856) |
2024-06-16 |
k.h.lai | vulkan: select only one device for single gpu with... |
2024-06-16 |
0cc4m | Update Vulkan RoPE implementation (llama/7818) |
2024-06-16 |
Johannes Gäßler | CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K)... |
2024-06-16 |
Johannes Gäßler | CUDA: use tensor cores for MMQ (llama/7676) |
2024-06-16 |
Ben Ashbaugh | use the correct SYCL context for host USM allocations... |
2024-06-16 |
Johannes Gäßler | CUDA: revise q8_1 data layout for mul_mat_q (llama... |
2024-06-16 |
slaren | vulkan : reuse parent extra for views (llama/7806) |
2024-06-16 |
pengxin99 | fix softmax r2r result wrong issue (llama/7811) |
2024-06-16 |
Johannes Gäßler | CUDA: refactor mmq, dmmv, mmvq (llama/7716) |
2024-06-16 |
Georgi Gerganov | ggml : refactor rope norm/neox (llama/7634) |
2024-06-16 |
agray3 | Allow number of nodes in CUDA graph to change (llama... |
2024-06-16 |
Georgi Gerganov | ggml : remove OpenCL (llama/7735) |
2024-06-16 |
Georgi Gerganov | ggml : prevent builds with -ffinite-math-only (llama... |
2024-06-16 |
Radoslav Gerganov | llama : offload to RPC in addition to other backends... |
2024-06-16 |
Masaya, Kato | ggml : use OpenMP as a thread pool (llama/7606) |
2024-06-16 |
0cc4m | Vulkan Mixture of Experts (MoE) support (llama/7628) |
2024-06-16 |
woachk | kompute : implement op_getrows_f32 (llama/6403) |