2024-06-16 |
Georgi Gerganov | ggml : improve ggml_is_contiguous logic (llama/7856) |
2024-06-16 |
k.h.lai | vulkan: select only one device for single gpu with... |
2024-06-16 |
0cc4m | Update Vulkan RoPE implementation (llama/7818) |
2024-06-16 |
Johannes Gäßler | CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K)... |
2024-06-16 |
Johannes Gäßler | CUDA: use tensor cores for MMQ (llama/7676) |
2024-06-16 |
Ben Ashbaugh | use the correct SYCL context for host USM allocations... |
2024-06-16 |
Johannes Gäßler | CUDA: revise q8_1 data layout for mul_mat_q (llama... |
2024-06-16 |
slaren | vulkan : reuse parent extra for views (llama/7806) |
2024-06-16 |
pengxin99 | fix softmax r2r result wrong issue (llama/7811) |
2024-06-16 |
Johannes Gäßler | CUDA: refactor mmq, dmmv, mmvq (llama/7716) |
2024-06-16 |
Georgi Gerganov | ggml : refactor rope norm/neox (llama/7634) |
2024-06-16 |
agray3 | Allow number of nodes in CUDA graph to change (llama... |
2024-06-16 |
Georgi Gerganov | ggml : remove OpenCL (llama/7735) |
2024-06-16 |
Georgi Gerganov | ggml : prevent builds with -ffinite-math-only (llama... |
2024-06-16 |
Radoslav Gerganov | llama : offload to RPC in addition to other backends... |
2024-06-16 |
Masaya, Kato | ggml : use OpenMP as a thread pool (llama/7606) |
2024-06-16 |
0cc4m | Vulkan Mixture of Experts (MoE) support (llama/7628) |
2024-06-16 |
woachk | kompute : implement op_getrows_f32 (llama/6403) |
2024-06-16 |
Dave Airlie | fix bug introduced in using calloc (llama/7701) |
2024-06-16 |
Johannes Gäßler | Fix FlashAttention debug test, FP32 assert (llama/7684) |
2024-06-16 |
Johannes Gäßler | CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8... |
2024-06-16 |
Johannes Gäßler | CUDA: quantized KV support for FA vec (llama/7527) |
2024-06-16 |
Georgi Gerganov | ggml : fix loongson compile warnings (llama/7537) |
2024-06-16 |
Chris Elrod | faster avx512 exp implementation (llama/7551) |
2024-06-16 |
junchao-loongson | ggml : fix loongarch build (O2 issue) (llama/7636) |
2024-06-16 |
Georgi Gerganov | metal : remove invalid asserts (llama/7617) |
2024-06-16 |
Georgi Gerganov | metal : add missing asserts (llama/7617) |
2024-06-16 |
Georgi Gerganov | ggml : fix YARN + add tests + add asserts (llama/7617) |
2024-06-16 |
Georgi Gerganov | cuda : non-cont concat support (llama/7610) |
2024-06-16 |
Radoslav Gerganov | llama-bench : add support for the RPC backend (llama... |
2024-06-16 |
slaren | ggml : use atomic_flag for critical section (llama... |
2024-06-16 |
Georgi Gerganov | examples : adapt to new ggml_concat (ggml/0) |
2024-06-16 |
zhouwg | ggml : fix typo in ggml.c (llama/7603) |
2024-06-16 |
Meng, Hengyu | Align GEMM dispatch (llama/7566) |
2024-06-16 |
Georgi Gerganov | sycl : fix assert (llama/7563) |
2024-06-16 |
k.h.lai | vulkan: properly initialize vulkan devices for LLAMA_SP... |
2024-06-16 |
Radoslav Gerganov | rpc : resource management rework (llama/7562) |
2024-06-16 |
Neo Zhang | fix ggml_sycl_mul_mat_id() to match the change of api... |
2024-06-16 |
Georgi Gerganov | ggml : generalize GGML_OP_CONCAT (llama/7563) |
2024-06-16 |
Djip007 | update HIP_UMA #7399 (llama/7414) |
2024-06-16 |
agray3 | Allow multiple copy function pointers for CUDA graph... |
2024-06-16 |
AidanBeltonS | Fix q_xxs using mul_mat_q (llama/7459) |
2024-06-16 |
AidanBeltonS | Add freq factors (llama/7495) |
2024-06-16 |
Georgi Gerganov | metal : add GGML_OP_REPEAT kernels (llama/7557) |
2024-06-16 |
Georgi Gerganov | metal : disable FA kernel for HS=256 (llama/7556) |
2024-06-16 |
Georgi Gerganov | ggml : restore ggml_rope_xpos_inplace (ggml/0) |
2024-06-16 |
Masaya, Kato | ggml: aarch64: SVE kernels for q8_0_q8_0, q4_0_q8_0... |
2024-06-16 |
Georgi Gerganov | ggml : silence UB sanitizer error during iq2_xxs quanti... |
2024-06-16 |
Georgi Gerganov | ggml : remove ggml_flash_attn and ggml_flash_ff (llama... |
2024-06-16 |
Georgi Gerganov | ggml : drop support for QK_K=64 (llama/7473) |
2024-06-16 |
0cc4m | Update vulkan rope implementation to support frequency... |
2024-06-16 |
Johannes Gäßler | CUDA: fix FA out-of-bounds reads (llama/7479) |
2024-06-16 |
Johannes Gäßler | CUDA: fix FA out-of-bounds writes (llama/7465) |
2024-06-16 |
Georgi Gerganov | cuda : fix compile warning (llama/7454) |
2024-06-16 |
Johannes Gäßler | CUDA: remove incorrect precision check (llama/7454) |
2024-06-16 |
Georgi Gerganov | cuda : fix rope + add tests (llama/7452) |
2024-06-16 |
liuwei-git | llama : add phi3 128K model support (llama/7225) |
2024-06-16 |
Georgi Gerganov | metal : handle F16 inf values, fix FA partial offload... |
2024-06-16 |
Johannes Gäßler | CUDA: fix unused warning in mmq.cu (llama/7442) |
2024-06-16 |
Johannes Gäßler | CUDA: deduplicate mmq code (llama/7397) |
2024-06-16 |
Radoslav Gerganov | rpc : track allocated buffers (llama/7411) |
2024-06-16 |
AidanBeltonS | Update SYCL upscale operation (llama/7321) |
2024-06-16 |
Herman Semenov | ggml-opencl, llama: using reserve() if count already... |
2024-06-16 |
junchao-loongson | ggml : add loongarch lsx and lasx support (llama/6454) |
2024-06-16 |
Srihari-mcw | Add provisions for windows support for BF16 code includ... |
2024-06-16 |
0cc4m | Vulkan Embedding Fix (llama/7360) |
2024-06-16 |
slaren | ggml : fix another case of quants nans (llama/7387) |
2024-06-16 |
Johannes Gäßler | ggml: implement quantized KV cache for FA (llama/7372) |
2024-06-16 |
slaren | cuda : clear error after buffer allocation failure... |
2024-06-16 |
fraxy-v | Capture CUDA logging output (llama/7298) |
2024-06-16 |
Georgi Gerganov | android : use "ci-android" branch for CI (llama/7341) |
2024-06-16 |
Johannes Gäßler | CUDA: deduplicate FlashAttention code (llama/7352) |
2024-06-16 |
Engininja2 | cuda : add half2 __shfl_xor() for ROCm 5.5 (llama/7263) |
2024-06-16 |
0cc4m | Update and fix Vulkan soft_max and argsort implementati... |
2024-06-16 |
slaren | ggml : fix quants nans when all the group weights are... |
2024-06-16 |
Johannes Gäßler | CUDA: faster large batch FA without tensor cores (llama... |
2024-06-16 |
Radoslav Gerganov | rpc : set SO_REUSEADDR for the server socket (llama... |
2024-06-16 |
Herman Semenov | ggml-quants, llama : removed excess checks (llama/7274) |
2024-06-16 |
Justine Tunney | ggml : rewrite silu and softmax for cpu (llama/7154) |
2024-06-16 |
Radoslav Gerganov | rpc : add command line arg for specifying backend memory |
2024-06-16 |
Max Krasnyansky | Add support for properly optimized Windows ARM64 builds... |
2024-06-16 |
kunnis | ggml : use dynamic thread scheduling for matrix multipl... |
2024-06-16 |
agray3 | Avoid unnecessarily disabling CUDA graphs (llama/7302) |
2024-06-16 |
slaren | ggml : tag ggml_tensor::backend as deprecated (llama... |
2024-06-16 |
AidanBeltonS | Add missing " (llama/7303) |
2024-06-16 |
John Balis | ggml : add `ggml_upscale_ext` (ggml/814) |
2024-06-16 |
Georgi Gerganov | scripts : update sync |
2024-06-13 |
Borislav Stanimirov | whisper : use ggml-cuda in mel calc, set appropriate... |
2024-06-11 |
Georgi Gerganov | cuda : fix HIPBLAS build (#2234) |
2024-06-11 |
Georgi Gerganov | cuda : fix bounds check for src0 rows in MMVQ kernel... |
2024-06-11 |
Georgi Gerganov | ci : fix CUDA builds (#2232) |
2024-06-10 |
Borislav Stanimirov | whisper : auto-grow working areas for mel_calc_cuda... |
2024-06-10 |
Georgi Gerganov | whisper : free whisper_mel instances (#2220) |
2024-06-06 |
Georgi Gerganov | whisper : whisper_state/backend fixes (#2217) |
2024-06-06 |
Borislav Stanimirov | whisper : calculate mel spectrogram directly into a... |
2024-06-04 |
Borislav Stanimirov | whisper : add CUDA-specific computation mel spectrogram... |
2024-05-31 |
Borislav Stanimirov | whisper : remove `speed_up` and `phase_vocoder*` functi... |
2024-05-30 |
Martin Delille | readme : add conan badge (#2196) |
2024-05-30 |
Carlos Zoido | readme : add install instructions for Conan (#2189) |
2024-05-29 |
Borislav Stanimirov | whisper: use global cache for sin/cos vals and Hann... |