ggml : full ALiBi support (#7192)
author Georgi Gerganov <redacted>
Sat, 11 May 2024 07:32:41 +0000 (10:32 +0300)
committer GitHub <redacted>
Sat, 11 May 2024 07:32:41 +0000 (10:32 +0300)
commit 9cb317f77e53067f7a138cc89ef7657148eae8e6
tree 3ba1d2d80d1d7c8b4ab01f6396a3febaae26e91b
parent e849648888a11de13aaaa4cb2eda3f5a9c7b444d
ggml : full ALiBi support (#7192)

* ggml : full ALiBi support

* ggml : update ggml_soft_max_ext() CUDA, SYCL

* ggml : ggml_flash_attn_ext() support ALiBi (CPU)

* ggml : ggml_flash_attn_ext() support ALiBi (Metal)

* ggml : fix warning

* ggml : ggml_flash_attn_ext() support ALiBi (CUDA)

ggml-ci

* ggml : fix assert message

* vulkan : add dev notes

* ggml : require mask when using ALiBi

ggml-ci

* convert : fix conversion for Refact models
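For context, the ALiBi mechanism that this commit folds into `ggml_soft_max_ext()` and `ggml_flash_attn_ext()` adds a linear, per-head position bias to the attention scores before the softmax. The sketch below is a minimal standalone illustration of that math (the standard ALiBi slope schedule from Press et al.), not the ggml implementation; the function names are hypothetical:

```python
import math

def alibi_slopes(n_heads):
    # Standard ALiBi slope schedule for a power-of-two head count:
    # m_h = 2^(-8h/n) for h = 1..n. Heads with larger h decay faster.
    return [2.0 ** (-8.0 * h / n_heads) for h in range(1, n_heads + 1)]

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def alibi_softmax_row(scores, slope, query_pos):
    # Add the linear distance penalty before normalizing:
    # biased[j] = score[j] + slope * (j - query_pos),
    # so keys farther behind the query are penalized more.
    biased = [s + slope * (j - query_pos) for j, s in enumerate(scores)]
    return softmax(biased)
```

Fusing this bias into the softmax (rather than keeping a separate `ggml_alibi` op, whose CUDA files are deleted below) lets the bias be applied in-register inside the fused softmax and flash-attention kernels instead of materializing a biased score matrix.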
16 files changed:
convert-hf-to-gguf.py
ggml-cuda.cu
ggml-cuda/alibi.cu [deleted file]
ggml-cuda/alibi.cuh [deleted file]
ggml-cuda/fattn.cu
ggml-cuda/softmax.cu
ggml-kompute.cpp
ggml-metal.m
ggml-metal.metal
ggml-sycl.cpp
ggml-vulkan.cpp
ggml.c
ggml.h
gguf-py/gguf/tensor_mapping.py
llama.cpp
tests/test-backend-ops.cpp