git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: app option to compile without FlashAttention (#12025)
author Johannes Gäßler <redacted>
Sat, 22 Feb 2025 19:44:34 +0000 (20:44 +0100)
committer GitHub <redacted>
Sat, 22 Feb 2025 19:44:34 +0000 (20:44 +0100)
commit a28e0d5eb18c18e6a4598286158f427269b1444e
tree bd5f40fd067d3dc89178dc714e643c1371f80353
parent 36c258ee921dbb5c96bdc57c0872e4a9a129bef6
13 files changed:
Makefile
ggml/CMakeLists.txt
ggml/src/ggml-cuda/CMakeLists.txt
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-mma-f16.cuh
ggml/src/ggml-cuda/fattn-tile-f16.cu
ggml/src/ggml-cuda/fattn-tile-f32.cu
ggml/src/ggml-cuda/fattn-vec-f16.cuh
ggml/src/ggml-cuda/fattn-vec-f32.cuh
ggml/src/ggml-cuda/fattn-wmma-f16.cu
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-hip/CMakeLists.txt
ggml/src/ggml-musa/CMakeLists.txt