git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: add option to compile without FlashAttention (llama/12025)
author Johannes Gäßler <redacted>
Sat, 22 Feb 2025 19:44:34 +0000 (20:44 +0100)
committer Georgi Gerganov <redacted>
Tue, 25 Feb 2025 11:33:09 +0000 (13:33 +0200)
commit 26c0f4794b561cc11a2eb8edb2e758e83ac57332
tree 5ed14ebf605c67c186ab40929dee2960a7046895
parent f41f57aa499e6298413db9a242fe88cc88382e15
12 files changed:
CMakeLists.txt
src/ggml-cuda/CMakeLists.txt
src/ggml-cuda/common.cuh
src/ggml-cuda/fattn-mma-f16.cuh
src/ggml-cuda/fattn-tile-f16.cu
src/ggml-cuda/fattn-tile-f32.cu
src/ggml-cuda/fattn-vec-f16.cuh
src/ggml-cuda/fattn-vec-f32.cuh
src/ggml-cuda/fattn-wmma-f16.cu
src/ggml-cuda/ggml-cuda.cu
src/ggml-hip/CMakeLists.txt
src/ggml-musa/CMakeLists.txt