git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml: CUDA: add head size 72 for flash-attn (#16962)
author  theo77186 <redacted>
Mon, 3 Nov 2025 13:29:11 +0000 (14:29 +0100)
committer  GitHub <redacted>
Mon, 3 Nov 2025 13:29:11 +0000 (14:29 +0100)
commit  622cd010ff4a65bef67edbf2f9bf4707c01f98f7
tree    415f3251a8e8d1c9507f71f3a060fa9e21070770
parent  070ff4d5356083d60b807bb34d36b31c3653a29e
ggml/src/ggml-cuda/fattn-tile.cu
ggml/src/ggml-cuda/fattn-tile.cuh
ggml/src/ggml-cuda/fattn.cu
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq72-dv72.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/generate_cu_files.py
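The file list above shows the pattern this commit follows: flash-attention kernels in llama.cpp's CUDA backend are compiled as per-head-size template instances, and `generate_cu_files.py` emits one small `.cu` stub per supported size (here adding `fattn-tile-instance-dkq72-dv72.cu` for head size 72). The sketch below is a hypothetical, simplified generator illustrating that mechanism; the head-size list and the `DECL_FATTN_TILE_CASE` macro name are assumptions for illustration, not the script's actual contents.

```python
# Hypothetical sketch of a per-head-size template-instance generator,
# modeled on the naming seen in this commit's file list.
# The head-size list and macro name are illustrative assumptions.

HEAD_SIZES = [64, 72, 80, 96, 112, 128, 256]  # 72 is the size added here

SOURCE_TEMPLATE = """\
// This file has been autogenerated, do not edit manually.

#include "../fattn-tile.cuh"

DECL_FATTN_TILE_CASE({dkq}, {dv});
"""

def instance_filename(dkq: int, dv: int) -> str:
    # Matches the commit's new file: fattn-tile-instance-dkq72-dv72.cu
    return f"fattn-tile-instance-dkq{dkq}-dv{dv}.cu"

def generate_sources() -> dict[str, str]:
    # One instance per head size, with DKQ == DV (as in dkq72-dv72),
    # so each kernel variant compiles in its own translation unit.
    return {instance_filename(d, d): SOURCE_TEMPLATE.format(dkq=d, dv=d)
            for d in HEAD_SIZES}

if __name__ == "__main__":
    for name in sorted(generate_sources()):
        print(name)
```

Splitting instances into separate translation units like this keeps build parallelism high and avoids recompiling every kernel variant when one head size is added.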