git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit

CUDA: optimize and refactor MMQ (llama/8416)

* CUDA: optimize and refactor MMQ

* explicit q8_1 memory layouts, add documentation

Packaging of ggerganov/whisper.cpp

ggml/src/ggml-cuda/mma.cuh
ggml/src/ggml-cuda/mmq.cuh
ggml/src/ggml-cuda/quantize.cu
ggml/src/ggml-cuda/quantize.cuh
ggml/src/ggml-cuda/vecdotq.cuh