git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (llama/13014)
author Johannes Gäßler <redacted>
Tue, 22 Apr 2025 19:27:40 +0000 (21:27 +0200)
committer Georgi Gerganov <redacted>
Thu, 24 Apr 2025 17:39:16 +0000 (20:39 +0300)
commit 3d54b68ea7f6346a9f4d4f26047ace5064181920
tree de8f259b6cdf0f47a5faee4e545a62b21190c993
parent 11218294db0811fd39270af627de6a8c51186467

* CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID

* fix logic for RoPE support, CUDA graphs
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/mmv.cu
ggml/src/ggml-cuda/mmv.cuh
ggml/src/ggml-cuda/mmvq.cu
ggml/src/ggml-cuda/mmvq.cuh
ggml/src/ggml-cuda/quantize.cu
ggml/src/ggml-cuda/quantize.cuh
ggml/src/ggml-cuda/vecdotq.cuh