git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

CUDA: batched+noncont MMQ, refactor bs>1 MoE code (#13199)

Packaging of ggml-org/llama.cpp

ggml/src/ggml-cuda/getrows.cu
ggml/src/ggml-cuda/getrows.cuh
ggml/src/ggml-cuda/ggml-cuda.cu
ggml/src/ggml-cuda/mmq.cu
ggml/src/ggml-cuda/mmq.cuh
ggml/src/ggml-cuda/mmvq.cu
ggml/src/ggml-cuda/quantize.cu
ggml/src/ggml-cuda/quantize.cuh
tests/test-backend-ops.cpp