git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: revise q8_1 data layout for mul_mat_q (llama/7824)
author Johannes Gäßler <redacted>
Sun, 9 Jun 2024 07:42:25 +0000 (09:42 +0200)
committer Georgi Gerganov <redacted>
Sat, 15 Jun 2024 19:05:47 +0000 (22:05 +0300)
commit 0610dddbc1679c86b5ff084870a414ba771b7e94
tree 7bc336061959e54f391278be3621329069b77371
parent a7d88a6c15ce58f058c3453497e615d682d44064
src/ggml-cuda.cu
src/ggml-cuda/mmq.cu
src/ggml-cuda/mmq.cuh
src/ggml-cuda/quantize.cu
src/ggml-cuda/quantize.cuh