git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Bowen Han <redacted>
	Thu, 18 Sep 2025 18:26:03 +0000 (11:26 -0700)
committer	GitHub <redacted>
	Thu, 18 Sep 2025 18:26:03 +0000 (20:26 +0200)
commit	38dbdf4c057515ccea9bec0ca2518f86d5e4d28e
tree	79fd2e85a4b45ae848ede5d3a1d68edec787b111
parent	368560a1e3b9a3bc83af741b0b2bc9e46fb420d2

CUDA: Optimize PAD_REFLECT_1D (#15957)

* CUDA: Optimize PAD_REFLECT_1D
feat: add more test cases for PAD_REFLECT_1D

* use fast_div to improve performance

* Apply suggestion from JohannesGaessler

Co-authored-by: Johannes Gäßler <redacted>

* Apply suggestion from JohannesGaessler

Co-authored-by: Johannes Gäßler <redacted>
* optimize

* use a concise expression to further speed up the CUDA kernel

---------

Co-authored-by: Johannes Gäßler <redacted>

ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/pad_reflect_1d.cu
tests/test-backend-ops.cpp

Packaging of ggml-org/llama.cpp