
author	Ivy233 <redacted>
	Wed, 26 Mar 2025 14:06:04 +0000 (22:06 +0800)
committer	GitHub <redacted>
	Wed, 26 Mar 2025 14:06:04 +0000 (15:06 +0100)
commit	02082f1519565fc7b49de211b28bc5404a69209b
tree	d4f7b7240dfdb6ec41ac8b754327286b26650ae5
parent	df4d20cd53d5bb6fc21c1dc65f026d53b566d097

clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend (#12566)

* [Fix] When clip-quantize-cli is compiled and run in a CUDA environment, ggml_fp16_to_fp32 reports an error when it tries to access video memory. Quantization has to run on the CPU backend.
After the fix, quantization automatically runs on the CPU backend and is no longer bound to CUDA.

* [Fix] Roll back the signature and implementation of clip_model_load, and change the call in clip_model_quantize to use clip_init instead.
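
The idea behind the fix can be sketched in isolation as follows. This is a minimal illustration, not the code from this commit: `backend_kind`, `context_params`, `model_context`, `init_context`, and `load_for_quantize` are hypothetical stand-ins rather than the real clip.h declarations, and the `use_gpu` flag is an assumption about how a loader might select its backend. It shows only that the quantize path asks for the CPU backend explicitly, so the result does not depend on whether a GPU is available.

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; NOT the real clip.h API.
enum class backend_kind { cpu, gpu };

struct context_params {
    bool use_gpu; // assumption: requests the GPU backend when one is available
};

struct model_context {
    backend_kind backend; // backend that owns the weight buffers
    std::string  path;    // model file the context was loaded from
};

// Hypothetical loader: uses the GPU only if it is both requested and available.
model_context init_context(const std::string & path, context_params params, bool gpu_available) {
    const backend_kind kind =
        (params.use_gpu && gpu_available) ? backend_kind::gpu : backend_kind::cpu;
    return model_context{kind, path};
}

// Quantization converts weights in host code, so the loader is always asked
// for the CPU backend, regardless of how the binary was built.
model_context load_for_quantize(const std::string & path, bool gpu_available) {
    return init_context(path, context_params{/* use_gpu = */ false}, gpu_available);
}
```

With this shape, a regular inference caller can still pass `use_gpu = true`, while the quantize entry point stays on the CPU without changing the loader's signature.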