git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Shelby Jenkins <redacted>
	Tue, 4 Feb 2025 11:20:55 +0000 (05:20 -0600)
committer	GitHub <redacted>
	Tue, 4 Feb 2025 11:20:55 +0000 (13:20 +0200)
commit	106045e7bb8db481bb2ebbc60e3b53cb27ada117
tree	cbdfe19a3f9e45854c937ffa17c246f51940bc2f
parent	f117d84b48104992ba16961b35a96fa93efbb355

readme : add llm_client Rust crate to readme bindings (#11628)

[This crate](https://github.com/ShelbyJenkins/llm_client) has been in a usable state for quite a while, so I figured now is a fair time to add it.

It installs from crates.io, then automatically downloads the llama.cpp repo and builds it for the target platform. The goal is the easiest user experience possible.

It also integrates model presets and picks the largest quant that fits in the target's available VRAM. So a user just has to specify one of the presets (I manually add the most popular models), and it will download the model from Hugging Face.
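As a rough illustration of the quant selection idea, here is a minimal sketch. It is not llm_client's actual code: the `Quant` type, the `largest_fitting_quant` function, and the fixed `overhead_bytes` allowance (for context, KV cache, and so on) are all hypothetical names and assumptions made for this example.

```rust
/// One downloadable quantization of a model.
/// Hypothetical type for illustration only, not llm_client's API.
#[derive(Debug, Clone, PartialEq)]
struct Quant {
    name: &'static str,
    size_bytes: u64,
}

/// Return the largest quant whose size plus a fixed overhead allowance
/// fits in the available VRAM, or `None` if nothing fits.
/// `overhead_bytes` is an assumed, caller-supplied safety margin.
fn largest_fitting_quant(
    quants: &[Quant],
    vram_bytes: u64,
    overhead_bytes: u64,
) -> Option<&Quant> {
    quants
        .iter()
        // Keep only quants that fit; saturating_add avoids overflow.
        .filter(|q| q.size_bytes.saturating_add(overhead_bytes) <= vram_bytes)
        // Among those, prefer the biggest file (highest fidelity).
        .max_by_key(|q| q.size_bytes)
}
```

A real implementation would likely estimate the overhead from the context length and model architecture rather than take a constant.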

So, it's like a Rust Ollama, but it's not really for chatting. It makes heavy use of llama.cpp's grammar system to produce structured output for decision-making and control-flow tasks.
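To make the grammar point concrete, here is a small sketch of how a decision could be constrained with a GBNF grammar, the format llama.cpp's grammar system accepts. The `choice_grammar` helper is a hypothetical name for this example and is not taken from llm_client; it only builds the grammar text and does not call llama.cpp.

```rust
/// Build a GBNF grammar whose `root` rule accepts exactly one of `choices`.
/// Hypothetical helper for illustration; callers should pass at least one
/// choice, since an empty list yields a rule with no alternatives.
fn choice_grammar(choices: &[&str]) -> String {
    let alternatives: Vec<String> = choices
        .iter()
        // Quote each choice as a GBNF string literal, escaping
        // backslashes first and then double quotes.
        .map(|c| format!("\"{}\"", c.replace('\\', "\\\\").replace('"', "\\\"")))
        .collect();
    format!("root ::= {}", alternatives.join(" | "))
}
```

Passing the resulting string as the sampling grammar restricts the model's output to one of the listed strings, which the calling program can then branch on directly.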