server : allow using LoRA adapters per-request (#10994)
author     Xuan Son Nguyen <redacted>
           Thu, 2 Jan 2025 14:05:18 +0000 (15:05 +0100)
committer  GitHub <redacted>
           Thu, 2 Jan 2025 14:05:18 +0000 (15:05 +0100)
commit     0da5d860266c6928b8c9408efbd264ae59fedda6
tree       edd1e4e9d3897381ba9b006480a1d9f525ccbf98
parent     a45433ba209ee0b33d02c7dc4c31f29894ad83a6
server : allow using LoRA adapters per-request (#10994)

* slot.can_batch_with

* lora per request

* test: force disable cache prompt

* move can_batch_with check

* fix condition

* add slow test with llama 8b

* update docs

* move lora change task to queue

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <redacted>
* lora_base

* remove redundant check

---------

Co-authored-by: Georgi Gerganov <redacted>
Files changed:
examples/server/README.md
examples/server/server.cpp
examples/server/tests/README.md
examples/server/tests/requirements.txt
examples/server/tests/unit/test_lora.py
examples/server/tests/unit/test_speculative.py
examples/server/tests/utils.py
examples/server/utils.hpp
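
For context, the feature added by this commit lets a client scale LoRA adapters on a per-request basis instead of only server-wide. The sketch below shows what such a request might look like against a running llama-server started with one or more --lora adapters. The /completion endpoint path and the "lora"/"id"/"scale" field names follow the updated examples/server/README.md as I understand it; the host, port, prompt, and helper function are assumptions for illustration and may differ in other server versions.

# Sketch only: per-request LoRA scaling against a local llama-server.
# Assumed startup (hypothetical paths):
#   llama-server -m base-model.gguf --lora my-adapter.gguf
import requests

SERVER_URL = "http://127.0.0.1:8080"  # assumed default host/port

def complete(prompt: str, lora_config=None) -> str:
    """Send a completion request, optionally setting LoRA scales for this request only."""
    payload = {
        "prompt": prompt,
        "n_predict": 32,
        # the commit's tests disable prompt caching when switching adapters between requests
        "cache_prompt": False,
    }
    if lora_config is not None:
        # lora_config is a list like [{"id": 0, "scale": 1.0}], where "id" is the
        # index of the adapter as given on the server command line
        payload["lora"] = lora_config
    res = requests.post(f"{SERVER_URL}/completion", json=payload, timeout=60)
    res.raise_for_status()
    return res.json()["content"]

if __name__ == "__main__":
    # adapter present but scaled to 0.0: behaves like the base model
    print(complete("Look in thy glass", lora_config=[{"id": 0, "scale": 0.0}]))
    # same request with the adapter fully applied
    print(complete("Look in thy glass", lora_config=[{"id": 0, "scale": 1.0}]))

Carrying the adapter scales in the request also explains the batching-related bullets above: slots whose requests use different LoRA configurations cannot be decoded in the same batch, which appears to be what the slot.can_batch_with check in the commit message refers to.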