- New code should follow the guidelines (coding, naming, etc.) outlined in this document. Exceptions are allowed in isolated, backend-specific parts of the code that do not interface directly with the `ggml` interfaces.
_(NOTE: for legacy reasons, existing code is not required to follow this guideline)_
+- For changes to the server, please refer to the [server development documentation](./tools/server/README-dev.md).
+
# Documentation
- Documentation is a community effort
If you are an end user consuming `llama-server` as a product, please refer to the main [README](./README.md) instead.
+## Scope of features
+
+In-scope features:
+
+- Backend:
+ - Basic inference features: text completion, embeddings output
+ - Chat-oriented features: chat completion, tool calling
+ - Third-party API compatibility, e.g. OAI-compat, Anthropic-compat
+ - Multimodal input/output
+ - Memory management: save/load state, context checkpoints
+ - Model management
+ - Features that are required by the Web UI
+- Frontend:
+ - Chat-oriented features, e.g. basic chat, image upload, message editing
+ - Agentic features, e.g. MCP
+ - Model management
+
+Note: For security reasons, features that require reading or writing external files must be **disabled by default**. This includes features such as MCP and model save/load.
+
+Out-of-scope features:
+
+- Backend:
+ - Features that require a loop of external API calls, e.g. server-side agentic loop. This is because external API calls in C++ are costly to maintain. Any complex third-party logic should be implemented outside of server code.
+ - Features that expose the internal state of the model through the API, e.g. retrieving intermediate activations. llama.cpp does not provide a stable API for this, and relying on `eval_callback` is hard to maintain because that callback is not intended for use in a multi-sequence setup.
+ - Model-specific features. All API calls and features must remain model-agnostic.
+- Frontend:
+ - Third-party plugins. Maintaining a public plugin API for such features is costly. Instead, users can build their own MCP server for their needs.
+ - Customizable themes. These are also costly to maintain. While we do care about aesthetics, we aim to achieve this by perfecting a small set of themes.
+ - Browser-specific features, e.g. [Chrome's built-in AI API](https://developer.chrome.com/docs/ai/built-in-apis).
+
## Backend
### Overview