![](doc/logo.png)

# LLamaWorker

LLamaWorker is an HTTP API server developed on top of the [LLamaSharp](https://github.com/SciSharp/LLamaSharp?wt.mc_id=DT-MVP-5005195) project. It provides an OpenAI-compatible API, making it easy for developers to integrate Large Language Models (LLMs) into their applications.

English | [中文](README_CN.md)

## Features

- **OpenAI API Compatible**: Offers an API similar to OpenAI's / Azure OpenAI, making migration and integration easy.
- **Multi-Model Support**: Supports configuring and switching between different models to meet the needs of various scenarios.
- **Streaming Response**: Supports streaming responses to improve the efficiency of processing large responses.
- **Embedding Support**: Provides text embedding functionality with support for various embedding models.
- **Chat Templates**: Provides some common chat templates.
- **Auto-Release**: Supports automatic release of loaded models.
- **Function Call**: Supports function calls.
- **API Key Authentication**: Supports API Key authentication.
- **Gradio UI Demo**: Provides a UI demo based on Gradio.NET.

## Use the Vulkan Compiled Version

A build with the Vulkan backend is provided in the releases; download the version for your platform from [Releases](../../releases):

- `LLamaWorker-Vulkan-win-x64.zip`
- `LLamaWorker-Vulkan-linux-x64.zip`

After downloading and unzipping, modify the configuration in the `appsettings.json` file, then run the software and start using it.
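Because the API follows the OpenAI schema, any plain HTTP client can talk to a running instance. Below is a minimal sketch using only the Python standard library; the base URL `http://localhost:5000` is an assumption (adjust it to wherever your instance listens), and the `Authorization` header is only needed when API Key authentication is enabled:

```python
import json
import urllib.request

# Assumed base URL -- change to match where your LLamaWorker instance listens.
BASE_URL = "http://localhost:5000"


def build_chat_request(prompt: str, model: str = "default") -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, api_key: str = "") -> dict:
    """POST the payload to /v1/chat/completions and decode the JSON reply."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    if api_key:  # only required when API Key authentication is enabled
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# With a server running:
#   reply = chat("Hello!")
#   print(reply["choices"][0]["message"]["content"])
```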
> For other backends, you can also download the `Vulkan` version, then fetch the corresponding compiled build from [llama.cpp](https://github.com/ggerganov/llama.cpp/releases) and replace the relevant libraries. You can also compile the `llama.cpp` project yourself to get the required libraries.

## Function Call

LLamaWorker supports function calls. The configuration file currently provides three templates, and function calling has been tested with `Phi-3`, `Qwen2`, and `Llama3.1`. Function calls are compatible with OpenAI's API; you can test them with the following JSON request:

`POST /v1/chat/completions`

```json
{
  "model": "default",
  "messages": [
    {
      "role": "user",
      "content": "Where is the temperature high between Beijing and Shanghai?"
    }
  ],
  "tools": [
    {
      "function": {
        "name": "GetWeatherPlugin-GetCurrentTemperature",
        "description": "Get the current temperature of the specified city.",
        "parameters": {
          "type": "object",
          "required": ["city"],
          "properties": {
            "city": {
              "type": "string",
              "description": "City Name"
            }
          }
        }
      },
      "type": "function"
    },
    {
      "function": {
        "name": "EmailPlugin-SendEmail",
        "description": "Send an email to the recipient.",
        "parameters": {
          "type": "object",
          "required": ["recipientEmails", "subject", "body"],
          "properties": {
            "recipientEmails": {
              "type": "string",
              "description": "A recipient email list separated by semicolons"
            },
            "subject": { "type": "string" },
            "body": { "type": "string" }
          }
        }
      },
      "type": "function"
    }
  ],
  "tool_choice": "auto"
}
```

## Compile and Run

1. Clone the repository locally

   ```bash
   git clone https://github.com/sangyuxiaowu/LLamaWorker.git
   ```

2. Enter the project directory

   ```bash
   cd LLamaWorker
   ```

3. Choose the project file according to your needs. The project provides four backend project files:

   - `LLamaWorker.Backend.Cpu`: For CPU environments.
   - `LLamaWorker.Backend.Cuda11`: For GPU environments with CUDA 11.
   - `LLamaWorker.Backend.Cuda12`: For GPU environments with CUDA 12.
   - `LLamaWorker.Backend.Vulkan`: For Vulkan environments.

   Select the project file that suits your environment for the next steps.

4. Install dependencies

   ```bash
   dotnet restore LLamaWorker.Backend.Cpu\LLamaWorker.Backend.Cpu.csproj
   ```

   If you are using a CUDA or Vulkan version, replace the project file name accordingly.

5. Modify the configuration file `appsettings.json`. The default configuration includes some common open-source model configurations; you only need to modify the model file path (`ModelPath`) as needed.

6. Start the server

   ```bash
   dotnet run --project LLamaWorker.Backend.Cpu\LLamaWorker.Backend.Cpu.csproj
   ```

   If you are using a CUDA or Vulkan version, replace the project file name accordingly.

## API Reference

LLamaWorker offers the following API endpoints:

- `/v1/chat/completions`: Chat completion requests
- `/v1/completions`: Prompt completion requests
- `/v1/embeddings`: Create embeddings
- `/models/info`: Returns basic information about the model
- `/models/config`: Returns information about configured models
- `/models/{modelId}/switch`: Switch to a specified model

## Gradio UI Demo

This UI is based on [Gradio.NET](https://github.com/feiyun0112/Gradio.Net?wt.mc_id=DT-MVP-5005195). You can try the Gradio UI demo by running the following commands:

```bash
dotnet restore ChatUI\ChatUI.csproj
dotnet run --project ChatUI\ChatUI.csproj
```

Then open the browser and visit the Gradio UI demo.
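The `tools` array shown in the function-call example above is plain JSON and can be generated programmatically. A small Python sketch (standard library only; `make_tool` is a hypothetical helper for illustration, not part of LLamaWorker) that reproduces the weather tool from the example request:

```python
def make_tool(name: str, description: str, required: list, properties: dict) -> dict:
    """Build one entry of an OpenAI-style `tools` array (hypothetical helper)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "required": required,
                "properties": properties,
            },
        },
    }


# Reproduces the GetWeatherPlugin tool from the example request above.
weather_tool = make_tool(
    name="GetWeatherPlugin-GetCurrentTemperature",
    description="Get the current temperature of the specified city.",
    required=["city"],
    properties={"city": {"type": "string", "description": "City Name"}},
)
```

The resulting dictionaries can be placed directly into the `tools` field of a `/v1/chat/completions` request body.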
![](doc/ui.png)
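The model-management endpoints listed under API Reference can be scripted the same way. A sketch using the Python standard library; the base URL is an assumption, as is treating these as simple GET endpoints (check your running server for the exact HTTP verbs):

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumed; match your deployment


def endpoint(base: str, path: str) -> str:
    """Join a base URL and a path without doubling slashes."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str):
    """Fetch an endpoint and decode the JSON body (assumes a GET endpoint)."""
    with urllib.request.urlopen(endpoint(BASE_URL, path)) as resp:
        return json.loads(resp.read())


# With a server running:
#   get_json("/models/info")    # basic info about the current model
#   get_json("/models/config")  # all configured models
```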