Solo Server


Python 3.9+ · MIT License · Available on PyPI

Solo Server is a lightweight and performant server for Physical AI inference.

# Install the solo-server package using pip
pip install solo-server

# Run the solo server setup in simple mode
solo setup
(Demo video: soloreccomp.mp4)
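
Once setup completes, you can verify the detected hardware and saved configuration at any time:

# Check running models, system status, and configuration
solo status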

Features

  • Seamless Setup: Manage your on-device AI with a simple CLI and HTTP servers
  • Open Model Registry: Pull models from registries like Ollama & Hugging Face
  • Cross-Platform Compatibility: Deploy AI models effortlessly on your hardware
  • Configurable Framework: Auto-detects hardware (CPU, GPU, RAM) and sets optimal configs

Table of Contents

  • Installation
  • Commands
  • REST API
  • Contributions
  • Project Inspiration

Installation

🔹 Prerequisites

  • Python 3.9+ and pip

🔹 Install Solo Server

# Install Solo-Server
pip install solo-server

Run the interactive setup to configure Solo Server:

# Setup Solo-Server
solo setup

🔹 Setup Features

โœ”๏ธ Detects CPU, GPU, RAM for hardware-optimized execution
โœ”๏ธ Auto-configures solo.conf with optimal settings
โœ”๏ธ Recommends the compute backend OCI (CUDA, HIP, SYCL, Vulkan, CPU, Metal)


╭────────────────── System Information ──────────────────╮
│ Operating System: Windows                              │
│ CPU: AMD64 Family 23 Model 96 Stepping 1, AuthenticAMD │
│ CPU Cores: 8                                           │
│ Memory: 15.42GB                                        │
│ GPU: NVIDIA                                            │
│ GPU Model: NVIDIA GeForce GTX 1660 Ti                  │
│ GPU Memory: 6144.0MB                                   │
│ Compute Backend: CUDA                                  │
╰─────────────────────────────────────────────────────────╯

🖥️  Detected GPU: NVIDIA GeForce GTX 1660 Ti (NVIDIA)
✅ NVIDIA GPU drivers and toolkit are correctly installed.
Would you like to use GPU for inference? [y/n] (y): y

๐Ÿข Choose the domain that best describes your field:
  1. Personal
  2. Education
  3. Agriculture
  4. Software
  5. Healthcare
  6. Forensics
  7. Robotics
  8. Enterprise
  9. Custom
Enter the number of your domain (1):

Commands

╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ setup      Sets up Solo Server environment with interactive prompts and saves configuration to config.json.        │
│ serve      Start a model server with the specified model.                                                          │
│ status     Check running models, system status, and configuration.                                                 │
│ list       List all downloaded models available in HuggingFace cache and Ollama.                                   │
│ test       Test if the Solo server is running correctly. Performs an inference test to verify server functionality.│
│ stop       Stops Solo Server services. If a server type is specified (e.g., 'ollama', 'vllm', 'llama.cpp'), only   │
│            that specific service will be stopped. Otherwise, all Solo services will be stopped.                    │
│ download   Downloads a Hugging Face model using the huggingface repo id.                                           │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
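
For example, a typical session chains these commands together (the model name is illustrative, and this assumes download takes a Hugging Face repo ID as its argument):

# Pull a model, serve it, verify it responds, then shut everything down
solo download meta-llama/Llama-3.2-1B-Instruct
solo serve -s llama.cpp -m meta-llama/Llama-3.2-1B-Instruct
solo test
solo stop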

Serve a Model

solo serve -s ollama -m llama3.2
╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --model   -m      TEXT     Model name or path. Can be: - HuggingFace repo ID (e.g.,                                 │
│                            'meta-llama/Llama-3.2-1B-Instruct') - Ollama model Registry (e.g., 'llama3.2') - Local   │
│                            path to a model file (e.g., '/path/to/model.gguf') If not specified, the default model   │
│                            from configuration will be used. [default: None]                                         │
│ --server  -s      TEXT     Server type (ollama, vllm, llama.cpp) [default: None]                                    │
│ --port    -p      INTEGER  Port to run the server on [default: None]                                                │
│ --ui                       Start the UI for the server [default: True]                                              │
│ --help                     Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
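
A few example invocations built from the options above (the model names and file paths are illustrative):

# Serve a Hugging Face model with vLLM
solo serve -s vllm -m meta-llama/Llama-3.2-1B-Instruct

# Serve a local GGUF file with llama.cpp on a custom port
solo serve -s llama.cpp -m /path/to/model.gguf -p 8080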

List Available Models

View all downloaded models in your HuggingFace cache and Ollama:

solo list

Stop Solo Server

solo stop 
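
Per the command description above, you can also stop a single backend by naming its server type (the exact argument form shown here is an assumption):

# Stop only the Ollama service, leaving any others running
solo stop ollama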

REST API

Solo Server exposes REST API endpoints for every server type (Ollama, vLLM, llama.cpp), but the exact endpoint paths and request format differ slightly depending on which server type you're running.

API Endpoints by Server Type

Ollama API

# Generate a response
curl http://localhost:5070/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

# Chat with a model
curl http://localhost:5070/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
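
To pull out just the generated text, you can pipe the response through jq (this assumes jq is installed; "stream": false makes the reply arrive as a single JSON object):

# Extract only the assistant's reply from a chat response
curl -s http://localhost:5070/api/chat -d '{
  "model": "llama3.2",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }],
  "stream": false
}' | jq -r '.message.content'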

vLLM and llama.cpp API

Both use OpenAI-compatible endpoints:

# Chat completion
curl http://localhost:5070/v1/chat/completions -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ],
  "max_tokens": 50,
  "temperature": 0.7
}'

# Text completion
curl http://localhost:5070/v1/completions -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "max_tokens": 50,
  "temperature": 0.7
}'
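
Because these endpoints follow the OpenAI schema, the reply is nested under choices[0]; for example, with jq:

# Extract the assistant's reply from a chat completion (assumes jq is installed)
curl -s http://localhost:5070/v1/chat/completions -d '{
  "model": "llama3.2",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }]
}' | jq -r '.choices[0].message.content'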

📝 Contributions

Refer to example_apps for sample applications.

  1. ai-chat

🔹 To Contribute, Set Up in Dev Mode

# Clone the repository
git clone https://github.com/GetSoloTech/solo-server.git

# Navigate to the directory
cd solo-server

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # On Unix/macOS
# OR
.venv\Scripts\activate     # On Windows

# Install in editable mode
pip install -e .
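
After the editable install, sanity-check that the CLI is on your PATH before making changes:

# List the available commands and confirm the install worked
solo --help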

📝 Project Inspiration

This project wouldn't be possible without the help of other projects like:

  • uv
  • llama.cpp
  • ramalama
  • ollama
  • whisper.cpp
  • vllm
  • podman
  • huggingface
  • aiaio
  • llamafile
  • cog

If you like using Solo, consider leaving us a ⭐ on GitHub!
