# triton-inference-server

Here are 103 public repositories matching this topic...

Deep Learning Deployment Framework: supports TensorFlow, PyTorch, TensorRT, TensorRT-LLM, vLLM, and other NN frameworks, along with dynamic batching and streaming modes. Dual-language compatible with Python and C++, it offers scalability, extensibility, and high performance, helping users quickly deploy models and serve them through HTTP/RPC interfaces (a hedged client sketch follows this entry).

  • Updated May 8, 2025
  • C++
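
This framework is not Triton itself, but the HTTP inference flow it describes mirrors Triton's. As a minimal sketch, assuming a Triton-compatible server on localhost:8000, a client call with NVIDIA's tritonclient package might look like this; the model name resnet50 and the tensor names input__0/output__0 are hypothetical placeholders.

```python
# Minimal sketch of an HTTP inference request against a Triton-style server.
# Server address, model name, and tensor names are assumptions, not taken
# from the repository above.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a dummy batch-of-one image tensor (shape is a placeholder).
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
inp = httpclient.InferInput("input__0", list(data.shape), "FP32")
inp.set_data_from_numpy(data)
out = httpclient.InferRequestedOutput("output__0")

# Send the request and read the result back as a NumPy array.
result = client.infer(model_name="resnet50", inputs=[inp], outputs=[out])
print(result.as_numpy("output__0").shape)
```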

Sets up, from scratch on an AGX or a PC, a CI environment for deep learning covering CUDA, cuDNN, TensorRT, onnx2trt, onnxruntime, onnxsim, PyTorch, Triton Inference Server, Bazel, Tesseract, PaddleOCR, NVIDIA Docker, MinIO, and Supervisord (a smoke-test sketch follows this entry).

  • Updated Sep 27, 2023
  • Python
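
Once such an environment is installed, one quick way to smoke-test the onnxruntime and TensorRT pieces together is to open a session with the TensorRT execution provider, as sketched below; the model file model.onnx and the dummy input shape are assumptions for illustration.

```python
# Smoke test: load an ONNX model with TensorRT/CUDA execution providers.
# "model.onnx" and the dummy input shape are hypothetical placeholders.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "model.onnx",
    providers=[
        "TensorrtExecutionProvider",  # onnxruntime falls back to the next provider if unavailable
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)

# Read the model's declared input name instead of hard-coding it.
inp = sess.get_inputs()[0]
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = sess.run(None, {inp.name: dummy})
print([o.shape for o in outputs])
```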

Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a PyTorch -> ONNX -> TensorRT converter and inference pipelines for both raw TensorRT and a multi-format Triton server. Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX (an export sketch follows this entry).

  • Updated Aug 18, 2021
  • Python
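
The PyTorch -> ONNX leg of such a converter typically goes through torch.onnx.export. The sketch below uses a stand-in module rather than the actual CRAFT network, and the tensor names, input size, and dynamic axes are illustrative assumptions.

```python
# Sketch of the PyTorch -> ONNX step in a PyTorch -> ONNX -> TensorRT pipeline.
# The model is a stand-in; a real CRAFT detector would be loaded from its
# checkpoint. Tensor names and dynamic axes below are assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()).eval()
dummy = torch.randn(1, 3, 768, 768)  # placeholder image size

torch.onnx.export(
    model,
    dummy,
    "craft.onnx",
    input_names=["images"],
    output_names=["score_maps"],
    # Text detectors usually need dynamic batch and spatial dimensions.
    dynamic_axes={
        "images": {0: "batch", 2: "height", 3: "width"},
        "score_maps": {0: "batch", 2: "height", 3: "width"},
    },
    opset_version=13,
)
```

The resulting craft.onnx can then be compiled into a TensorRT engine, for example with `trtexec --onnx=craft.onnx --saveEngine=craft.plan`, before being placed in a Triton model repository.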

