Predator

Predator is a scalable, high-performance model inference service built as a wrapper around NVIDIA Triton Inference Server. It is designed to serve ML models with low latency on Kubernetes, and it integrates with OnFS and Interflow.
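Since Predator wraps Triton Inference Server, each served model ultimately needs a Triton model configuration. As a rough illustration, a minimal `config.pbtxt` for an ONNX model might look like the sketch below (the model name, tensor names, and shapes are hypothetical placeholders, not anything defined by Predator itself):

```
# Hypothetical Triton model configuration (config.pbtxt).
# "example_model" and the tensor names/dims are illustrative only.
name: "example_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

In this sketch, `max_batch_size` lets Triton dynamically batch concurrent requests, which is typically how low-latency, high-throughput serving is achieved on shared GPUs.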