Efficiency
Share resources between Slurm and Kubernetes to dramatically increase resource efficiency and streamline deployment, making it ideal for running training and inference workloads on the same cluster
Dynamic scalability
SUNK dynamically scales Slurm nodes to match workload requirements, easing the burden of managing complex, large-scale compute tasks
Optimization
Run a single cluster instead of two separate ones for training and inference. Maintain optimal model training throughput while retaining capacity to meet production inference demand, accelerating time to market