Senior MLOps Engineer

2 months ago | Senior | Anywhere | Remote | Full Time | Ai Jobs by FortyTwo

About the Role

You will deploy and maintain production ML infrastructure, optimize GPU utilization, and serve large and small language models. You will build CI/CD pipelines, create Helm templates for Kubernetes deployments, implement model optimization and serving workflows, and set up monitoring, logging, and automated workflows to ensure reliable model delivery.

Requirements

Bachelor's or Master's degree in Computer Science, Engineering, or a related field
Proficiency in Kubernetes, Helm, and containerization technologies
Experience with GPU optimization including MIG and NOS
Experience with cloud platforms such as AWS, GCP, and Azure
Knowledge of monitoring tools such as Grafana and Prometheus
Proficiency in scripting with Python and Bash
Hands-on experience with CI/CD tools and workflow management systems
Familiarity with Triton Inference Server, ONNX, and TensorRT

Responsibilities

Deploy scalable production-ready ML services with optimized infrastructure
Manage and autoscale Kubernetes clusters
Optimize GPU resources using MIG and NOS
Manage cloud storage to ensure high availability and performance
Integrate LoRA and model merging workflows
Adapt and deploy state-of-the-art ML codebases
Deploy and manage LLMs, SLMs, and LMMs
Serve models using Triton Inference Server and other serving frameworks
Leverage vLLM and TGI for model serving
Optimize models with ONNX and TensorRT
Develop Retrieval-Augmented Generation systems
Set up monitoring and logging with Grafana, Prometheus, Loki, Elasticsearch, and OpenSearch
Write and maintain CI/CD pipelines using GitHub Actions
Create Helm templates for rapid Kubernetes node deployment
Automate workflows using cron jobs and Airflow DAGs
