Forward Deployed Infrastructure Engineer

About the Role

You will serve as the technical point of contact during customer trials, running standardized and custom benchmarks to validate performance. You will design, run, and analyze performance tests across customer workloads, diagnose GPU and NCCL issues, optimize container and cluster configurations, and produce clear reports and handoffs. You will maintain benchmarking scripts, containers, and environments and iterate on configurations to close performance gaps.

Requirements

  • Experience running infrastructure performance tests or ML model benchmarks (training or inference)
  • Strong knowledge of GPU cloud infrastructure and workload bottlenecks
  • Clear and fast written communication
  • Ability to manage multiple trials and projects concurrently
  • Familiarity with AWS, Lambda, CoreWeave, Runpod, and similar GPU cloud providers
  • Prior customer-facing experience in startup or devtools settings (preferred)
  • Background as an ML engineer, solutions architect, or technical account manager (preferred)

Responsibilities

  • Serve as the technical point of contact during customer trials
  • Design and run performance and benchmark tests across customer workloads
  • Diagnose performance issues and recommend fixes
  • Package results into clear reports and handoffs
  • Maintain benchmarking scripts, containers, and environments
  • Identify performance gaps and optimize cluster configurations