ABOUT THE ROLE:
As an ideal candidate you have a good understanding of how highly scalable and reliable production infrastructure is built. Most of our backend infrastructure is written in Rust. So familiarity with a compiled language such as C++, Rust, or Go is highly beneficial.
RESPONSIBILITIES:
- Build the xAI API that serves our models to developers worldwide
- Own the end-to-end system responsible for high-throughput inference, handling billions of tokens per minute with low latency and high availability, including model serving infrastructure, request routing, SDK development, rate limiting, observability, and efficient scaling
BASIC QUALIFICATIONS:
- Expert knowledge of either Rust or C++
- Experience in designing, implementing, and maintaining reliable and horizontally scalable distributed systems
- Knowledge of service observability and reliability best practices
- Experience in operating commonly used databases such as PostgreSQL, Clickhouse, and MongoDB
PREFERRED SKILLS AND EXPERIENCE:
- Experience with LLM inference engines and serving frameworks (e.g., SGLang, TensorRT, vLLM)
- Experience designing or building with agent SDKs and agent orchestration frameworks
- Experience with Docker, Kubernetes, and containerized applications
- Expert knowledge of gRPC (unary, response streaming, bi-directional streaming, REST mapping)