AI Researcher

Traversal • Full-time • New York, NY, United States

About Traversal

Traversal is the AI Site Reliability Engineer (SRE) for the enterprise—already trusted by some of the largest companies in the world to troubleshoot, remediate, and even prevent the most complex production incidents. Our mission is to free engineers from endless firefighting and enable them to focus on creative, high-impact work.

Our roots remain deeply embedded in AI research, and we’re channeling that scientific rigor and creativity into building the premier AI agent lab for the enterprise. What we’re proudest of is the team we’ve assembled: a group as talented as it is kind, ranging from researchers from MIT, Harvard, and Berkeley to world-class engineers from industry (Citadel Securities, Cockroach Labs, Datadog, DE Shaw, ServiceNow, Glean, Perplexity, Pinecone, and more), all taking on one of the hardest problems for AI to solve. Without the entire team, none of this would be possible.

The Role

As an AI Researcher at Traversal, you'll work on improving the accuracy and speed of our agents — systems that autonomously diagnose and resolve production incidents for some of the world’s largest enterprises.

This is a hands-on, production-oriented research role. You will design experiments, run them on real data, and ship improvements. Not flag them — ship them.

You'll work end-to-end: identify a failure mode in agent reasoning, design an intervention, evaluate it against real customer traces, and get it into production. The tooling and infrastructure are in place. The research problem is hard. The feedback loop is fast. This is not a publish-papers role. It is a make-the-agent-work role: you build and ship cutting-edge AI.

Responsibilities

LLM & Agent Research: Prototype and evaluate prompting strategies, reasoning workflows, and tool-use policies for agents operating on large-scale observability data and complex troubleshooting workflows. Ship improvements to production.
Evaluation Design: Build and maintain eval harnesses that measure real accuracy improvements on actual customer incident types — not just benchmark scores. Own the loop from hypothesis to production measurement.
Cross-Team Collaboration: Work closely with AI engineers, infrastructure teams, and product leads to bring research into production and close the loop between experimentation and impact.
Stay on the Frontier: Track developments in LLMs, agent architectures, and AI alignment, translating insights into actionable improvements for Traversal’s domain.
Training & Alignment: Apply fine-tuning, reinforcement learning, and reward modeling techniques to align AI behavior with real-world SRE workflows.
Synthetic Data & Experimentation: Design pipelines to generate synthetic incidents and observability signals, enabling scalable training and testing in data-scarce environments.

Requirements

PhD in Computer Science, Electrical Engineering, Statistics, or a related technical field; demonstrated depth in LLMs, agents, or applied machine learning
Deep applied AI expertise, including strong working knowledge of LLMs, transformers, reinforcement learning, or neural networks in agentic systems
Strong judgment in model evaluation and experimental iteration to improve product accuracy and behavior
Strong software engineering depth, with the ability to work effectively in a complex production codebase and ship production-quality code
Some experience shipping AI or ML systems to production
Ability to run rigorous experiments, interpret results, and quickly translate learnings into product improvements
Startup or early-team experience, with comfort operating in ambiguous environments and building without mature infrastructure

Nice to Have

Experience in SRE, observability, or backend systems, especially when paired with strong AI/ML depth
Experience with RLHF, synthetic data pipelines, or LLM evaluation tooling
Contributions to open-source agent frameworks such as LangGraph, DSPy, or similar
Research experience in LLMs, agents, or reinforcement learning; publications at top-tier venues such as NeurIPS, ICML, or ICLR are a plus

Compensation

We offer competitive compensation, startup equity, health insurance, and additional benefits. The U.S. base salary range for this full-time, in-person role in New York is $160,000–$300,000. Our salary ranges are based on location, level, and role. Individual compensation is determined by experience, skills, and job-related knowledge.

Why You Should Join Us

We’ll make sure you’re fully supported with health insurance, a great tech setup, flexible time off, and plenty of in-office snacks. We offer competitive salary and equity packages, and we are thoughtful about every hire on our small, high-impact team.

Traversal is fully in-office, 5 days a week, based in New York near Madison Square Park. We have a collaborative, hard-working culture and are energized by building the future of AI-powered software maintenance.

Working here means owning meaningful parts of the product, having the flexibility to move fast, and learning constantly. This is a place to grow your career, make a real impact, and help define a new category of infrastructure software.