ABOUT THE COMPANY
We're building autonomous research agents for recursive self-improvement (multi-agent systems that propose, run, and analyze machine learning experiments). We're a small team based in San Francisco, working on-site.
ABOUT THE ROLE
You'll build and maintain the ML systems and pipelines that our research runs on top of: data pipelines, training infrastructure, evaluation tooling, deployment, observability. The work bridges research and production, and you'll be the person who makes "we ran an experiment" actually mean "we ran it correctly, at scale, with results we trust."
This is a senior ML engineering role. You'll own systems end-to-end. You'll work with researchers daily and translate research code into infrastructure that the team can rely on. You'll move fast and you'll be measured on whether your systems make the team faster.
WHAT YOU'LL DO
- Build and maintain the training, evaluation, and deployment pipelines that our research runs on
- Take research code from prototype to production: refactor, harden, instrument, test
- Design observability into our ML systems (metrics, logs, traces, eval dashboards) so failures surface fast
- Own data pipelines for training and evaluation: ingest, dedup, version, validate
- Work closely with researchers to understand what they need, what's slow, and what's brittle
- Set engineering standards across our ML stack (testing, reviews, runbooks) so the team scales
- Contribute to architectural decisions that shape how research and production interact
WHAT WE'RE LOOKING FOR
- Senior ML engineer with 6+ years building production-grade ML systems
- Track record across the full lifecycle: data, training, evaluation, deployment, monitoring
- Strong distributed systems experience; you've shipped systems that have to be on
- Fluent in Python and in at least one of PyTorch or JAX; comfortable at the systems level when needed
- Comfortable with experimentation infrastructure (Ray, Slurm, Kubernetes, or similar)
- Bias toward shipping; you prefer working code over working diagrams
- Strong written communication
NICE TO HAVE
- Experience building experimentation platforms or research infrastructure at a frontier ML lab
- Background in distributed training systems
- Open-source contributions to ML infrastructure
- History of working effectively with small senior teams
THIS ROLE IS PROBABLY NOT FOR YOU IF
- You want to do research with engineering as a side activity: this is engineering as the main thing
- Cross-functional work with researchers (translation, scoping, education) doesn't appeal
- Long-term ownership of running systems isn't appealing: this role has it