About the role
We’re hiring an AI Architect to sit at the intersection of frontier AI research, product, and go-to-market. You’ll partner closely with ML teams in high-stakes settings, scope and pitch solutions to top AI labs, and translate research needs (post-training, evals, alignment) into clear product roadmaps and measurable outcomes. You’ll own end-to-end delivery, working with AI research teams and core customers to scope, pilot, and iterate on frontier model improvements while coordinating with engineering, ops, and finance to turn cutting-edge research into deployable, high-impact solutions.
What you’ll do
- Translate research → product: work with customer-side researchers on post-training, evals, and safety/alignment, and help design the data, primitives, and tooling they need to improve frontier models in practice.
- Partner deeply with core customers and frontier labs: work hands-on with leading AI teams and research labs to tackle hard, open-ended technical problems in frontier model improvement, performance, and deployment.
- Shape and propose model improvement work: translate customer and research objectives into clear, technically rigorous proposals, scoping post-training, evaluation, and safety work into well-defined statements of work and execution plans.
- Own the end-to-end lifecycle: lead discovery, write crisp PRDs and technical specs, prioritize trade-offs, run experiments, ship initial solutions, and scale successful pilots into durable, repeatable offerings.
- Lead complex, high-stakes engagements: independently run technical working sessions with senior customer stakeholders; define success metrics; surface risks early; and drive programs to measurable outcomes.
- Partner across Scale: collaborate closely with research (agents, browser/SWE agents), platform, operations, security, and finance to deliver reliable, production-grade results for demanding customers.
- Build evaluation rigor at the frontier: design and stand up robust evaluation frameworks (e.g., RLVR, benchmarks), close the loop with data quality and feedback, and share learnings that elevate technical execution across accounts.
You have
- Deep technical background in applied AI/ML: 5–10+ years in research, engineering, solutions engineering, or technical product roles working on LLMs or multimodal systems, ideally in high-stakes, customer-facing environments.
- Hands-on experience with model improvement workflows: demonstrated experience with post-training techniques, evaluation design, benchmarking, and model quality iteration.
- Ability to work on hard, ambiguous technical problems: proven track record of partnering directly with advanced customers or research teams to scope, reason through, and execute on deep technical challenges involving frontier models.
- Strong technical fluency: you can read papers, interrogate metrics, write or review complex Python/SQL for analysis, and reason about model-data trade-offs.
- Executive presence: credibility with world-class researchers and enterprise leaders, paired with excellent writing and storytelling.
- Bias to action: you ship, learn, and iterate.
How you’ll work
- Customer-obsessed: start from real research needs; prototype quickly; validate with data.
- Cross-functional by default: align research, engineering, ops, and GTM on a single plan; communicate clearly up and down.
- Field-forward: expect regular time with customers and research leads; light travel as needed.
What success looks like
- Clear wins with top labs: pilots that convert to scaled programs with strong eval signals.
- Reusable alignment & eval building blocks that shorten time-to-value across accounts.
- Crisp internal docs (PRDs, experiment readouts, exec updates) that drive decisions quickly.