About Us:
At LangChain, our mission is to make intelligent agents ubiquitous. We build the foundation for agent engineering in the real world, helping developers move from prototypes to production-ready AI agents that teams can rely on. We started with widely adopted open-source tools and have grown to also offer a platform for building, evaluating, deploying, and operating agents at scale.
Today, LangChain, LangGraph, LangSmith, and Agent Builder are used by teams shipping real AI products across startups and large enterprises. Millions of developers trust LangChain to power AI teams at companies like Replit, Clay, Coinbase, Workday, Lyft, Cloudflare, Harvey, Rippling, Vanta, and 35% of the Fortune 500.
With $125M raised at Series B from IVP, Sequoia, Benchmark, CapitalG, and Sapphire Ventures, we’re at a stage where we continue to develop new products, growth is accelerating, and every team member has a meaningful impact on what we build and how we work together. LangChain is a place where your contributions can shape how this technology shows up in the real world.
About the role:
Join our platform engineering team as we scale LangSmith Platform products. You'll architect and operate the critical systems that power our customers' AI observability, working directly with cutting-edge technologies at the intersection of AI and distributed systems.
Build and scale critical systems: Design and implement high-throughput, data-intensive ingestion systems supporting our flagship SaaS products (LangSmith and LangGraph Platform)
Drive reliability: Build monitoring, alerting, and automated recovery systems that maintain high uptime
Build SDKs: Design and build developer-friendly SDKs for the LangSmith platform in Python, TypeScript, Go, and Java
Solve complex problems: Debug performance bottlenecks, optimize database queries, and architect solutions for distributed system challenges
Shape platform strategy: Influence technical decisions around infrastructure, tooling, and operational practices as we grow from startup to enterprise scale
Respond to incidents: Participate in the on-call rotation with a focus on post-incident learning, automation, and prevention
How to be successful in this role
Experience: 5+ years building and operating production systems at scale
Programming proficiency: Strong hands-on software engineering skills (Python, Go, Rust)
Database expertise: Production experience with OSS datastores (PostgreSQL, Redis)
Infrastructure expertise: Deep knowledge of cloud object storage, Kubernetes, containerized infrastructure, and cloud platforms (e.g., GCP)
Observability mastery: Hands-on experience with observability stacks (Datadog, Prometheus/Grafana, OpenTelemetry, or similar)
Operational mindset: "You build it, you run it, you own it" philosophy with a focus on sustainable practices
Compensation & Benefits
We offer competitive compensation that includes base salary, meaningful equity, and benefits such as health and dental coverage, flexible vacation, a 401(k) plan, and life insurance. Actual compensation will vary based on role, level, and location. For team members in the EU and UK, we provide locally competitive benefits aligned with regional norms and regulations.
Annual salary range: $175,000-$225,000 USD for Senior Engineers