About LangChain
At LangChain, our mission is to make intelligent agents ubiquitous. We build the foundation for agent engineering in the real world, helping developers move from prototypes to production-ready AI agents that teams can rely on. We began as widely adopted open-source tools and have grown to also offer a platform for building, evaluating, deploying, and operating agents at scale.
Today, LangChain, LangGraph, LangSmith, and Agent Builder are used by teams shipping real AI products across startups and large enterprises. Millions of developers trust LangChain to power AI teams at companies like Replit, Clay, Coinbase, Workday, Lyft, Cloudflare, Harvey, Rippling, Vanta, and 35% of the Fortune 500.
With $125M raised at Series B from IVP, Sequoia, Benchmark, CapitalG, and Sapphire Ventures, we’re at a stage where we continue to develop new products, growth is accelerating, and every team member has meaningful impact on what we build and how we work together. LangChain is a place where your contributions can shape how this technology shows up in the real world.
About the role
In person 5 days/week in San Francisco, CA or New York, NY
We’re hiring a Software Engineer to join the Infra team and own developer productivity across our LangGraph Cloud/Platform and LangSmith products. You’ll work closely with Infra, Backend, and Frontend to ship with confidence across Kubernetes-based services, APIs, and UI flows, and you’ll help pioneer quality practices specific to LLM applications (e.g., prompt regressions and evaluation suites).
Own test strategy end-to-end across APIs, services, UI, data, and infra (K8s/Terraform/Helm).
Stand up ephemeral test environments in Kubernetes for PRs and release candidates; seed test data and run hermetic suites.
Shift-left quality in CI/CD (GitHub Actions): parallelization, caching, deterministic seeds, flake tracking, and quality gates.
Observability for tests: rich failure artifacts (videos, logs, traces), Datadog dashboards, and actionable alerts.
Performance & reliability: baseline SLIs/SLOs for critical paths; capacity tests and regression detection.
Partner on incident workflows: reproduce issues, add focused regression tests, and improve runbooks/postmortems.
Documentation: high-signal test plans, playbooks, and contributor guidelines for writing good tests.
Example projects you might own
A PR-ephemeral E2E harness that deploys a minimal LangSmith stack on Docker in CI and runs Playwright + API suites against seeded tenants.
A k6 scenario that simulates multi-tenant traffic with queue/backpressure, surfacing p95/p99 latency regressions per release.
A flake-budget system that auto-quarantines flaky tests, opens issues with artifacts, and tracks “time-to-deflake”.
How to be successful in this role
3+ years as Infra Engineer/Software Engineer focused on
Strong hands-on experience with Python (pytest).
Familiarity with CI/CD (GitHub Actions preferred) and making pipelines fast, parallel, and reliable.
Solid understanding of API testing, mocking/stubbing, and data setup/teardown.
Comfortable defining quality bars, authoring test plans, and driving cross-team execution.
Bonus
Load/perf testing (k6), observability (Datadog, OpenTelemetry), and property-based testing (Hypothesis).
Experience testing services running on Kubernetes and containers; comfortable with logs, events, and basic kubectl.
Infra awareness: Helm/Terraform basics, Kubernetes networking, and secrets management.
SQL fluency for data validation (Postgres/ClickHouse/BigQuery).
Go/Node/React familiarity for targeted white-box tests and testability improvements.
Compensation & Benefits
We offer competitive compensation that includes base salary, meaningful equity, and benefits such as health and dental coverage, flexible vacation, a 401(k) plan, and life insurance. Actual compensation will vary based on role, level, and location. For team members in the EU and UK, we provide locally competitive benefits aligned with regional norms and regulations.
Annual salary range: $160,000 - $225,000 USD