Job Title: AI Engineer
Location: Hybrid - New York, New York, United States
Salary: $170,000-$230,000
Skills: Python, LLM orchestration frameworks, Browser Automation, AWS, Full Stack Development
About the Software Company / The Opportunity:
Join a dynamic leader in the software industry, focused on building next-generation platforms that use artificial intelligence to transform knowledge work. This is an opportunity to help architect a groundbreaking AI system that enables intelligent agents to complete complex, multi-step business tasks reliably, at scale, and with continuous improvement built in. As an AI Engineer, you will work alongside visionary founders and technical leaders, directly influencing how AI powers the future of productivity and workflow automation in a highly collaborative, hybrid New York City environment.
Responsibilities:
- Build and own the agent execution layer, including computer-use automation, browser automation, and multi-step task orchestration.
- Design and implement the playbook system for structuring tasks and enabling agents to follow and learn from human corrections.
- Develop the observation-to-playbook pipeline, allowing agents to learn directly from user task demonstrations and generate reusable workflows.
- Construct robust evaluation frameworks to measure agent reliability, cost efficiency, and consistency against a baseline of freestyle agents.
- Integrate with external tools and services across sales, finance, HR, legal, and customer success domains.
- Collaborate with domain experts to validate AI agent outputs, particularly in sensitive settings such as healthcare.
- Instrument effective feedback loops to ensure every human correction drives continuous agent improvement.
Must-Have Skills:
- 3+ years of experience building production AI or agent systems.
- Deep familiarity with LLM orchestration frameworks such as LangChain, LlamaIndex, or similar.
- Strong proficiency in Python and comfort working across the full stack.
- Experience with browser automation tools (e.g., Playwright, Puppeteer) or related RPA technologies.
- Demonstrated ability to architect reliable AI systems and evaluate their reliability, including through regression testing and output scoring.
Nice-to-Have Skills:
- Exposure to multi-agent architectures or retrieval-augmented generation (RAG) pipelines.
- Hands-on experience with workflow automation tools (e.g., n8n, Vellum).
- Background in health tech or regulated, high-stakes domains.
- Experience with cloud platforms such as AWS and data systems such as PostgreSQL, Apache Kafka, or Elasticsearch.
- Familiarity with TypeScript, Ruby on Rails, and modern startup software stacks.