Job Title: AI Engineer
Location: New York, NY
Type: Full-time
Our client is looking for a highly skilled AI Engineer to help build next‑generation AI systems and large‑scale machine‑learning infrastructure. If you love working on cutting‑edge models, optimizing systems for performance, and building reliable AI solutions that make a real impact, this role is for you!
What You’ll Do
- Work closely with engineers, researchers, product teams, and program managers to deliver impactful AI-powered products
- Design, build, test, deploy, and support components across the AI stack—including foundation model training, LLM inference, vector search, guardrails, model evaluation, governance, and observability
- Use a modern tech stack spanning high-performance cloud clusters, Hugging Face tools, vector databases, guardrail frameworks, PyTorch, and more
- Innovate and implement advanced techniques to improve model scalability, throughput, latency, and cost efficiency
- Contribute to the long-term architecture and strategic direction of foundational AI systems
- Bring clarity to complex, ambiguous challenges and help drive solutions from idea to production
Who You Are
- A builder at heart—you care about quality, reliability, and creating systems that scale
- Passionate about staying current with emerging AI research and applying new techniques thoughtfully
- Curious and analytical—you ask the right questions, dig deep into problems, and communicate insights clearly
- Technically strong across engineering, math, and AI, with the ability to spot optimization opportunities others might miss
- Someone who thrives in fast-evolving, exploratory environments and isn’t afraid to pioneer new approaches
Basic Qualifications
- Bachelor’s degree in Computer Science, AI, Electrical/Computer Engineering, or a similar field, plus 3 years of experience developing AI/ML technologies; OR Master’s degree in a related field, plus 1 year of experience
- At least 3 years of hands-on programming in Python, Go, Scala, or Java
Preferred Qualifications
- 4+ years building and deploying scalable AI solutions on cloud platforms (AWS, GCP, Azure, or private cloud)
- Experience developing and supporting production-grade AI services
- Background working with LLM inference, vector search, guardrails, memory systems, or similar components in languages like Python, C++, Java, C#, or Go
- Expertise in optimization techniques to improve training/inference performance, hardware utilization, latency, and cost