We are seeking an AI Engineer to architect and deploy intelligent systems powered by Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and advanced chunking strategies. This role is ideal for engineers passionate about building scalable, production-grade AI applications using current retrieval and generation techniques.
Key Responsibilities:
- Design and implement RAG pipelines using hybrid search (vector and keyword retrieval with semantic ranking) and metadata enrichment
- Apply chunking strategies (fixed-size, recursive, semantic, agentic)
- Fine-tune and evaluate LLMs (e.g., GPT, LLaMA, and models served locally through Ollama) for question answering (QA), summarization, named entity recognition (NER), and sentiment analysis
- Build multi-agent AI systems
- Integrate AI capabilities into enterprise applications using Azure OpenAI, Azure Cognitive Search, Azure Functions, and Power Automate
- Collaborate with stakeholders to translate business needs into scalable AI solutions
- Ensure model reliability, performance, and ethical compliance
Required Skills:
- 5–7 years of experience as a Full Stack Developer with 2–4 years in Generative AI/RAG
- Proficiency in Python
- Experience with Azure AI Studio and Azure Cognitive Services
- Experience with vector databases and similarity-search libraries (e.g., Pinecone, FAISS)
- Strong knowledge of chunking techniques and embedding strategies
- Experience with MLOps tools (MLflow, Docker, CI/CD pipelines)