Machine Learning Engineer (Application Support Automation)
Overview
A leading financial services organization is seeking a Machine Learning Engineer to drive the transformation of application support through AI and automation.
This role sits within the Application Support organization but is heavily focused on designing and building machine learning solutions, AI frameworks, and intelligent automation tools that reduce manual support effort and improve system reliability.
You will leverage modern GenAI and code-generation platforms such as GitHub Copilot, Google Gemini, OpenAI, and other LLM technologies to develop scalable solutions that enable proactive issue detection, automated remediation, and intelligent support workflows.
Key Responsibilities
- Design, build, and deploy machine learning systems and AI frameworks that automate application support and IT operations workflows
- Develop end-to-end ML solutions including data pipelines, feature engineering, model training, and production deployment
- Build AI-powered tools and internal platforms, such as:
  - Intelligent support copilots
  - Automated diagnostics and troubleshooting engines
  - Self-healing and auto-remediation frameworks
- Create predictive and prescriptive models for:
  - Incident detection and prevention
  - Root cause analysis (RCA)
  - Capacity and performance forecasting
- Apply LLMs, NLP, and GenAI techniques to operational data, including logs, alerts, tickets, and documentation
- Design and implement AI-driven automation frameworks to streamline repetitive support tasks and reduce MTTR
- Use tools such as Copilot, Gemini, OpenAI, and Claude to accelerate development of ML models, scripts, and automation pipelines
- Integrate ML models and AI services into production support environments, monitoring tools, and DevOps pipelines
- Partner with Application Support, SRE, and Engineering teams to embed AI capabilities into day-to-day operations
- Build scalable model monitoring, feedback loops, and continuous improvement pipelines
- Develop dashboards and metrics to track AI effectiveness, system performance, and automation impact
Required Qualifications
- 5–10+ years of experience in Machine Learning Engineering, Data Science, or AI Engineering
- Experience supporting or automating application support / production systems
- Strong programming expertise in:
  - Python (required)
  - SQL and data processing frameworks
- Proven experience building:
  - Machine learning platforms, tools, or reusable AI frameworks
  - End-to-end ML pipelines and production-grade systems
  - Automation solutions leveraging AI/ML
- Hands-on experience with ML frameworks:
  - TensorFlow, PyTorch, scikit-learn, or equivalent
- Experience with LLMs and GenAI ecosystems, including:
  - OpenAI, Gemini, Anthropic, or similar
  - Prompt engineering and AI-assisted development
- Strong experience building:
  - Anomaly detection systems
  - Predictive models and time-series forecasting solutions
- Experience working with operational data (logs, metrics, monitoring systems, incident data)
- Solid understanding of:
  - Application support and production environments
  - Incident management and reliability engineering concepts
- Experience deploying and managing ML models in production environments (MLOps practices)
- Familiarity with cloud platforms (AWS, Azure, or GCP) and scalable architecture design