The Associated Press is an independent global news organization dedicated to factual reporting. Founded in 1846, AP today remains the most trusted source of fast, accurate, unbiased news in all formats and the essential provider of the technology and services vital to the news business. More than half the world's population sees AP journalism every day.
Why this role matters:
Partnering with Machine Learning Engineers, Data Scientists, and Platform Engineering, the Machine Learning Operations Engineer owns the production lifecycle of machine‑learning systems at AP. This role is responsible for deploying, operating, scaling, monitoring, and governing ML workloads so they run reliably, securely, and cost‑effectively in production.
The Machine Learning Operations Engineer ensures that models and inference pipelines built by ML Engineers can be safely promoted across Dev, QA, and Prod, meet operational SLAs, and evolve without introducing instability or uncontrolled cost.
This is an individual contributor role in production operations, focused on runtime behavior, infrastructure, and reliability. It reports directly to our Director, Application Operations.
- Design, deploy, and operate end‑to‑end production ML pipelines across Dev, QA, and Prod environments.
- Set up and manage AWS SageMaker pipelines, endpoints, and monitoring for large-scale inference workloads, including embedding generation, named entity recognition, reranking, and video processing.
- Own GPU and CPU infrastructure selection, scaling, and optimization, including instance benchmarking, autoscaling behavior, and load testing.
- Deploy, monitor, and operate inference services that support hundreds of thousands of queries per day across text, image, and video pipelines.
- Implement monitoring, alerting, drift detection, and evaluation metrics for production ML systems, tracking latency, error rates, throughput, and model/data drift.
- Manage high-throughput I/O and data movement for large collections of media assets (text, images, video), avoiding CPU, network, and storage bottlenecks.
- Reduce operational risk by enforcing reproducibility, observability, security, and cost controls across all production ML systems.
- Strong experience with AWS SageMaker, including pipelines, endpoints, monitoring, and multi‑environment deployments.
The anticipated salary range for this position is $125,000 - $155,000, based on a candidate’s skills, qualifications and location. The Associated Press offers comprehensive benefits, which include:
AP seeks to build an inclusive organization grounded in respect for differences. We support all aspects of diversity and provide equal employment opportunities to all employees and applicants without regard to race, color, religion, sex, marital status, national origin, age, sexual orientation, gender identity, disability, status as a veteran, or other characteristic protected by law.