The Mission
At GameBrain, we believe data is our only true moat. We need an architect to build the "Refinery" that turns thousands of raw, uncalibrated football films into a gold-standard, 3D-ready training engine. You will be responsible for the entire lifecycle of a video, from the moment it’s uploaded to a cloud-native ingestion point to the moment it’s a versioned, auto-labeled tensor ready for training.
The Role
We are looking for a Senior Engineer who believes that Data > Algorithms. While our Lead CV Engineer focuses on the core inference models, you will own the Automated Data Factory. Your goal is to eliminate the need for massive offshore labeling teams by building a programmatic labeling system that uses AI to label AI data. You will design, build, and maintain the systems that turn raw All-22 film into "Golden" ground truth at scale.
The Responsibilities
- Build the Flywheel: Design and implement a multi-modal autolabeling pipeline (2D Boxes, Temporal Tracking, Homography) using Active Learning and Weak Supervision.
- Teacher vs. Student Architecture: Make high-level decisions on when to use off-the-shelf Foundation Models (e.g., SAM, DINOv2, GPT-4o) for pre-labeling versus training custom "Teacher" models from scratch for football-specific nuances.
- Human-in-the-Loop (HITL) Tooling: Use your web development experience to build high-leverage internal dashboards. These tools will allow our domain experts to verify 10,000 frames of homography or player tracks in minutes.
- System Ownership: You own the full data pipeline, from ingestion and programmatic denoising to producing the final training-ready artifacts.
- Strategic Evaluation: Architect the "Golden" eval sets. You are the final gatekeeper of data quality, ensuring our models are tested against both representative footage and the sport's most difficult edge cases.
Requirements
- Senior-Level Implementation: You are a builder. You don't just research; you write the production code that manages data flows and model training loops.
- Computer Vision Depth: Strong experience with PyTorch/TensorFlow and a deep understanding of 2D object detection, multi-object tracking (MOT), and geometric vision (homography/camera calibration).
- The Compounding Data Mindset: Experience with programmatic labeling, labeling functions, or weak supervision. You understand how to combine noisy heuristics into a high-confidence label.
- Web/Full-Stack Lite: Proficient enough in React/Next.js and FastAPI/Flask to build functional, performant internal tools that don't look like an afterthought.
- Battle Scars: You’ve seen what happens when "garbage in" leads to "garbage out." You’ve managed large-scale datasets and have solved the "long tail" of data errors that manual labeling teams always miss.
Bonus Points
- Experience at Snorkel AI, Scale AI, Labelbox, or an in-house Data Engine team.
- A passion for American football and the spatial complexities of the game.