Research Robotics/Computer Vision Engineer

Skild AI • Full-time • San Mateo, CA, US • 4h ago

Company Overview

At Skild AI, we are building the world's first general-purpose robotic intelligence that is robust and adapts to unseen scenarios without failing. We believe massive scale through data-driven machine learning is the key to unlocking these capabilities for the widespread deployment of robots within society. Our team consists of individuals with varying levels of experience and backgrounds, from new graduates to domain experts. Relevant industry experience is important, but ultimately less so than your demonstrated abilities and attitude. We are looking for passionate individuals who are eager to explore uncharted waters and contribute to our innovative projects.

Position Overview

Skild AI, Inc. seeks a Research Robotics/Computer Vision Engineer in San Mateo, CA to develop perceptive, intelligent, and adaptable robotic systems capable of learning and performing tasks, with a focus on 3D computer vision and autonomous navigation. The work includes designing perception pipelines, optimizing SLAM systems, and creating learning-based algorithms for robust robotic control in real-world environments.

Responsibilities

(i) implementing perception on robots to enable safe exploration and navigation in real-world environments, in collaboration with the locomotion team;
(ii) reconstructing an entire scene in 3D from monocular images, estimating camera poses, and optimizing and streamlining 3D SLAM;
(iii) developing a set of software tools for localizing a robot using only visual inputs;
(iv) building robust software to enable lifelong mapping on a robot via optimally merged pose graphs;
(v) visual servoing with respect to detected or tracked objects to control robot motion;
(vi) researching novel techniques to detect and compensate for glare during robotic mapping and navigation;
(vii) building infrastructure and pipelines and collecting data to enable streaming of hand movements for training robot manipulation tasks such as pick and place; and
(viii) maintaining a camera and 2D lidar-based navigation stack, including fixing bugs, implementing customer feature requests, and ensuring successful deployments.

Minimum Requirements

Must have a master's degree (or foreign equivalent) in Computer Vision, Robotics, or a directly related discipline and one (1) year of experience in Machine Learning or Data Science.
Must have any experience with or knowledge of each of the following: (i) reconstructing 3D scenes using monocular videos, meshes, point clouds, Neural Radiance Fields, and Gaussian Splats; (ii) reconstructing rigid and articulated hand-held objects from videos, including inferring the time-varying hand configurations and relative poses of the objects; (iii) using generative computer vision, including diffusion models to guide reconstruction, or addressing occlusion and limited viewpoint variations in videos via data-driven priors; (iv) optimizing attention-based models for perception used in autonomous navigation systems; (v) using Neural Architecture Search (NAS) to find better perception backbone architectures with higher accuracy and lower latency; and (vi) cloud-based training in AWS, Google Cloud, or Vertex AI and optimized data loading for cloud-based distributed training of deep learning workloads (e.g., PyTorch DataLoader or sharding) with hardware-in-the-loop.
Experience can be concurrent.

Apply online at skild.ai/career.

Base Salary Range

$250,000–$300,000 USD