Position: Data Scientist
Location: Pasadena, CA, hybrid.
Company Overview: We are a deep-tech startup founded by experts from Google, Caltech, Stanford, and UMich, specializing in fiber optic sensing and big data analytics. Our mission is to translate vast, real-time signals into actionable insights at scale, driving innovation in data-driven industries worldwide.
Role Overview: We are seeking a Data Scientist/Engineer to support our growing data analytics and engineering efforts. In this role, you will collaborate with cross-functional teams to develop dashboards, optimize big data pipelines, build statistical models, and derive meaningful insights from vast datasets. We're looking for someone who is resourceful, proactive, and outcome-oriented, and who takes full ownership of their work.
Key Responsibilities:
- Build and maintain interactive dashboards and apps using Panel/HoloViz and Superset.
- Implement and manage data workflows and automation pipelines using GCP, Cloud Run, and Terraform.
- Leverage mathematical and statistical methods to analyze large datasets in Snowflake, Spark, and other big data environments.
- Contribute to process automation and documentation for data-related tasks.
- Collaborate with senior engineers and data scientists to refine and deploy analytics solutions.
Required Qualifications:
- Minimum 2 years of experience in a commercial setting (startup experience is a plus).
- 1+ year of experience with Panel/HoloViz object-oriented patterns in geospatial applications.
- Working knowledge of Cloud Run/ECS and GCP/AWS, plus strong knowledge of Terraform.
- Background in mathematics, science, or a related quantitative field, applied to machine learning or deep learning applications.
- Hands-on experience creating Superset/Tableau/Looker dashboards.
- Proficiency in big data analysis (Snowflake, Spark, etc.), with more than 1 year of experience.
- Proficiency in geospatial data analysis and processing (GDAL, GeoPandas, projections, etc.).
- Experience processing and interpreting geophysical data, such as well logs, gravity surveys, or magnetic surveys.
- Proficiency with CI/CD pipelines, Git, and PR review processes.
- Demonstrable experience collaborating with cross-functional teams to deliver data science and engineering solutions.
- Authorized to work in the U.S.
Additional Qualifications (Preferred):
- Familiarity with machine learning frameworks such as TensorFlow, PyTorch, or Scikit-learn.
- Prior experience working with real-time data pipelines or streaming platforms like Kafka.
- Ready to work in person in Pasadena.
- Experience with Bayesian inference.
What We Offer:
- Competitive compensation and benefits.
- Opportunity to work on cutting-edge technology with real-world impact.
- A dynamic startup environment fostering growth and learning.
- Hybrid work flexibility in a vibrant research setting on the Caltech campus.
- Opportunity to travel and interact with end users.
If you’re excited about advancing fiber optic sensing and big data analytics, we want to hear from you. Apply now and help shape the future of real-time data science!