A well-funded, independent AI research lab is building the next generation of multimodal foundation models—systems that understand and express ideas across text, audio, video, and 3D interactive environments in real time.
The team’s north star is humanistic general intelligence: AI that doesn’t just reason, but perceives, responds, and communicates with emotional depth, expressive nuance, and creative intent. This is a rare opportunity to work on a frontier problem space with deep technical freedom and long-term research horizons.
What You’ll Do
You will lead research and development on state-of-the-art generative video models, shaping the core architectures and algorithms that power the lab’s next major breakthroughs. Responsibilities include:
- Driving research on generative foundation models for video, from ideation through prototyping to production
- Designing and evaluating large-scale generative algorithms and training schemes for high-fidelity video synthesis
- Developing efficient image/video representations, latent spaces, and training objectives
- Identifying critical research directions and contributing to the lab’s long-term video-generation roadmap
- Designing and curating large, high-quality multimodal datasets in collaboration with research, product, and data teams
What You Bring
- PhD in CS, EE, Mathematics, or a related field, or equivalent applied research experience
- 3+ years working in one or more of:
  - Text-to-image / text-to-video generation
  - Video diffusion or autoregressive video modeling
  - Image/video representation learning at scale
  - Large-language-model pre-training / fine-tuning
- Deep expertise in modern generative modeling: diffusion models, flow matching, autoregressive transformers, Mamba-style architectures, VAEs, and GANs
- Strong programming foundations (Python) and practical ML engineering skills
- Ability to operate independently, explore novel ideas aggressively, and communicate research clearly
Preferred Experience
- Publications at CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, SIGGRAPH, or similar venues
- Hands-on experience with large-scale distributed training (FSDP, ZeRO, data/model parallelism)
- Deep understanding of diffusion variants (DDIM, flow matching, rectified flows, etc.)
- Strong software engineering fundamentals and an interest in building real systems, not just prototypes
- Comfort working in a fast, iterative, research-driven startup environment
Total Comp: 500,000-1,000,000