Machine Learning Engineer

Insight Global • Full-time • San Jose, CA, US • $200k - $250k / year • 1w ago

Job Description

Insight Global is seeking a team of experienced, driven Machine Learning Engineers to join an established health technology company based in San Jose, CA. This is a full-time, permanent role with a competitive salary, bonus, and comprehensive benefits.

In this role you'll need:

Deep Learning Frameworks: Hands-on experience with PyTorch (main focus) and familiarity with TensorFlow.

Large-Scale Model Training: Exposure to advanced training techniques like Distributed Data Parallel (DDP), Fully Sharded Data Parallel (FSDP), ZeRO, and model parallelism (pipeline/tensor). Experience with distributed training is a strong plus.

Model Optimization: Skilled in improving model performance through techniques like quantization (PTQ, QAT, AWQ, GPTQ), pruning, knowledge distillation, KV-cache tuning, and efficient attention mechanisms like FlashAttention.

Scalable Model Serving: Understanding of how to deploy models at scale, including autoscaling, load balancing, streaming, batching, and caching. Comfortable working alongside platform engineers to build robust serving pipelines.

Data & Storage Systems: Proficient with both SQL and NoSQL databases, vector search tools and databases (e.g., FAISS, Milvus, Pinecone, pgvector), and data formats like Parquet and Delta Lake. Familiar with object storage systems.

Code Quality: Writes efficient, clean, and maintainable code with a focus on performance.

End-to-End ML Lifecycle: Solid grasp of the full machine learning workflow—from data collection and model training to deployment, inference, optimization, and evaluation.

Required Skills & Experience

• 3–5 years in ML/AI engineering roles owning training and/or serving in production at scale.

• Demonstrated success delivering high-throughput, low-latency ML services with reliability and cost improvements.

• Experience collaborating across Research, Platform/Infra, Data, and Product functions.

• Bachelor's degree in Computer Science, Electrical/Computer Engineering, or a related field required; Master's preferred (or equivalent industry experience).

• Strong systems/ML engineering skills, with exposure to distributed training and inference optimization.