What We're Building at Simplex
At Simplex, we're building a science of intelligence. Our aim is to develop and apply a rigorous theory of the latent internal structure of neural networks and of how that structure relates to computation and behavior.
We believe that when dealing with intelligence, understanding is safety. Without genuine understanding, we can't reliably monitor, control, or even reason clearly about what these systems are doing. But these same systems also present us with a new opportunity. For the first time, we have AI systems complex enough to serve as testbeds for theories of intelligence, including biological intelligence. We aim to build a theory applicable to both artificial and biological intelligence.
We have the beginnings of such a theory, grounded in the physics of information and experimentally verified in transformers. Now, we are scaling our team. Near-term goals include building unsupervised methods that recover belief geometries in real LLMs, extending the theory to more complex cognitive tasks, and pushing toward tools reliable enough to matter for safety.
Who We're Looking For
We're looking for people who can do rigorous mathematics and get their hands dirty with real models and data: people who move naturally between theory and experiment and feel deeply driven to understand intelligence.
You learn across fields. Our work draws on many fields: dynamical systems, probability, deep learning, physics, information theory, and neuroscience. You don't need to know all of it coming in, but you're the kind of person who picks things up and follows your curiosity—and surprising experimental results—wherever they lead.
You have taste. You know the difference between a problem that matters and a problem that's merely publishable. You have opinions about research directions, not just techniques. You care about the craft of how experiments are designed, analyzed, and presented.
You're self-directed. We're a small team. You'll have real ownership over your work, which means figuring out what to do, not just how to do it. You'll work closely with the research team, but we expect you to develop and pursue your own ideas within the broader research program.
You communicate. You can explain your ideas clearly to collaborators, in writing, and on a whiteboard. Science is a team activity for us, and that requires being able to think together.
You build. You're at home in front of a whiteboard and in a terminal. We are building new theory, new code, and new experiments. You think big, but you're serious about it, and you actually try to make things happen rather than just ideating. You use AI tools, you tinker, you're excited about what's becoming possible.
You have depth in at least one quantitative field, such as physics, mathematics, neuroscience, or machine learning. A PhD is typical but not required if you've found another way to go deep.
Current Projects
Belief discovery at scale
Finding belief-state geometry in large language models without supervision. Can we automatically identify the internal structures that encode what a model knows about the world?
Building a theory of intelligence
We have the beginnings of a theory, but it needs to be extended and refined in a number of ways: to capture internal world models of different types, for example, and to apply to other neural systems such as RL agents and biological brains.
Generalization
Why and how do neural networks generalize? Our framework suggests ways in which internal structures support out-of-distribution behavior.
Red Teaming
We have an entire team dedicated to stress-testing our own framework: finding the boundaries, the edge cases, the places where the theory breaks down, all in service of figuring out what's actually true.
Biological Intelligence
The same mathematics that reveals structure in transformers might apply to biological neural networks. We plan to test this on real brain data because, ultimately, we're interested in intelligence wherever it appears.
Learn More About Our Work
Our foundational result (manuscript, blog post) showed that transformers trained on next-token prediction spontaneously organize their activations into geometries predicted by Bayesian belief updating over the hidden states of a world model. Even when models are trained on simple token sequences from hidden Markov models, complex fractals emerge in the residual stream, structures far removed from the surface statistics of the training data. We think of this work as providing the first steps toward an understanding of what, fundamentally, we are training AI systems to do, and what representations we are implicitly training them to have.
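To make "Bayesian belief updating over hidden states" concrete, here is a minimal sketch of the observer-side computation in plain NumPy. The 3-state hidden Markov model, the sequence depth, and every name in it are illustrative choices of ours, not the processes or code from the paper; the point is only that each token history maps to a point in the simplex of beliefs over hidden states, and the set of reachable points is the geometry the paper reports finding linearly embedded in the residual stream.

```python
# Minimal sketch (not the paper's code): Bayesian filtering over the hidden
# states of a small, randomly chosen HMM. Each token sequence maps to a belief
# state, i.e. a probability distribution over hidden states, and the set of
# reachable belief states is the "belief-state geometry" referred to above.
import itertools
import numpy as np

n_states, n_tokens, depth = 3, 3, 8
rng = np.random.default_rng(0)

# Labeled transition matrices: T[x, i, j] = P(emit token x, move to state j | state i).
T = rng.random((n_tokens, n_states, n_states))
T /= T.sum(axis=(0, 2), keepdims=True)   # for each current state, probabilities sum to 1

# Start from the stationary distribution of the underlying state chain.
M = T.sum(axis=0)                        # state-to-state transition probabilities
pi = np.ones(n_states) / n_states
for _ in range(500):
    pi = pi @ M

def update(belief, token):
    """One Bayesian update: belief over hidden states after observing `token`."""
    unnorm = belief @ T[token]           # joint P(token, next state | history)
    return unnorm / unnorm.sum()         # condition on the token actually seen

# Enumerate all token sequences up to a fixed depth and record the belief reached.
points = []
for seq in itertools.product(range(n_tokens), repeat=depth):
    b = pi
    for x in seq:
        b = update(b, x)
    points.append(b)

points = np.asarray(points)              # one point in the 2-simplex per sequence
print(points.shape)
```

For suitable HMMs, these reachable beliefs form the fractal patterns mentioned above; the experimental finding is that the trained model's activations arrange themselves in the same geometry.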
Since then, we've pushed in several directions. In Constrained Belief Updating Explains Transformer Representations, we asked how attention implements belief updating when Bayesian inference is fundamentally recurrent. We found that attention parallelizes the recurrence by decomposing belief updates spectrally across heads, and we made and verified predictions about embeddings, OV vectors, attention patterns, and residual stream geometry at different layers.
We've also developed a theory of in-context learning grounded in training data structure. When training data mixes multiple sources, models must infer not just what hidden state the generator is in, but which source is active. This hierarchical belief updating necessarily produces power-law loss scaling with context length and explains why induction heads emerge.
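As a toy illustration of that hierarchy, the sketch below (our simplification, not the paper's setup) replaces the sources with fixed categorical token distributions and tracks the Bayes-optimal predictor's posterior over which source is active. Its excess loss relative to an oracle that knows the true source shrinks with context position as the posterior sharpens, which is the qualitative effect behind the scaling claim.

```python
# Toy sketch (our simplification): hierarchical inference reduced to inferring
# which of several latent sources is generating the context. Each source is a
# fixed categorical distribution over tokens rather than a full HMM.
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_tokens, ctx_len, n_docs = 8, 16, 256, 500

# Token distribution for each latent source.
sources = rng.dirichlet(np.ones(n_tokens) * 0.3, size=n_sources)

excess = np.zeros(ctx_len)
for _ in range(n_docs):
    k = rng.integers(n_sources)                        # latent source for this document
    tokens = rng.choice(n_tokens, size=ctx_len, p=sources[k])

    log_post = np.full(n_sources, -np.log(n_sources))  # uniform prior over sources
    for t, x in enumerate(tokens):
        # Bayes-optimal next-token prediction: mixture of sources under the posterior.
        pred = np.exp(log_post) @ sources
        # Extra loss paid relative to an oracle that knows the true source.
        excess[t] += -np.log(pred[x]) + np.log(sources[k, x])
        # Posterior update after observing token x.
        log_post += np.log(sources[:, x])
        log_post -= np.logaddexp.reduce(log_post)

excess /= n_docs
# Excess loss shrinks with position as the posterior concentrates on the true
# source; for richer source families, the theory predicts power-law decay.
for t in (0, 1, 4, 16, 64, 255):
    print(f"position {t:3d}: mean excess loss {excess[t]:.4f}")
```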
We've been asking what the most general computational framework for understanding neural network representations might be. Our initial work implied activations should lie in simplices, but we've now shown that networks discover quantum and post-quantum belief geometries when these are the minimal way to model their training data. This offers a new foundation for thinking about features, superposition, and what representations neural networks use on their own terms.
Most recently, we've shown that transformers naturally decompose their world model into interpretable parts. These factored belief representations provide an exponential advantage in dimensionality, and suggest that we can understand and surgically intervene on low-dimensional subspaces of large models.
For a comprehensive overview of where we are and where we're headed, see our July 2025 progress report on the Alignment Forum. You can also watch Paul and Adam discuss the research program at the FAR Seminar or read this recent interview from August 2025.
Preferred Qualities
- PhD or equivalent in physics, computer science, neuroscience, mathematics, or a comparable field
- Extensive ML experience
- Experience in interpretability
We're especially interested in people who might be overlooked by traditional hiring: unconventional backgrounds, unusual paths, the kind of candidate who doesn't fit neatly into a box but has something real. If you're not sure whether you're qualified, we'd still like to hear from you.
About Obelisk and Astera Institute
Astera Institute is an independent research organization with a $3B+ endowment, operating outside the constraints of markets and academia. We run more like a startup than a foundation — small team, minimal bureaucracy, high risk tolerance.
Obelisk is Astera's program for charting a path in a world with thinking machines. It combines three complementary efforts — Neuro (decoding how biological neural activity becomes conscious experience), AGI (neuroscience-informed approaches to engineering intelligence), and Safety (building a scientific understanding of intelligence itself). The Simplex team leads the Safety effort within Obelisk and also has a team based in London.
This role is employment with Astera Institute. Learn more at astera.org.
Compensation Range: $140K - $200K