Email: [email protected]
I am a Research Affiliate at the Center for Theoretical Physics at MIT and also a Principal Researcher at Salesforce, having arrived via acquisition of Diffeo where I was Co-Founder and Chief Technology Officer. I am also an Affiliate of the NSF AI Institute for Artificial Intelligence and Fundamental Interactions (IAIFI). I research ways in which the tools and perspectives from theoretical physics can be applied to artificial intelligence.
I am an author of The Principles of Deep Learning Theory, written with Sho Yaida and based on research in collaboration with Boris Hanin. You can buy a copy in print from Amazon, or directly from Cambridge University Press, or download a free draft from the arXiv! If intrigued further, check out my essay: Why is AI hard and Physics simple?
More broadly, I am interested in the interplay between physics and computation. My work in theoretical physics has focused on the relationship between black holes, quantum chaos, computational complexity, randomness, and how the laws of physics are related to fundamental limits of computation.
Previously, I was a research scientist at Facebook AI Research in NYC. Before that, I was a postdoc in the School of Natural Sciences at the Institute for Advanced Study in Princeton, NJ. I completed my Ph.D. at the Center for Theoretical Physics at MIT, funded by a Hertz Foundation Fellowship and the NDSEG. Prior to that, I was a Marshall Scholar in the UK. While there, I read for Part III of the Mathematical Tripos at Cambridge and then studied quantum information at Oxford. In a previous life (undergrad), I worked on invisibility cloaks (metamaterials and transformation optics) with David R. Smith.
My full name is very common, but I nevertheless tend to go by Dan Roberts, a name I never publish under and which, unfortunately (by logical necessity), is even more common.
As an AI researcher at FAIR, Diffeo, and now at MIT and Salesforce, I have focused on applying tools from theoretical physics to gain insight into machine learning and artificial intelligence. Recently I’ve worked on understanding neural scaling laws, robust learning, stochastic gradient-based optimization, causality, and the large-width expansion for deep learning.
In 2012, I co-founded Diffeo, a startup focused on collaborative machine intelligence. As part of Diffeo Labs, I co-organized a track for the Text Retrieval Conference (TREC) called Knowledge Base Acceleration. In 2019, Diffeo was acquired by Salesforce, which explains why a bunch of these hyperlinks no longer make any sense.
I once tried to apply machine learning to particle physics.
I’m interested in black holes. I’m also interested in quantum information theory. Luckily, via the gauge/gravity duality or holography, these two subjects are intricately tied together.
Some of my work focuses on what happens when something falls into a black hole (in anti-de Sitter space). The black hole will very quickly scramble (but not destroy) the information. Black holes are thermal systems, and this rapid scrambling is actually a manifestation of the well-known butterfly effect. We can try to think about this process in terms of its computational complexity, or we can study it as a distinguishing feature of quantum chaos.
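For readers who want the quantitative statement behind "quickly scramble," a standard way to diagnose this chaos is through the growth of an out-of-time-order correlator (OTOC) of two generic operators $W$ and $V$; the sketch below uses the conventional notation from the quantum-chaos literature, not any particular paper of mine:

$$
C(t) = -\left\langle \left[ W(t),\, V \right]^2 \right\rangle_\beta \sim \frac{1}{N^2}\, e^{\lambda_L t},
\qquad
\lambda_L \le \frac{2\pi k_B T}{\hbar},
$$

where $\lambda_L$ is a Lyapunov exponent bounded by the temperature, and black holes are believed to saturate this bound. The corresponding scrambling time, when $C(t)$ becomes $O(1)$, is

$$
t_* \sim \frac{\beta}{2\pi} \log S,
$$

logarithmic in the entropy $S$, which is why black holes are regarded as the fastest scramblers in nature.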
Advancing AI theory with a first-principles understanding of deep neural networks – Blog post by Sho Yaida announcing PDLT.
How Hertz Helped an AI Company Take Shape – Writeup of the Diffeo story, from founding to acquisition by Salesforce.
Black Holes Produce Complexity Fastest – Viewpoint in the APS journal Physics (which is very nicely written) on Complexity Equals Action.
Complexity growth – Research highlight in Nature Physics (which is two paragraphs—one and a half of which are behind a paywall—and is unfortunately incomprehensible) on the same work.