When you have put a lot of ideas together to make an elaborate theory, you want to make sure, when explaining what it fits, that those things it fits are not just the things that gave you the idea for the theory; but that the finished theory makes something else come out right, in addition.

—Richard Feynman, "Cargo Cult Science"

Science as Not Trusting Yourself?

The first question I had when I learned the scientific algorithm in school was:

Why do we want to first hypothesize and only then collect data to test? Surely, the other way around—having all the data at hand first—would be more helpful than having only some of the data.

Later on, the answer I would end up giving to this question was:

It's a matter of "epistemic discipline." As scientists, we don't really trust each other very much; I, and you, need your theory to pass this additional blinded check.

(And here I'd also maybe gesture at the concept of overfitting.)

I think in school we get the sense that "discipline" is of course intellectually good. And we're exposed to the standard stylized epicycles story, highlighting how silly ex post explanations can end up being.

But I'm not sure that's a head-on answer to the original question. I don't want to overfit to a wrong hypothesis, but why do I need to blind myself to data to avoid that? Why can't I just be careful, stay sensitive to how silly my epicycles are getting, and discount accordingly?

(A little later on, I might have said something like: "AIXI by its nature will always make exactly the requisite Bayesian update over hypotheses, but we don't fully trust our own code and hardware to not fool ourselves." But that's kind of the same thing: Bayes gives you an overarching philosophy of inference, but this whole theorize-before-test thing is still seen merely as a patch for our human foibles.)

A Counting Argument About Science

Here's a maybe more satisfying characterization of the same thing, cutting more centrally at the Bayes structure connecting your brain to its complement. (To be perfectly clear, this isn't a novel insight, just a restatement of the above in new language.)

Let an explanation be a mathematical object: a generator that spits out observations into a brain. Then, even when you condition on all your observations so far, most surviving generators still don't resemble reality. Like, if you want polynomials of degree n that pass through only n points, there are infinitely many such polynomials (the n points pin down only n of the n + 1 coefficients). Even if you insist that the polynomials don't wiggle "excessively," there are still infinitely many of them. You take reasonable care about excessive wiggling and still end up choosing a polynomial that doesn't generalize. Almost all your options don't generalize.

In contrast, it's super strenuous to require your generator to correctly generate an unseen, held-out datapoint. Not strenuous in a merely human sense: mathematically, almost all of the old generators fail this test and vanish. The test is so intense that it has burned some generalization ability into the survivors: every surviving generator is either thermodynamically lucky or is onto something about the ground-truth generator.
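
To make the counting concrete, here's a toy sketch in Python with NumPy. It's only an illustration I'm adding, and the specifics in it (the sine-curve ground truth, the tolerance, the sample count) are arbitrary choices: sample a bunch of degree-n polynomial generators that exactly fit n observed points, then check how many also land near one held-out point, and how those survivors extrapolate compared to the rest.

```python
# A toy sketch of the counting argument (illustrative numbers throughout):
# sample many degree-n polynomial "generators" that exactly fit n observed
# points, then see how few also land near one held-out observation, and how
# those few extrapolate compared to the original population.
import numpy as np

rng = np.random.default_rng(0)

def ground_truth(x):
    return np.sin(x)  # stand-in for the reality generating the observations

n = 5                                      # number of observed datapoints
x_train = np.linspace(0.0, 3.0, n)
y_train = ground_truth(x_train)

x_test, y_test = 3.5, ground_truth(3.5)    # one datapoint held out from fitting

def sample_generator(leading_coef):
    """A degree-n polynomial through all n training points.

    Degree n means n + 1 coefficients, so n points leave a one-parameter
    family: pick the leading coefficient freely, then the remaining
    degree-(n-1) part is uniquely determined by interpolating the residual.
    """
    residual = y_train - leading_coef * x_train**n
    lower = np.polyfit(x_train, residual, deg=n - 1)
    return np.concatenate(([leading_coef], lower))  # highest-degree first

generators = [sample_generator(a) for a in rng.uniform(-5, 5, size=20_000)]

# Every generator fits the training data exactly. How many also come close
# to the unseen datapoint?
tol = 0.05
survivors = [g for g in generators if abs(np.polyval(g, x_test) - y_test) < tol]
print(f"fit all {n} training points: {len(generators)}")
print(f"also predict the held-out point (within {tol}): {len(survivors)}")

# Compare extrapolations at a further probe point.
x_probe = 4.0
truth = ground_truth(x_probe)
all_err = np.abs(np.array([np.polyval(g, x_probe) for g in generators]) - truth)
print(f"median extrapolation error, all generators: {np.median(all_err):.2f}")
if survivors:
    surv_err = np.abs(np.array([np.polyval(g, x_probe) for g in survivors]) - truth)
    print(f"median extrapolation error, survivors:      {np.median(surv_err):.2f}")
```

The point of the sketch is the counting: every generator fits the training points by construction, only a sliver of them should also pass the held-out test, and that sliver extrapolates far better than the population it was drawn from.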

It's not that humans are particularly epistemically corrupt and are never to be trusted until proven otherwise. It's that humans aren't epistemically perfect. If we were all perfect judges of theoretical simplicity, we could do as AIXI does and update after the fact. But it's perfectly reasonable to let the universe judge explanations on generalization ability in place of us weighing explanations by complexity. We don't stuff the whole prior over all possible explanations into our heads precisely enough to put it on a scale, and that's fine—most beings embedded inside large physical worlds don't either.