AI language models, which generate human-like text to power chatbots and create content, are also revolutionizing biology by treating complex biological data like a language. Researchers increasingly use them, for example, to find patterns in DNA and proteins, make predictions, and speed research into biological complexity. A critical gap, however, is the lack of a method for estimating how reliable these predictions are.