The Hidden Auditory Knowledge Inside Language Models

Text-only LLMs may already know enough about sound to predict downstream audio model performance before an encoder is ever attached.