Table of Links

Abstract and 1. Introduction
2 Concepts in Pretraining Data and Quantifying Frequency
3 Comparing Pretraining Frequency & “Zero-Shot” Performance and 3.1 Experimental Setup
3.2 Result: Pretraining Frequency is Predictive of “Zero-Shot” Performance
4 Stress-Testing the Concept Frequency-Performance Scaling Trend and 4.1 Controlling for Similar Samples in Pretraining and Downstream Data
4.2 Testing Generalization to Purely Synthetic Concept and Data Distributions
5 Additional Insights from Pretraining Concept Frequencies
6 Testing the Tail: Let It Wag!
7 Related Work
8 Conclusions and Open Problems, Acknowledgements, and References

Part I
Appendix
A. Concept Frequency is Predictive of Performance Across Prompting Strategies
B. Concept Frequency is Predictive of Performance Across Retrieval Metrics
C. Concept Frequency is Predictive of Performance for T2I Models
D. Concept Frequency is Predictive of Performance across Concepts only from Image and Text Domains
E. Experimental Details
F. Why and How Do We Use RAM++?
G. Details about Misalignment Degree Results
H. T2I Models: Evaluation
I. Classification Results: Let It Wag!

A Concept Frequency is Predictive of Performance Across Prompting Strategies

We extend the zero-shot classification results from Fig. 2 in Fig. 8 with two additional prompting strategies. The results in the main paper used only the {classname} as the prompt; here we showcase both (1) “A photo of a {classname}” prompting and (2) the 80-prompt ensemble used by Radford et al. [91]. We observe that the strong log-linear trend between concept frequency and zero-shot performance holds consistently across the different prompting strategies.

B Concept Frequency is Predictive of Performance Across Retrieval Metrics

We supplement Fig. 2 in the main paper, where we showed results with the text-to-image (I2T) recall@10 metric. In Figs. 9 and 10, we present results for the retrieval experiments across all six metrics: I2T-Recall@1, I2T-Recall@5, I2T-Recall@10, T2I-Recall@1, T2I-Recall@5, and T2I-Recall@10. We observe that the strong log-linear trend between concept frequency and zero-shot performance holds robustly across the different retrieval metrics.

:::info
Authors:
(1) Vishaal Udandarao, Tubingen AI Center, University of Tubingen, University of Cambridge, and equal contribution;
(2) Ameya Prabhu, Tubingen AI Center, University of Tubingen, University of Oxford, and equal contribution;
(3) Adhiraj Ghosh, Tubingen AI Center, University of Tubingen;
(4) Yash Sharma, Tubingen AI Center, University of Tubingen;
(5) Philip H.S. Torr, University of Oxford;
(6) Adel Bibi, University of Oxford;
(7) Samuel Albanie, University of Cambridge and equal advising, order decided by a coin flip;
(8) Matthias Bethge, Tubingen AI Center, University of Tubingen and equal advising, order decided by a coin flip.
:::

:::info
This paper is available on arXiv under CC BY 4.0 DEED license.
:::
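
To illustrate the retrieval metrics named in Appendix B, here is a minimal sketch of how Recall@k could be computed in both directions from an image-text similarity matrix. It is not the authors' evaluation code. The function name `recall_at_k` and the toy similarity values are hypothetical, and the sketch assumes exactly one ground-truth caption per image (matching pairs share the same index), which real retrieval benchmarks with several captions per image do not satisfy.

```python
import numpy as np


def recall_at_k(similarity: np.ndarray, k: int) -> float:
    """Fraction of queries (rows) whose ground-truth match ranks in the top k.

    Assumption: the correct candidate for query i is column i.
    """
    n = similarity.shape[0]
    # Column indices of the k highest-scoring candidates for each query.
    top_k = np.argsort(-similarity, axis=1)[:, :k]
    # A query is a hit if its own index appears among its top-k candidates.
    hits = (top_k == np.arange(n)[:, None]).any(axis=1)
    return float(hits.mean())


# Toy similarity matrix (hypothetical values): rows are images, columns are texts.
sim = np.array([
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.7],
])

# Image-to-text (I2T): each image queries the texts.
i2t = {k: recall_at_k(sim, k) for k in (1, 2)}
# Text-to-image (T2I): transpose so that each text queries the images.
t2i = {k: recall_at_k(sim.T, k) for k in (1, 2)}

print("I2T:", i2t)
print("T2I:", t2i)
```

Transposing the matrix is the only difference between the two directions, which is why the six metrics in Appendix B come from one similarity matrix evaluated at three values of k.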