Hyperbole in Arabic YouTube: a pragmalinguistic study of artificial intelligence discourse

Introduction

Hyperbole is a rhetorical device that uses exaggeration to create a strong impression or emphasis, for example to convey humor, excitement, or distress. In pragmalinguistics, hyperbole is a common feature of natural language and is used extensively in both written and spoken communication (Quirk et al., 1985). It allows speakers to emphasize or intensify their messages, often for the purposes of humor, persuasion, or emotional expression (Leech, 1983). The study of hyperbole in linguistics provides valuable insights into the ways in which language users strategically employ exaggeration to achieve their communicative goals (Leech, 1983; Norrick, 1982).

Hyperbole serves various pragmatic functions, such as creating vividness, evoking empathy, or strengthening a claim (Norrick, 1982). Its interpretation is highly context-dependent, as listeners must infer the speaker’s intended meaning beyond the literal semantic content (Holtgraves, 2005). Understanding hyperbole and its use in context is key to grasping the speaker’s point and the intended effect, which makes it a pragmatic phenomenon. Interpreting the intended meaning requires real-world knowledge rather than reliance on literal semantics alone (Colston and O’Brien, 2000; McCarthy and Carter, 2004; Mora, 2009). As a widely recognized and used form of figurative language, hyperbole remains an important rhetorical device in everyday life (Beare and Meade, 2015).

McCarthy and Carter (2004) and Claridge (2010) have significantly contributed to the identification and classification of hyperbole in language. McCarthy and Carter (2004) proposed a scheme that emphasizes the linguistic features characterizing hyperbolic expressions, primarily focusing on semantic exaggeration and specific linguistic markers such as lexical choices and syntactic structures.
Under this scheme, a segment of discourse is classified as hyperbolic only if it exhibits at least three distinct features from a set of criteria that includes disjunction with context (Norrick, 1982), shifts in footing, counterfactuality not perceived as a lie (Swartz, 1976; Bhaya, 1985; Clark, 1996), impossible worlds (Clift, 1999), listener take-up, extreme case formulations (Pomerantz, 1986), intensification, syntactic support, and relevant interpretability.

Using McCarthy and Carter’s (2004) framework, Pérez-Sobrino and Littlemore (2020) demonstrate how hyperbole creates “extreme case formulations”, “impossible worlds”, and counterfactuality, that is, unrealistic scenarios that emphasize a point. These exaggerations evoke emotions and rely on shared understanding, making ads more engaging and memorable. While hyperbole, often paired with irony, enhances ad impact by surprising and entertaining audiences, overuse can overwhelm them, which highlights the need for balance to maximize effectiveness.

Benammar’s (2024) study of 1000 tweets (2021–2023) similarly applies McCarthy and Carter’s framework to explore hyperbole on Stan Twitter. The study shows how hyperbolic expressions, through disjunction with context, impossible worlds, and extreme case formulations, enhance emotional and rhetorical impact using humor and intensification. However, it leaves gaps in addressing listener take-up, shifts in footing, and syntactic support, suggesting the need for further research into hyperbole’s broader communicative functions.

In line with this work on hyperbole identification, Claridge (2010) offers a classification of hyperbole that distinguishes seven forms: single-word, phrasal, clausal, numerical, superlative, comparison, and repetition. Each of these forms serves a unique function in communication, contributing to the overall impact of hyperbole in discourse.

Claridge’s (2010) framework has been useful in examining hyperbole’s forms and functions across contexts.
For example, Wijaya (2022) analyzed hyperbole in 31 YouTube advertisements for toiletries products, identifying six of the seven forms proposed by Claridge, with phrasal hyperbole being the most prevalent and superlative hyperbole the least used. Comparison hyperbole was absent, suggesting that advertisers relied on other forms to create persuasive and attention-grabbing messages without explicit comparisons.

Another qualitative study, by Budiarti et al. (2024), applied Claridge’s (2010) framework to analyze hyperbole used by Indonesian new mothers in online communities. They identified six of the seven hyperbole forms in 60 utterances over a one-year period, with repetition notably absent. The study highlighted hyperbole’s functions in virtual communication, including expressing emotions, concretizing messages, creating humor, and establishing group identity.

The current study examines hyperbole in Arabic YouTube content on AI, since hyperbolic expressions are used in similar ways in advertising and social media to capture attention, evoke emotions, and persuade audiences (Callister and Stern, 2007; Han et al., 2023). The research systematically investigates how Arab YouTube content creators employ exaggeration to engage viewers and promote their ideas or products.

Research motivation

The rise of AI systems and technologies has significant implications for how content creators shape public perception. By discussing the capabilities and limitations of AI, creators can influence audience expectations. Exaggerated claims may lead to unrealistic views of what AI can achieve, while more balanced discussions can promote a clearer understanding of its practical applications.

Morley et al. (2022) highlight that misconceptions about AI’s autonomy and applicability stem from misleading media representations, leading to a misunderstanding of its role as a decision-support system rather than a replacement for human intelligence.
This tendency to anthropomorphize AI can result in unrealistic fears and overshadow discussions necessary for responsible AI governance. Bazán-Gil (2023) notes that the media’s adoption of AI technologies in journalism has contributed to a broader narrative positioning AI as indispensable, potentially obscuring ethical considerations, biases, and the need for critical evaluation of their societal impact.

While the proliferation of research on hyperbolic expressions in the English language is notable, the scarcity of hyperbole-related studies in Arabic is a matter of concern. Numerous studies have explored the use, functions, and interpretation of hyperbole in English, e.g., McCarthy and Carter (2004) and Claridge (2010). In contrast, only a handful of studies have investigated hyperbole in the Arabic language (Dhayf, 2012). This disparity can be primarily attributed to the dominance of English in academic research, which may have influenced the focus of pragmatic studies, with researchers prioritizing the study of hyperbole in English over other languages (Flowerdew, 2008).

Hyperbole-related studies in Arabic focus on generic and broad contexts rather than recent and evolving ones, such as the application of AI technologies. For example, Ibrahim (2018) analyzed political speech by applying techniques of critical discourse analysis (CDA). He finds that hyperbolic expressions, alongside argumentative and rhetorical techniques, are strategically employed in political speeches to assert social power and persuade listeners of the speaker’s ideological views. However, the specific types of hyperbole are not detailed, and the study was based on a single politician’s speech.

Dhayef and Kadhim (2022) pragmatically analyzed hyperbolic expressions in English and Arabic football commentaries using three texts from different commentators.
They find that hyperbole is most often used for “emphasis”, amplifying the significance or intensity of events, actions, or emotions to captivate the audience and heighten their engagement. However, this study lacks the methodical depth needed to investigate the hyperbolic expressions used within these sport-related commentaries.

Abd-Alhameed and Mohammad (2024) investigated the challenges in translating hyperboles from Arabic literature to English due to source text ambiguity. They conclude that the translation of hyperboles from Arabic literary texts into English can be difficult because of the varied usage and multiple interpretations based on context. Furthermore, hyperboles in Arabic literature are influenced by cultural, social, and other differences between the two languages. However, the study could be critiqued for not providing empirical evidence, such as case studies or translator feedback, to support its conclusions.

To the best of the researcher’s knowledge, most Arabic-based studies on hyperbolic expressions lack methodological rigor, often failing to apply established identification and classification frameworks such as those of McCarthy and Carter (2004) and Claridge (2010), which limits their reliability and consistency. This highlights the significance and novelty of the current study, as it addresses these gaps by providing methodological clarity and open-access Supplementary Materials, such as datasets, to support future research.

Moreover, no prior studies have specifically explored the use of hyperbolic expressions in AI-related technologies within the Arabic language context. As AI advances and impacts various aspects of life, examining the role of hyperbole in these emerging domains is essential. This research expands the scope by analyzing how hyperbolic expressions are used to convey ideas, persuade audiences, and shape perceptions amid rapid technological change.
Consequently, this study contributes to the growing body of knowledge on hyperbolic expressions and the language used to describe AI to Arabic-speaking audiences.

This study contributes to previous related work in the following ways:

It identifies and analyzes hyperbolic expressions in Arabic AI-related content using well-established frameworks.

It provides culturally specific insights into communication strategies and into how language influences audience engagement in Arabic contexts on social platforms, taking YouTube videos about AI as a case.

Research aim and questions

This study aims to provide a description and analysis of the language, specifically hyperbolic expressions, used to characterize AI technologies in Arabic video titles on YouTube. By examining the linguistic aspects of these titles, including the frequency, types, and functions of hyperbolic expressions, this study seeks to shed light on how AI is strategically presented and perceived by Arabic-speaking content creators, with a focus on how language is used to attract attention and evoke emotions.

The researcher selected YouTube as the primary data source for this study due to its vast user base, with over two billion active users worldwide (Ceci, 2024), providing a rich and diverse pool of data across various content creators, genres, and audiences. YouTube’s search engine and filtering capabilities allow researchers to identify and collect relevant data by querying specific keywords, phrases, or topics, and by narrowing down results based on factors such as upload date. This functionality helps ensure that the data gathered is both relevant to and representative of the research objectives (explained further in section “Research Methodology”).

This research aims to investigate the use of hyperbole in AI-related YouTube videos by addressing the following research questions (RQs):

RQ 1.
How frequently are expressions used to describe AI-related technologies in Arabic YouTube titles within the selected sample, and how has this frequency changed over time?

RQ 2. What are the most common characteristics (i.e., linguistic features) used to identify hyperbole, according to McCarthy and Carter’s (2004) identification conditions, in the AI-related Arabic YouTube titles collected in the sample?

RQ 3. What are the most prevalent types of hyperbole, according to the classification of Claridge (2010), employed in the collected sample of AI-related Arabic YouTube titles?

Insights from this study can help content creators, marketers, and educators communicate effectively about AI to Arabic-speaking audiences on YouTube and similar platforms through a better understanding of hyperbolic expressions.

Research methodology

This study employs a mixed-methods approach (Creswell and Clark, 2017), combining qualitative analysis for the identification and classification of hyperbole with quantitative measurements of its frequency and distribution. The research involves a two-stage process: first identifying hyperbole in the AI-related Arabic titles on YouTube using McCarthy and Carter’s (2004) identification scheme, and then classifying the identified hyperbole according to the classification scheme of Claridge (2010).

In the first stage, each title is examined using McCarthy and Carter’s (2004) criteria for identifying hyperbole. This scheme recognizes hyperbole through a combination of semantic, grammatical, and contextual cues (Step 2). This identification phase relies on the researcher’s linguistic knowledge and interpretive skills to judge whether a given phrase meets the criteria for hyperbole.

The second stage involves framing and classifying each instance of hyperbole according to Claridge’s typology. Claridge proposes a categorization of hyperbole into several types (Step 3).
Each hyperbolic title is analyzed and assigned to one or more of Claridge’s categories.

This two-stage methodology allows for systematic identification and classification of hyperbolic expressions and instances. By employing established schemes from McCarthy and Carter (2004) and Claridge (2010), the analysis is grounded in current linguistic theory on hyperbole. The mixed-methods approach integrates quantitative and qualitative data to provide a more comprehensive understanding of the use of hyperbole in the sampled titles (Creswell and Clark, 2017). The main steps are explained in the following sub-sections.

Step 1: YouTube videos search, retrieval, and selection

This research employed a targeted data collection approach to gather Arabic YouTube videos related to AI. To identify relevant content, a set of keywords and phrases commonly associated with AI was incrementally compiled.

As presented in Table 1, synonyms and variations in Arabic and English broaden the search scope. Search queries combine the intended AI technology with its application or use, considering both positive and negative sentiments. This approach aims to capture a comprehensive, balanced, and representative sample of AI-related YouTube videos within the study’s scope.

Table 1 Keyword categories and search terms for Arabic YouTube video retrieval in English/Arabic translation.

The current keyword list could benefit from further refinement and expert input. However, the researcher found that expanding the list with domain-specific terminology would not significantly improve results due to YouTube’s search engine limitations, making broader keyword strategies more effective. During testing, using specific terminologies hindered the retrieval of relevant videos and introduced noise. The search process targets both video titles and descriptions, aligning with YouTube’s search strategy (Alkhulaif, 2024).
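The combination of technology terms with application or usage terms described above can be sketched as follows. This is only an illustration: the term lists are hypothetical placeholders (the study's actual terms are in Table 1 and Supplementary Table S1), and the function name `build_search_strings` is an assumption rather than part of the study's code.

```python
from itertools import product

# Hypothetical placeholder terms, NOT the study's keyword list.
technology_terms = ["الذكاء الاصطناعي", "AI"]   # "artificial intelligence", "AI"
usage_terms = ["تطبيقات", "مخاطر"]               # "applications", "risks"


def build_search_strings(technologies, usages):
    """Pair every technology term with every usage term, dropping duplicates
    while keeping the order in which the pairs were first produced."""
    seen = set()
    queries = []
    for tech, usage in product(technologies, usages):
        query = f"{tech} {usage}"
        if query not in seen:
            seen.add(query)
            queries.append(query)
    return queries


search_strings = build_search_strings(technology_terms, usage_terms)
for query in search_strings:
    print(query)
```

Pairing a positively framed usage term with a negatively framed one for each technology term is one simple way to keep both sentiments represented in the query set.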
The systematic creation of search strings is inspired by the keyword and instance formation strategies used in Systematic Literature Reviews (Chandler et al., 2019). The resulting search strings, 16 in total, were used to retrieve videos from YouTube (Supplementary Table S1).

To retrieve Arabic YouTube videos for this research, an open-source Application Programming Interface (API) dedicated to searching YouTube videos was used, namely the “youtube-search-python” package (Kumar Saini, 2021). A Python script using that package processed the 16 search strings individually, scraping up to 50 pages of results per query. The results were combined into a single dataset, and duplicate entries with identical titles and URLs were removed. Additional metadata associated with each retrieved video was then added, including the video ID on YouTube, published date, description, title, channel name, and video category on YouTube.

The unique videos and their metadata were then manually inspected. The selection of videos from the search results was based on the following criteria:

Language: Only videos with spoken Arabic audio were included. The verification process involved examining the video title, channel name, and comments, and listening to a randomly selected 10–30 s segment to confirm the use of Arabic.

Relevance to research topic: Videos that clearly addressed AI-related topics, even if the term “AI” was not explicitly used, were included. Relevance was determined based on the video title, description, and a brief preview, in conjunction with verifying the spoken language.

Upload Date: Videos uploaded between January 1, 2011, and August 3, 2024, were included to ensure clarity, consistency, and precision in the dataset.
This timeframe captures a substantial span of AI-related content while maintaining consistency in reporting, despite the ongoing addition of new videos after the end date.

During data collection, YouTube’s Terms of Service were followed, using only public videos and responsible practices. The code is available in Supplementary Code S3.

To ensure reliability in the inclusion process, 25% of the samples were randomly assigned to a second coder, a native Arabic speaker with AI knowledge, for inter-rater validation. This step balanced thorough quality checks with resource constraints. Cohen’s kappa measured agreement between the second coder and the first coder (the author), assessing the clarity and consistency of the inclusion criteria (McHugh, 2012).

Step 2: Identification and verification of hyperbole instances

The framework of McCarthy and Carter (2004) for identifying hyperbolic episodes in conversation consists of eight characteristics (pp. 162–163). These characteristics are presented in Table 2.

Table 2 Hyperbolic characteristics as adopted from McCarthy and Carter’s (2004) framework.

McCarthy and Carter’s (2004) framework is used to identify and verify hyperbole in Arabic YouTube video titles related to AI. Titles are systematically evaluated against all eight characteristics, and those meeting at least three criteria are classified as hyperbolic.

Step 3: Classification of hyperbole instances

Claridge (2010) credits Spitzbardt with establishing the first classification of hyperbole in 1963 and then develops her own classification. She classifies hyperbole into the following forms (2011, pp. 46–57). These forms are presented in Table 3.

Table 3 Hyperbolic instances as presented by Claridge’s (2010) classification.

Claridge’s (2010) classification scheme is used to further analyze the instances of hyperbole identified through McCarthy and Carter’s (2004) framework in the selected corpus of Arabic YouTube video titles related to AI.
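The identification rule of Step 2 and the form tally of Step 3 can be sketched in a few lines. This is a minimal illustration, assuming each title has already been coded manually. The record layout, the function names `is_hyperbolic` and `tally_forms`, and the example codings are hypothetical and do not come from the study's materials. The criterion labels are borrowed from the criteria named in the Introduction, and the seven form names follow Claridge's list as given there.

```python
from collections import Counter

MIN_CRITERIA = 3  # threshold stated in Step 2: at least three criteria

# The seven forms named in the text for Claridge's (2010) classification.
CLARIDGE_FORMS = {
    "single-word", "phrasal", "clausal", "numerical",
    "superlative", "comparison", "repetition",
}


def is_hyperbolic(criteria_met, minimum=MIN_CRITERIA):
    """Return True when a title meets at least `minimum` distinct criteria."""
    return len(set(criteria_met)) >= minimum


def tally_forms(coded_titles):
    """Count Claridge forms over the titles that pass the identification rule.
    A title may carry more than one form, as the text allows."""
    counts = Counter()
    for record in coded_titles:
        if not is_hyperbolic(record["criteria"]):
            continue
        for form in record["forms"]:
            if form not in CLARIDGE_FORMS:
                raise ValueError(f"unknown form: {form}")
            counts[form] += 1
    return counts


# Hypothetical manual codings (placeholders, not data from the study).
coded_titles = [
    {"title": "title_1",
     "criteria": ["impossible worlds", "intensification",
                  "extreme case formulations"],
     "forms": ["superlative"]},
    {"title": "title_2",
     "criteria": ["intensification"],          # fewer than three: excluded
     "forms": ["single-word"]},
    {"title": "title_3",
     "criteria": ["disjunction with context", "impossible worlds",
                  "intensification", "relevant interpretability"],
     "forms": ["phrasal", "numerical"]},
]

form_counts = tally_forms(coded_titles)
print(form_counts)
```

Keeping the threshold as a named constant makes it straightforward to check how sensitive the resulting counts are to a stricter or looser identification rule.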
This approach aims to determine the specific forms of hyperbole represented and their distribution within the corpus.

Step 4: Statistical analyses

To analyze temporal trends in the presentation of AI-related content in Arabic YouTube video titles, the researcher used Poisson regression to model the trend in video frequency over time. Poisson regression is a statistical technique designed for modeling count data, making it an appropriate method for examining the number of video uploads per year. This approach makes it possible to assess the relationship between the dependent variable (video frequency) and the independent variable (year) while accounting for the discrete and non-negative nature of the data. Poisson regression also provides a means to test the statistical significance of the trend (using a p-value threshold of 0.05) and to quantify the rate of change in video frequency. This method was chosen because it avoids the assumptions of normality required by linear regression, ensuring a more robust analysis for count-based data (Cameron and Trivedi, 2013). The statistical analysis results are presented in the Results (section “Limitations and Future Work”) to support the answer to RQ1.

To address RQ2 and RQ3, Chi-Square Goodness of Fit Tests were used to determine whether the observed distributions of hyperbolic characteristics (RQ2) and of the most prevalent types of hyperbole (RQ3) in the AI-related Arabic YouTube titles were significantly different from random distributions. Contingency tables were constructed with observed and expected frequencies for each category, and Chi-Square statistics, degrees of freedom, and p values were calculated. Significant results (p