Forensic linguistics: how dark web criminals give themselves away with their language

Shannon McCoole ran one of the world’s largest dark web child abuse forums for around three years in the early 2010s. The forum provided a secure online space in which those interested in abusing children could exchange images, advice and support. It had around 45,000 users and was fortified with layers of online encryption that ensured near-complete anonymity for its users. In other words, it was a large and flourishing community for paedophiles.

McCoole eventually became the subject of an international investigation led by Taskforce Argos – a specialist unit in Australia’s Queensland Police Service dedicated to tackling online child abuse networks. Key to the investigation – and McCoole’s eventual arrest and conviction – was a piece of linguistic evidence: his frequent use of an unusual greeting term, “hiyas”, as noticed by an investigating officer.

Investigators began searching relevant “clear web” sites (those openly accessible through mainstream search engines) for any markers of a similar linguistic style. They knew the kinds of websites to search because McCoole would speak about his outside interests on the forum, including basketball and vintage cars.

A man was discovered using the giveaway greeting on a four-wheel drive discussion forum. He lived in Adelaide and used a similar handle to the paedophile forum’s anonymous chief administrator. Another similarly named user – also using “hiyas” as a preferred greeting term – was discovered on a basketball forum. Suddenly, the police had their man.

This linguistic evidence contributed to the identification, arrest and eventual conviction of McCoole. But it didn’t end there. After McCoole’s arrest, Taskforce Argos took over his account and continued to run the forum, as him, for another six months.
Police were able to gather vital intelligence that led to the prosecution of hundreds of offenders and to the rescue of at least 85 child victims.

McCoole’s case is breathtaking, and it offers a compelling demonstration of the power of language in identifying anonymous individuals.

The power of language

My journey into forensic linguistics began in 2014 at Aston University, where I began learning about the various methods and approaches to analysing language across different contexts in the criminal justice system. A forensic linguist might be called upon to identify the most likely author of an anonymously written threatening text message, based on its language features; or they might assist the courts in interpreting the meaning of a particular slang word or phrase.

Forensic linguists also analyse the language of police interviews, courtroom processes and complex legal documents, pointing out potential barriers to access and understanding, especially for the most vulnerable groups in society. Without thoughtful consideration of the linguistic processes that occur in legal settings and the communication needs of the population, these processes can (and do) result in serious miscarriages of justice.

A particularly egregious example of this occurred when Gene Gibson was wrongly imprisoned for five years in Australia after being advised to plead guilty to manslaughter. Gibson was an Aboriginal man with a cognitive impairment, for whom English was a third language. The conviction was overturned when the court of appeal heard Gibson had not understood the court process, nor the instructions he was given by his appointed interpreter.
So forensic linguistics is not just about catching criminals; it’s also about finding ways to better support vulnerable groups who find themselves, in whatever capacity, having to interact with legal systems. This is an attempt to improve the delivery of justice through language analysis.

Something that struck me in the earliest days of my research was the relative lack of work exploring the language of online child sexual abuse and grooming. The topic had long received attention from criminologists and psychologists, but almost never linguists – despite online grooming and other forms of online child sexual offending being almost exclusively done through language.

There is no doubt that researching this dark side of humanity is difficult in all sorts of ways, and it can certainly take its toll.

Nonetheless, I found the decision to do so straightforward. If we don’t know much about how these offenders talk to victims, or indeed each other, then we are missing a vital perspective on how these criminals operate – along with potential new routes to catching them.

These questions became the central themes of both my MA and PhD theses, and led to my ongoing interest in the language that most people never see: real conversations between criminal groups on the dark web.

Anonymity and the dark web

The dark web originated in the mid-1990s as a covert communication tool for the US federal government. It is best described as a portion of the internet that is unindexed by mainstream search engines. It can only be accessed through specialist browsers, such as Tor, that disguise the user’s IP address.

This enables users to interact in these environments virtually anonymously, making them ideal for hidden conversations between people with shared deviant interests.
These interests aren’t necessarily criminal or even morally objectionable – consider the act of whistleblowing, or of expressing political dissent in a country without free speech. The notion of deviance depends on local and cultural context.

Nonetheless, the dark web has become all but synonymous with the most egregious and morally abhorrent crimes, including child abuse, fraud, and the trafficking of drugs, weapons and people.

Combating dark web crime centres around the problem of anonymity. It is anonymity that makes these spaces difficult to police. But when all markers of identity – names, faces, voices – are stripped away, what remains is language. And language expresses identity.

Through our conscious and unconscious selections of sounds, words, phrases, viewpoints and interactional styles, we tell people who we are – or at least, who we are being from moment to moment.

Language is also the primary means by which much (if not most) dark web crime is committed. It is through (written) linguistic interaction that criminal offences are planned, illicit advice exchanged, deals negotiated, goals accomplished.

For linguists, the records and messages documenting the exact processes by which crimes are planned and executed become data for analysis. Armed with theory and methods for understanding how people express (or betray) aspects of their identity online, linguists are uniquely placed to address questions of identity in these highly anonymous spaces.

What kind of person wrote this text?

The task of linguistic profiling is well demonstrated by the case of Matthew Falder. Falder pleaded guilty to 137 charges relating to child sexual exploitation, abuse and blackmail in 2018.
The case was dubbed by the National Crime Agency (NCA) as its first ever “hurt-core” prosecution, due to Falder’s prolific use of “hidden dark web forums dedicated to the discussion and image and video sharing of rape, murder, sadism, torture, paedophilia, blackmail, humiliation and degradation”.

As part of the international investigation to identify this once-anonymous offender, police sought out the expertise of Tim Grant, former director of the Aston Institute for Forensic Linguistics, and Jack Grieve from the University of Birmingham. Both are world-leading experts in authorship analysis, the identification of unknown or disputed authors and speakers through their language. The pair were tasked with ascertaining any information they could about a suspect of high interest, based on a set of dark web communications and encrypted emails.

Where McCoole’s case was an example of authorship analysis (who wrote this text?), Falder’s demanded the slightly different task of authorship profiling (what kind of person wrote this text?).

When police need to identify an anonymous person of interest but have no real-world identity with which to connect them, the linguist’s job is to derive any possible identifying demographic information. This includes age, gender, geographical background, socioeconomic status and profession. But they can only glean this information about an author from whatever emails, texts or forum discussions might be available. This then helps them narrow the pool of potential suspects.

Grant and Grieve set to work reading through Falder’s dark web forum contributions and encrypted emails, looking for linguistic cues that might point to identifying information. They were able to link the encrypted emails to the forum posts through some uncommon word strings that appeared in both datasets.
Examples included phrases like “stack of ideas ready” and “there are always the odd exception”.

They then identified features that offered demographic clues to Falder’s identity. One example was the use of both “dish-soap” and “washing-up liquid” (synonymous terms from US and British English) within the same few lines of text. Grant and Grieve interpreted the use of these terms as either potential US influence on a British English-speaker, or as a deliberate attempt by the author to disguise his language background.

Ultimately, the linguists developed a profile that described a highly educated, native British English-speaking older man. This “substantially correct” linguistic profile formed part of a larger intelligence pack that eventually led to Falder’s identification, arrest and conviction. Grant’s and Grieve’s contribution earned them Director’s Commendations from the NCA.

Linguistic strategies

The cases of McCoole and Falder represent some of the most abhorrent crimes that can be imagined. But they also helped usher into public consciousness a broader understanding of the kinds of criminals that use the dark web. These online communities of offenders gather around certain types of illicit and criminal interests, trading goods and services, exchanging information, issuing advice and seeking support.

For example, it is not uncommon to find forums dedicated to the exchange of child abuse images, or advice on methods and approaches to carrying out various types of fraud.

In research, we often refer to such groups as communities of practice – that is, people brought together by a particular interest or endeavour. The concept can apply to a wide range of different communities, whether professional-, political- or hobby-based.
What unites them is a shared interest or purpose.

But when communities of practice convene around criminal or harmful interests, providing spaces for people to share advice, collaborate and “upskill”, ultimately they enable people to become more dangerous and more prolific offenders.

The emerging branch of research in forensic linguistics of which I am part explores such criminal communities on the dark web, with the overarching aim of helping to police and disrupt them.

Work on child abuse communities has shown the linguistic strategies by which new users attempt to join and ingratiate themselves. These include explicit references to their new status (“I am new to the forums”), commitments to offering abuse material (“I will post a lot more stuff”), and their awareness of the community’s rules and behavioural norms (“I know what’s expected of me”).

Research has also highlighted the social nature of some groups focused on the exchange of indecent images. In a study on the language of a dark website dedicated to the exchange of child abuse images, I found that a quarter of all conversational turns contributed to rapport-building between members – through, for example, friendly greetings (“hello friends”), well-wishing (“hope you’re all well”) and politeness (“sorry, haven’t got those pics”).

This demonstrates the perhaps surprising importance of social politeness and community bonding within groups whose central purpose is to trade in child abuse material.

Linguistic research on dark web criminal communities makes two things clear. First, despite the shared interest that brings them together, they do not necessarily attract the same kinds of people. More often than not they are diverse, comprising users with varied moral and ideological stances.
Some child abuse communities, for example, see sexual activity with children as a form of love, protesting against others who engage in violent abuse. Other groups openly (as far as is possible in dark web settings) seem to relish the violent abuse itself.

Likewise, fraud communities tend to comprise people of highly varied motivations and morality. Some claim to be seeking a way out of desperate financial circumstances, while others proudly discuss their crimes as a way of seeking retribution against “a corporate elite”. Some are looking for a small side hustle that won’t attract “too many questions”, while a small proportion of self-identifying “real fraudsters” brag about their high status while denigrating those less experienced.

A common practice in these groups is to float ideas for new schemes – for example, the use of a fake COVID pass to falsely demonstrate vaccination status, or the use of counterfeit cash to pay sex workers. That the morality of such schemes provokes strong debate among users is evidence that fraud communities comprise different types of people, with a range of motivations and moral stances.

Community rules – even in abuse forums

Perhaps another surprising fact is that rules are king in these secret groups. As with many clear web forums, criminal dark web forums are typically governed by “community rules” which are upheld by site moderators. In the contexts of online fraud – and to an even greater extent, child abuse – these rules do not just govern behaviour and define the nature of these groups, they are essential to their survival.

Rules of child sexual exploitation and abuse forums are often extremely specific, laying out behaviours that are encouraged (often relating to friendliness and support among users) as well as those that will see a user banished immediately and indefinitely. These reflect the nature of the community in question, and often differ between forums.
For instance, some forums ban explicitly violent images, whereas others do not.

Rules around site and user security highlight users’ awareness of potential law enforcement infiltration of these forums. Rules banning the disclosure of personal information are ubiquitous and crucial to the survival and longevity of these groups.

Dark web sites often survive only days or weeks. The successful ones are those in which users understand the importance of the rules that govern them.

The rise of AI

Researching the language of dark web communities provides operationally useful intelligence for investigators. As in most areas of research, the newest issue we are facing in forensic linguistics is to try and understand the challenges and opportunities posed by increasingly sophisticated AI technologies.

At a time when criminal groups are already using AI tools for malicious purposes like generating abuse imagery to extort children, or creating deepfakes to impersonate public figures to scam victims, it is more important than ever that we understand how criminal groups communicate, build trust, and share knowledge and ideas.

By doing this, we can assist law enforcement with new investigative strategies for offender prioritisation and undercover policing that work to protect the most vulnerable victims.

As we stand at this technological crossroads, the collaboration between linguists, technology and security companies, and law enforcement has become more crucial than ever. The criminals are already adapting.
Our methods for understanding and disrupting their communications must evolve just as quickly.

Emily Chiang has received funding from UKRI - Innovate UK.