Key points
- LLMs influenced at least 13.5pc of 2024 research papers
- Study analysed 15m PubMed abstracts for trends
- AI-generated content raises integrity concerns in academia
ISLAMABAD: Chances are you have come across engaging online content that was, knowingly or not, created—either in part or entirely—by a Large Language Model (LLM).
As AI tools such as ChatGPT and Google Gemini become increasingly adept at producing writing that mimics human quality, distinguishing purely human-authored material from that assisted or generated by LLMs has become more challenging, according to Phys.org, an online science, research, and technology news aggregator.
This growing uncertainty around authorship has sparked concern within academic circles, particularly regarding the subtle infiltration of AI-generated content into peer-reviewed scientific literature.
Detectable mark
To explore the extent of LLM influence in academic writing, a team of researchers from the United States and Germany analysed over 15 million biomedical abstracts on PubMed. Their aim was to determine whether LLMs had left a detectable mark on the linguistic style of published articles.
The study found that since the introduction of LLMs, there has been a noticeable rise in the use of specific stylistic word choices across academic publications. Their analysis suggests that at least 13.5 per cent of research papers published in 2024 involved some degree of LLM assistance. These findings were published in the open-access journal Science Advances.
Since ChatGPT’s public release less than three years ago, the presence of AI-generated content online has surged—prompting growing concerns about the accuracy and trustworthiness of certain academic works.
Methodological flaws
Previous attempts to measure the role of LLMs in scientific writing were often hindered by methodological flaws. Typically, such studies relied on comparing known sets of human- and AI-generated texts, a process that the researchers argue may introduce bias, as it rests on assumptions about which models scientists use and how they prompt them.
To overcome these issues, the authors of the new study instead focused on detecting shifts in word usage before and after the introduction of ChatGPT, looking for linguistic patterns indicative of LLM involvement.
Their method mirrored earlier public health research on COVID-19, which estimated the pandemic’s mortality impact by assessing excess deaths in the population. Similarly, this study examined the excess use of particular words to detect stylistic changes over time.
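The "excess words" idea can be illustrated with a minimal sketch: estimate each word's expected frequency from a pre-LLM corpus, then subtract it from the observed frequency in a post-LLM corpus. The function names and the toy abstracts below are hypothetical illustrations, not the authors' actual code or data.

```python
from collections import Counter

def word_frequencies(abstracts):
    """Relative frequency of each word across a corpus of abstracts."""
    counts = Counter()
    total = 0
    for text in abstracts:
        words = text.lower().split()
        counts.update(words)
        total += len(words)
    return {w: c / total for w, c in counts.items()}

def excess_usage(baseline, current):
    """Excess frequency of each word in `current` relative to `baseline`:
    observed minus expected, analogous to excess-mortality estimates."""
    base_freq = word_frequencies(baseline)
    cur_freq = word_frequencies(current)
    return {w: f - base_freq.get(w, 0.0) for w, f in cur_freq.items()}

# Toy corpora (invented for illustration only)
pre_llm = ["the study measured tumour growth in mice",
           "we measured protein levels in samples"]
post_llm = ["this pivotal study is showcasing tumour growth",
            "we delve into pivotal protein dynamics"]

excess = excess_usage(pre_llm, post_llm)
# Words with the largest excess usage after the cutover
top = sorted(excess.items(), key=lambda kv: kv[1], reverse=True)[:3]
```

Here "pivotal" surfaces at the top because it appears twice in the post-LLM toy corpus and never in the baseline; at PubMed scale, the same logic flags stylistic markers without needing a labelled set of AI-written texts.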
Expressive language
The analysis revealed a marked shift following the rise of LLMs: prior to 2024, most excess words were “content words” and largely nouns. In contrast, by 2024, there was a significant increase in the use of more stylistic and expressive language—featuring verbs like “showcasing” and adjectives such as “pivotal” and “grappling”.
By manually classifying each excess word by its part of speech, the researchers found that before 2024, 79.2 per cent of these terms were nouns. However, in 2024, the trend shifted—66 per cent were verbs and 14 per cent were adjectives.
The study also highlighted notable differences in the degree of LLM usage depending on academic discipline, geographical region, and publishing venue.