AI and Publishing: Moving forward requires looking backward

15th August 2023

As with the rest of the world, the research sector is concerned about the impact of generative AI. While there are many positive opportunities with generative AI, it poses some serious challenges to the way research is created, published and shared. But major problems with the current scholarly publishing ecosystem already exist, particularly the emphasis placed on the journal article for research assessment. Generative AI is simply reflecting, and sometimes amplifying, these issues. Guest author Dr Danny Kingsley asks, is AI the disruption scholarly publishing needs?

There has been a great deal of hand-wringing about generative AI since OpenAI released ChatGPT in November 2022. The concerns have ranged from a small number of companies holding a huge amount of power, to misinformation being so far reaching that it is rapidly becoming an existential threat to the planet, meaning we need to mitigate the risk of extinction from AI. This is, of course, not the first time we have faced ‘doom by technology’. Consider the claim in 2000 that the future doesn’t need us, because of robotics, genetic engineering, and nanotech. That prediction in turn referenced a 1990 prediction that ‘gray goo’ would take over the world.

Responses to this threat include UNESCO working with education ministers across the world to respond to ChatGPT, and the European Union passing an act to regulate artificial intelligence, with the intention to ensure “that AI systems used in the EU are safe, transparent, traceable, non-discriminatory and environmentally friendly. AI systems should be overseen by people, rather than by automation, to prevent harmful outcomes”.

Generative AI poses challenges for the higher education sector, both for academic and research integrity. Common to both is the ‘hallucinations’ issue, from students requesting articles that don’t exist from libraries to articles misidentifying the research expertise of individuals. Indeed, publications that never existed are being cited in articles in preprint servers, and these articles are being indexed in Google Scholar and ResearchGate.

But citations to non-existent articles are not new. An investigation into how a phantom article had gathered more than 400 citations found that the article was listed in Elsevier’s reference style section. ChatGPT has simply increased the scale of the problem.

ChatGPT is holding a mirror to the current scholarly publishing system. Many of the concerns that have been raised about ChatGPT already existed.

Positive outlook

There are clear indications that the use of generative AI in academic publishing is here to stay. APA Style has published guidelines on how to cite ChatGPT. Similarly the Committee on Publication Ethics has released guidelines for editors intending to use AI to assist with the assessment of papers, noting that AI can increase the speed of decisions and reduce the burden on editors, but the adoption of AI raises key ethical issues around accountability, responsibility, and transparency.

For all the gloom, generative AI offers a way to address inequities in the current scholarly publishing system. For example, writing papers in English when it is not an author’s first language can be a significant barrier to participating in the research discourse. Generative AI offers a potential solution for these authors, as has been argued in the context of medical publishing. Another argument is that the ‘knee jerk’ reactions by publishers to the use of ChatGPT in articles mean we are missing the opportunity to level the playing field for English as an additional language (EAL) authors.

After all, the practice of having assistance in the writing of papers is hardly new. A study looking into prolific authors in high impact scientific journals who were themselves not researchers found a startling level of publication across multiple research areas. These authors are humans (mostly with journalism degrees), not AI.

Challenges to openness

One area where the use of generative AI to write research papers poses a serious challenge is open research. The proprietary nature of the most widely used tools means the underlying model used is not available for independent inspection or verification. This lack of disclosure of the material on which the model has trained “threatens hard-won progress on research ethics and the reproducibility of results”.

In addition, the current inability of AI to document the provenance of its data sources through citation, and the lack of identifiers for those data sources, mean there is no ability to replicate the ‘findings’ that have been generated by AI. This has prompted calls for the development of a formal specification or standard for AI documentation that is backed by a robust data model. Our current publishing environment does not prioritise reproducibility, with code sharing optional and a slow uptake of requirements to share data. In this environment, the generation of fake data is of particular concern. However, ChatGPT “is not the creator of these issues; it instead enables this problem to exist at a much larger scale”.

And that leads me to my provocation: in the same way that, a decade ago, open access was a scapegoat for scholarly communication*, generative AI is now a scapegoat for the research assessment system. Let me explain.

News broke recently of a radiologist using ChatGPT to write papers and successfully publish well outside his area of expertise, including in agriculture, insurance, law and microbiology. This is an excellent illustration of the concerns that many have expressed about the excessive production of papers ‘written’ by generative AI. While the actions of the radiologist may be shocking, this type of behaviour is not limited to the use of AI, as shown by the admission of a Spanish meat expert who had published 176 papers in a year in multiple domains through questionable author partnerships.

The concerns being raised about how generative AI will affect scholarly publishing and academia have an underlying assumption that the current system is working. We need to ask: Is it?

Everything is fine?

There is no room here for a comprehensive list of the many and varied problems with the current scholarly publishing ecosystem. As a taster, consider the finding that predatory journals are unfortunately here to stay, the worrying amount of fraud in medical research and the finding that researchers who agree to manipulate citations are more likely to get their papers published.

Two recent studies, one European and one Australian, reveal the level of pressure PhD and early-career researchers are under to provide gift authorship. There have also been alarming revelations about payment being exchanged for authorship, with prices depending on where the work will be published and the research area. Investigations into this are leading to a spate of retractions. But even the ‘self-correcting’ nature of the system is not working, as can be seen from the large number of citations to articles that have been retracted, with over a quarter of these happening after the retraction.

If an oversupply of journals and journal articles is already fuelling paper mills (which can, themselves, use AI to generate papers) then the whole scholarly publishing ecosystem could be about to collapse on itself. A commentary has asked whether using AI to assist writing papers will further increase the pressure to publish – given that publishing levels have increased drastically over the past decade already.

Why the rush to publish?

We are asking the wrong questions. A good example is this article which asks whether publishers should be concerned that ChatGPT wrote a paper about itself. The article goes on to discuss ‘other ethical and moral concerns’, asking “Is it right to use AI to write papers when publishing papers are used as a barometer of researcher competency, tenure, and promotion?”.

I would rephrase the question to: “Is it right that publishing papers are used as the primary assessment tool of researchers?” The singular driver for almost all of these questionable research practices is the current emphasis on the published article as the only output that counts. The tail is wagging the dog.

There is already a groundswell against the current research assessment system, with initiatives such as More Than Our Rank from INORMS and the Coalition for Advancing Research Assessment (CoARA), both building on the San Francisco Declaration on Research Assessment (DORA). Whole countries are moving, with the Netherlands launching the ‘Room for Everyone’s Talent’ programme, and research funders such as the Wellcome Trust undertaking significant work into research culture. But this is a huge undertaking.

A few years ago, the world celebrated 350 years of scientific periodicals. In the intervening time, we have experienced the industrial revolution and two world wars, and seen the internet change the world, but the journal system has basically remained incredibly stable. Will generative AI finally be the disrupting force to move the system into something fit for today’s world?

* For those interested in the earlier open access argument, I co-authored a debate piece on this called “Open access: the whipping boy for problems in scholarly publication” (a whipping boy was a boy who was educated alongside a young prince or nobleman who took the punishment for them).
