AI and Scientific Publishing: ChatGPT’s Role in Disruption

Adapting to Change: How AI Tools like ChatGPT Are Reshaping Scientific Publishing

When radiologist Domenico Mastrodicasa gets stuck while writing a research paper, he turns to ChatGPT, the chatbot that responds to almost any question in seconds. “I use ChatGPT as a sounding board,” says Mastrodicasa, who is based at the University of Washington School of Medicine in Seattle. “I can produce a publishable manuscript much faster.”

Mastrodicasa is one of many researchers experimenting with generative artificial-intelligence (AI) tools to write text or code. He pays a monthly subscription for ChatGPT Plus, the version of the chatbot built on the large language model (LLM) GPT-4, and uses it at least once a week. He finds it especially helpful for suggesting clearer ways to convey his ideas. Although a Nature survey suggests that scientists who use LLMs regularly are still in the minority, many expect that generative AI tools will become routine assistants for writing manuscripts, peer-review reports and grant applications.

Those are just some of the ways in which generative AI could transform scientific communication and publishing. Science publishers are already experimenting with generative AI to improve search tools for the research literature and to edit and quickly summarize papers. Many researchers think that non-native English speakers stand to gain the most from these tools. Some see generative AI as a way for scientists to rethink how they interrogate and present their results altogether: LLMs could eventually do much of this work, meaning researchers would spend less time writing papers and more time doing experiments.

Science and the new age of AI: a Nature special

“It’s not the objective of any author to write papers; it’s to do research,” says Michael Eisen, a computational biologist at the University of California, Berkeley, who is also editor-in-chief of the journal eLife. He predicts that generative AI tools could even transform the fundamental nature of the scientific paper.

But the prospect of errors and falsehoods threatens this vision. LLMs are merely engines for generating stylistically plausible output that fits the patterns of their training data, rather than for producing accurate information. Publishers worry that a rise in their use could lead to greater numbers of poor-quality or error-ridden manuscripts, and possibly a flood of AI-assisted fakes.

“Anything disruptive like this can be quite worrying,” says Laura Feetham, who oversees peer review for IOP Publishing in Bristol, UK, which publishes physical-sciences journals.

A wave of falsehoods?

Science publishers and others have identified a range of concerns about the potential impacts of generative AI. The accessibility of generative AI tools could make it easier to churn out poor-quality papers and, at worst, compromise research integrity, says Daniel Hook, chief executive of Digital Science, a research-analytics firm in London. “Publishers are right to be worried,” says Hook. (Digital Science is part of Holtzbrinck Publishing Group, the majority shareholder in Nature’s publisher, Springer Nature; Nature’s news team is editorially independent.)

In several cases, researchers have already admitted using ChatGPT to write papers without disclosing it. They were caught because they had neglected to remove telltale signs of its use, such as fake references or the software’s preprogrammed response that it is an AI language model.

Ideally, publishers would be able to detect LLM-generated text. In practice, AI-detection tools have so far proved unable to pick out such text reliably while also avoiding flagging human-written prose as the work of an AI.

Developers of commercial LLMs are working on watermarking LLM-generated output so that it can be identified, but no firm has yet rolled this out for text. Any watermarks could also be removed, notes Sandra Wachter, a legal scholar at the University of Oxford, UK, who studies the legal and ethical implications of emerging technologies. She hopes that legislators worldwide will insist on watermarks or disclosure for LLM output, and will make it unlawful to strip watermarks out.
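
Because no firm has shipped a text watermark, any concrete example is necessarily hypothetical. One widely discussed academic idea is to bias generation towards a pseudo-random “green list” of tokens seeded by the preceding token; detection then simply measures how often a text lands in that list. Below is a minimal toy sketch of the detection side in Python; the hashing scheme and all names are illustrative, not any vendor’s actual method.

```python
import hashlib

def is_green(prev_token: str, token: str, green_fraction: float = 0.5) -> bool:
    # A hash of the (previous token, current token) pair assigns a fixed
    # fraction of the vocabulary to the "green" list; a watermarking
    # generator would preferentially sample green tokens.
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] / 256.0 < green_fraction

def green_rate(tokens: list[str]) -> float:
    # Detection: the fraction of tokens that are green given their
    # predecessor. Unmarked text hovers near green_fraction; watermarked
    # text sits significantly above it.
    hits = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

sample = "the quick brown fox jumps over the lazy dog".split()
print(f"green rate: {green_rate(sample):.2f}")  # ~0.5 for unmarked text
```

A scheme like this also shows why Wachter’s worry is hard to avoid: the signal lives in token choices, so paraphrasing or light editing can wash it out.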

How do we prevent AI deepfakes from destroying our society and science

Publishers are tackling the issue either by banning the use of LLMs outright (as Science’s publisher, the American Association for the Advancement of Science, has done) or, in most cases, by demanding transparency (the policy of Nature and many other journals). A study that examined 100 publishers and journals found that, by the end of May, only 17% of the publishers and 70% of the journals had released guidelines on how generative AI could be used, and the guidelines varied in how the tools could be applied, says Giovanni Cacciamani, a urologist at the University of Southern California in Los Angeles, who co-authored the work, which has not yet been peer reviewed1. He and his colleagues are working with journal editors and scientists to develop a uniform set of guidelines to help researchers report their use of LLMs.

Many editors worry that generative AI could be used to produce convincing but fake papers. Companies that sell manuscripts or authorship positions to researchers wanting to boost their publication output, known as paper mills, stand to profit. A spokesperson for Science told Nature that LLMs such as ChatGPT could worsen the paper-mill problem.

One way to address these concerns would be for journals to strengthen how they verify that authors are genuine and have actually done the research they are reporting. “It’s essential for journals to determine whether someone actually did the work they claim to have done,” says Wachter.

At EMBO Press in Heidelberg, Germany, authors must submit from institutional e-mail addresses, and editorial staff speak to authors and referees on video calls, says Bernd Pulverer, head of scientific publications at EMBO Press. But he says that funders and research institutions also need to monitor the work of their staff and grant recipients more closely. “This is not something that can be offloaded entirely onto journals,” he says.

Equity and inequity

When Nature asked researchers what they thought the biggest benefits of generative AI for science might be, the most popular answer was that it could help researchers whose first language is not English (see ‘Impacts of generative AI’ and Nature 621, 672–675; 2023). “AI tools could greatly improve equity in science,” says Tatsuya Amano, a conservation scientist at the University of Queensland in Brisbane, Australia. Amano and his colleagues surveyed more than 900 environmental scientists who had published at least one paper in English. Among early-career researchers, non-native English speakers reported having papers rejected because of writing issues more than twice as often as native English speakers did, and they also took longer to write their papers2. ChatGPT and similar tools could be a “huge help” to these researchers, Amano says.

[Chart: ‘Impacts of generative AI’, showing results from the Nature survey.]

Amano, whose first language is Japanese, has been experimenting with ChatGPT and says the experience is like working with a native-English-speaking friend, although the tool’s suggestions are sometimes off the mark. He wrote an editorial in Science in March, after the journal restricted the use of generative AI tools, arguing that they could make scientific publishing more equitable as long as authors disclose their use, for example by submitting the original manuscript alongside the AI-edited version3.

AI and science: what do 1,600 researchers think?

LLMs are not the first AI-assisted writing aids. But generative AI is far more adaptable, says Irene Li, an AI researcher at the University of Tokyo. She used to rely on Grammarly, an AI-powered spelling and grammar checker, to improve her written English, but has switched to ChatGPT because it is more flexible and better value in the long run: rather than paying for several tools, she subscribes to one that does everything. “It saves a lot of time,” she says.

But the way LLMs are developed could deepen inequities, says Chhavi Chauhan, an AI ethicist and director of scientific outreach at the American Society for Investigative Pathology in Rockville, Maryland. Chauhan worries that free LLMs could become expensive in future, to cover the costs of building and running them, and that if editors rely on AI-detection software, they are more likely to wrongly flag text written by non-native English speakers as AI-generated. A study in July showed that this happens with current GPT detectors4. “We are completely missing the inequities these generative AI models are going to create,” she says.

Peer-review issues

LLMs could also be a boon for peer reviewers. Since adopting ChatGPT Plus as an assistant, Mastrodicasa says, he has been able to accept more requests to review, using the LLM to polish his reports, although he does not upload manuscripts or any information from them to the platform. “When I already have an idea of what I want to say, I can polish it in a matter of hours instead of weeks,” he says. “I believe it’s inevitable that it will become an integral part of our toolkit.” Christoph Steinbeck, a cheminformatics researcher at the Friedrich Schiller University in Jena, Germany, has found ChatGPT Plus useful for producing quick summaries of preprints he is reviewing. Because the preprints are already online, confidentiality is not a concern.

One major concern is that authors might lean on ChatGPT to produce reviews with little thought. But naively asking an LLM to review an article is likely to yield little more than a brief summary and copy-editing suggestions, says Mohammad Hosseini, who studies research ethics and integrity at Northwestern University’s Galter Health Sciences Library and Learning Center in Chicago, Illinois.

Researchers uncover shady ChatGPT usage in papers

The sharpest concerns about LLMs and peer review centre on confidentiality. Several publishers, including Elsevier, Taylor & Francis and IOP Publishing, have banned reviewers from uploading manuscripts or portions of text to generative AI platforms, for fear that the work could be fed back into an LLM’s training data, breaching contractual terms to keep the work confidential. In June, the US National Institutes of Health banned the use of ChatGPT and other generative AI tools for producing peer reviews of grants, citing confidentiality concerns. Soon afterwards, the Australian Research Council banned generative AI during grant review for the same reason, after a number of reviews that appeared to have been written by ChatGPT surfaced online.

One way around the confidentiality issue is to use privately hosted LLMs, which guarantee that data are not sent back to the firms that host cloud-based models. Arizona State University in Tempe is experimenting with privately hosted LLMs built on open-source models such as Llama 2 and Falcon. “It’s an easily solved problem,” says Neal Woodbury, chief science and technology officer at the university’s Knowledge Enterprise, who advises university leaders on research initiatives.
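
As a sketch of what private hosting can look like in practice, the snippet below uses the open-source Hugging Face transformers library to run a Llama 2 chat model entirely on local hardware. The model ID and prompt are illustrative, and this is not a description of Arizona State University’s actual setup; the Llama 2 weights are gated and require accepting Meta’s licence before download.

```python
from transformers import pipeline

# Everything runs locally: manuscript text never leaves the institution's
# machines, unlike queries sent to a cloud-hosted chatbot.
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # illustrative open model
    device_map="auto",  # place the model on whatever GPUs/CPU are available
)

prompt = "Summarize the key claims of the following manuscript excerpt:\n..."
result = generator(prompt, max_new_tokens=200, do_sample=False)
print(result[0]["generated_text"])
```

With this setup, confidentiality becomes an infrastructure question rather than a contractual one, which is presumably part of what Woodbury means by calling it an easily solved problem.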

Feetham says such tools could be integrated responsibly into the review systems that publishers already use, if it were clearer how LLMs manage, store and use the information fed into them. “There are real opportunities there if these tools are used appropriately.” Publishers have used machine-learning and natural-language-processing AI tools to assist with peer review for more than half a decade, and generative AI could extend the capabilities of this software. A spokesperson for Wiley says the firm is trialling generative AI to help screen manuscripts, select reviewers and verify authors’ identities.

Concerns about ethics

Some researchers argue that LLMs are too ethically murky to have a place in scientific publishing. A chief issue is the way they work: by trawling Internet content without consent, and with unresolved concerns about bias and copyright, says Iris van Rooij, a cognitive scientist at Radboud University in Nijmegen, the Netherlands. She says that generative AI amounts to “automated plagiarism by design”, because users cannot know where the tools’ output comes from. If researchers thought more carefully about this, she argues, they would not want to use generative AI tools at all.

The true cost of science’s language barrier for non-native English speakers

Some news outlets have blocked ChatGPT’s web crawler from accessing their sites, and media reports suggest that some businesses are considering lawsuits. Scientific publishers have not publicly taken such steps, but Wiley told Nature that it is “closely monitoring industry reports and litigation claiming that generative AI models are harvesting protected material for training purposes while disregarding any existing restrictions on that information”. The firm said it has also urged greater regulatory oversight, including audit and transparency obligations for providers of LLMs.

Hosseini, who is an assistant editor at the journal Accountability in Research, published by Taylor & Francis, suggests that training LLMs on the scientific literature of particular disciplines could improve the accuracy and relevance of their output for scientists. However, none of the publishers contacted by Nature said they were doing this.

Another issue is that if scholars come to rely on LLMs, their communication skills could wither, says Gemma Derrick, who studies research policy and culture at the University of Bristol, UK. Early-career researchers could miss out on developing the skills needed to conduct balanced and fair reviews, she says.

Transformational change

More broadly, generative AI tools could change how research is published and disseminated, says Patrick Mineault, a senior machine-learning scientist at Mila, the Quebec AI Institute in Montreal, Canada. That might mean research being published in formats designed to be read by machines rather than by humans. “There will be a myriad of different ways to publish,” says Mineault.

In the era of LLMs, Eisen pictures a future in which findings are published in an interactive, “paper on demand” format rather than as a static, one-size-fits-all product. In that model, readers could use an AI tool to query a study’s data, experiments and analyses, letting them drill into the aspects most relevant to them and read a summary of the findings pitched at their level. “I believe it’s just a matter of time before we are no longer using single narratives as the interface between people and the results of scientific research,” says Eisen.

Firms such as scite and Elicit have already launched search engines that use LLMs to give researchers natural-language answers to their questions. In August, Elsevier launched a pilot version of its own tool, Scopus AI, to give quick summaries of research topics. In general, these tools use LLMs to rephrase and synthesize the results returned by conventional searches.
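
None of these firms discloses its internals, but the general pattern, often called retrieval-augmented generation, is straightforward: a conventional search supplies the documents, and the LLM is instructed to answer only from them, with citations. Here is a hedged sketch using the OpenAI Python client; the model name, prompt wording and hit format are assumptions for illustration, not scite’s, Elicit’s or Elsevier’s actual pipelines.

```python
from openai import OpenAI  # assumes openai>=1.0 and OPENAI_API_KEY set

client = OpenAI()

def answer_from_search(question: str, hits: list[dict]) -> str:
    # `hits` stands in for results from a conventional search engine;
    # each hit is assumed to carry 'title' and 'abstract' fields.
    context = "\n\n".join(
        f"[{i}] {h['title']}\n{h['abstract']}"
        for i, h in enumerate(hits, start=1)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Answer using only the numbered sources, citing them "
                        "as [n]. If the sources do not answer the question, "
                        "say so."},
            {"role": "user",
             "content": f"Sources:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```

Grounding the model in retrieved documents, and telling it to refuse when they are silent, is the standard defence against the fabricated references that Mineault warns about.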

Mineault says that generative AI tools could also change how researchers conduct reviews and meta-analyses, but only if the tools’ tendency to fabricate references and information is adequately addressed. The most comprehensive human-written review Mineault has seen ran to around 1,600 pages; with generative AI, such syntheses could go much further. “That’s still just a tiny fraction of all the scientific literature,” Mineault says. “The question is how much information there is within today’s scientific literature that could be exploited?”