WheelsAtLarge 10 hours ago

This is not surprising. People are going to seek the path of least resistance. I suspect many non-English-speaking academics can now get published under their own names using LLMs as translators. My question is: are LLMs making the papers easier to read? Some authors make it a point to say what they want to say in the least understandable manner. That's been the pattern in the past. Are things changing?

  • unsupp0rted 6 hours ago

    If you’ve ever read a paper by a non-English speaker who didn’t get it properly translated, then, err, LLMs might not make it better but they won’t make it worse.

physicsguy 9 hours ago

I'm no longer in academia, but I was recently asked to review a paper that was similar to one of my final papers, so I accepted.

The paper was written by four Chinese authors who had previously collaborated. It was one of the worst papers I've ever seen. It went over the same ground as previous studies, so in that sense there was nothing new, but more bizarrely it did things like state well-known equations incorrectly (most notably one of Maxwell's equations) and drift off onto total tangents about how one might calculate certain things in an idealised situation that had no relevance to the method the authors used. My assumption is that they generated the methods section with generative AI and didn't go over it with even a cursory check.

But the worst part is that I recommended rejecting it outright, and the journal just sent it on to its slightly less prestigious sister journal rather than doing so.

alganet 11 hours ago

I decided to look at it more closely.

It is massive: lots of papers analyzed, but only the Abstract section of each.

One way you could look at it is: some authors used an LLM to create the abstract from the paper contents. "Hey, there are a lot of new books with AI covers."

One other way is: there could be a correlation between LLM usage in the abstract and LLM usage during the production or writing of the paper. "Hey, I wonder if this book with an AI cover was also written by an AI. It should be investigated."

  • 0xfaded 10 hours ago

    This reminds me of auto-generated git commit messages. I can't fathom that someone would go to the effort of authoring a PR, and then not bother to describe what they did. Unless, of course, they didn't actually go through the effort of authoring the PR, and may not even be fully aware of what's in there. I've stopped giving thorough code reviews to coworkers who can't use code generation responsibly; oftentimes they haven't reviewed the code themselves. Heck, I've been giving my PRs self-reviews since long before AI.

    • alganet 8 hours ago

      As I mentioned, there are two ways of seeing it.

      Maybe there was some kind of pressure for LLM usage in the Abstract section.

      At certain points in my career, I have been pressured to generate commits meeting specific rigid requirements.

      Perhaps we should look into the kinds of pressures that would force or push someone to LLM usage. What do you think?

    • danielbln 9 hours ago

      I don't get this take. If I work through a feature with an LLM and then task it with creating conventional atomic commits to persist the work, then that's no different from generating any other documentation. Same for the PR description. Now, you want to make sure it isn't slop or purple prose and has no emojis, but other than that, commits and PR descriptions are an entirely valid automation target.
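
      To be concrete, the whole automation is a few lines. Here's a minimal Python sketch; ask_llm is a placeholder for whatever model API or CLI you actually use, not a real library call:

          import subprocess

          def ask_llm(prompt: str) -> str:
              # Placeholder: wire this up to whatever model you actually use.
              raise NotImplementedError

          def staged_diff() -> str:
              # The staged changes that the next commit will contain.
              return subprocess.run(
                  ["git", "diff", "--staged"],
                  capture_output=True, text=True, check=True,
              ).stdout

          def draft_commit_message() -> str:
              # Ask for a conventional commit: one "type(scope): summary"
              # line plus a short body, with slop and emojis forbidden.
              prompt = (
                  "Write a conventional commit message for this diff. "
                  "One 'type(scope): summary' line, then a short body. "
                  "No emojis, no filler.\n\n" + staged_diff()
              )
              return ask_llm(prompt)

      You still read the message (and the diff) before committing; the automation target is the typing, not the review.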

      • alganet 8 hours ago

        I will look down on your auto-generated commits, though, and be more critical of your PR if you use them by choice. Just like I would naturally be more inquisitive about the writing origin of a book that has an auto-generated image on the cover.

        Choose your analogy; the idea stays the same. There are two ways of seeing it, and it requires further investigation. Maybe, as I just suggested, read each book or review the entire PR.

        It is definitely a reason to consider slowing down the usage of automation, or at least scrutinizing it more.

orionsbelt 11 hours ago

While I am wary of AI being used to pump out crappy papers with bad science, I will say that many academics can be good in their fields while being quite bad at communicating clearly. I don't think it's a bad thing for a good scientist to use AI to take a genuine and scientifically interesting draft paper and improve the writing so that it's clearer to the reader.

  • troyvit 10 hours ago

    I just think that for the next few years, at least, there should be some sort of disclosure of how a paper used AI.

garylkz 10 hours ago

Curious: how is the use of AI in papers currently being detected?

From the article, I saw that they're using "excess words" as an indicator. Is that a reliable method?

Also, is it possible that it's just autocorrect adding "excess words" while fixing grammar? If so, should that count as "use of AI"?
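
My rough understanding of the "excess words" idea, as a sketch (my own guess at the mechanics, not the authors' code, and the 2x threshold is made up): compare per-word frequencies in abstracts before and after LLMs arrived, and flag the words that jumped.

    from collections import Counter

    def word_freqs(abstracts: list[str]) -> dict[str, float]:
        # Relative frequency of each word across a corpus of abstracts.
        counts = Counter(w for text in abstracts for w in text.lower().split())
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()}

    def excess_words(pre_llm: list[str], post_llm: list[str],
                     ratio: float = 2.0) -> list[str]:
        # Words whose frequency jumped by more than `ratio` after LLMs
        # arrived ("delve", "intricate", ...), sorted by the size of the
        # jump. EPS avoids dividing by zero for words absent from the
        # baseline, though it also means brand-new words always get flagged.
        EPS = 1e-9
        before, after = word_freqs(pre_llm), word_freqs(post_llm)
        flagged = [w for w, f in after.items() if f > ratio * before.get(w, EPS)]
        return sorted(flagged, key=lambda w: after[w] / before.get(w, EPS),
                      reverse=True)

If it's anything like that, it measures vocabulary drift regardless of cause, which I suppose is why I'm asking whether it's reliable.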

  • subscribed 6 hours ago

    The bias against non-native English users is staggering.

    _According to the study, all seven AI detectors unanimously identified 18 of the 91 TOEFL student essays (19%) as AI-generated and a remarkable 89 of the 91 TOEFL essays (97%) were flagged by at least one of the detectors._

    These essays were not written with the use of AI.

    https://hai.stanford.edu/news/ai-detectors-biased-against-no...

    • umbra07 5 hours ago

      This is from 2023. Is there an updated source showing that detectors in 2025 are also inaccurate?

PeterStuer 7 hours ago

From the study:

" while our approach can detect unexpected lexical changes, it cannot separate different causes behind those changes, like multiple emerging topics or multiple emerging writing style changes. For example, our approach cannot distinguish word frequency increase due to direct LLM usage from word frequency increase due to people adopting LLM-preferred words and borrowing them for their own writing. For spoken language, there is emerging evidence for such influence of LLMs on human language usage (32). However, we hypothesize that this effect is much smaller and much slower. "

Since "unexpected" can be left out as there is no "expected" baseline, we are left with "detected lexical changes, cause unknown" followed by a purely speculative and unproven hypothesis.

That of course does not stop them from claiming: "In conclusion, our work showed that the effect of LLM usage on scientific writing is truly unprecedented".

This is slop. Even if you agree with the suspicion of widespread use of LLMs in academic writing (I do, and anecdotally suspect even their upper bound is way underestimated), they did not show such an effect at all. They used a projection on a dataset and hypothesized, but did not investigate a cause.

But hey, let not the absence of information restrain us from pushing the agenda: "Our analysis can inform the necessary debate around LLM policies providing a measurement method for LLM usage". In other words: here is what we did not do, but you can pretend we did anyway by citing us as if we did, because nobody reads anything but the abstract and the conclusion.

As a bonus kicker: "We hope that future work will meticulously delve into tracking LLM usage more accurately and assess which policy changes are crucial to tackle the intricate challenges posed by the rise of LLMs in scientific publishing." I'm sure that sentence alone would get flagged as LLM slop, even by their own sloppy methodology.