Much like the world of college essays, the world of scientific journals is often plagued by authors trying to pass off someone else’s work as their own.
For Oklahoma Medical Research Foundation scientist Jonathan Wren, Ph.D., the issue hit home as part of his duties as an editor for the journal Bioinformatics, when a reviewer recognized a paper as having been published in another journal. Today, Wren has transformed that experience into a tool for sniffing out research retreads.
“I got to thinking about the nature of plagiarism and what could be done about it in the context of science,” said Wren, who specializes in bioinformatics and data analysis.
Wren knew that his former mentor, University of Texas Southwestern professor Harold Garner, Ph.D., had created a program called eTBLAST. Garner developed the program to identify other researchers working on similar topics. But, said Wren, “After he developed eTBLAST, he used it to select student papers and make sure they weren’t plagiarized.”
In light of his experience with the journal, Wren suggested a wider reach for the program: putting it to use at scientific journals to detect plagiarism among researchers. Wren saw a natural testing ground, a place where plagiarism could essentially hide in plain sight—Medline, the National Library of Medicine’s database of more than 10 million scientific articles.
“Rather than just providing answers about whether one paper is plagiarized, eTBLAST could put us on the path of asking some more interesting questions about plagiarism,” Wren said. “Who does it? Is it something many do occasionally or a few do frequently? Do people plagiarize more at some point in their careers, such as before tenure reviews or grant deadlines? Is plagiarism on the rise?”
The first results were shocking: In a wide-ranging scan of scientific periodicals, Wren and Garner used eTBLAST and found that more than 70,000 research articles and abstracts bore multiple similarities to other published work. The results of their study appear in the current issue of the journal Nature.
According to Wren, eTBLAST uses a more advanced algorithm than plagiarism detectors used by universities. “It not only looks for blocks of similar text, but it also examines sentence structure, looking at things like associations and proximity with other words,” the OMRF researcher said.
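To make Wren's description concrete, here is a minimal sketch in Python of how a comparison might weigh both shared vocabulary and the proximity of words to one another. It is not eTBLAST's actual algorithm, which the article does not detail. The function names, the `window` parameter and the equal 50/50 weighting are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def nearby_pairs(tokens, window=3):
    """Collect unordered pairs of distinct words that occur within
    `window` positions of each other (an assumed proxy for proximity)."""
    pairs = set()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                pairs.add(frozenset((word, other)))
    return pairs


def jaccard(a, b):
    """Share of items common to both sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, window=3):
    """Blend word overlap with word-proximity overlap.
    The equal weighting is an illustrative assumption, not a tuned value."""
    tokens_a, tokens_b = tokenize(text_a), tokenize(text_b)
    word_score = jaccard(set(tokens_a), set(tokens_b))
    pair_score = jaccard(
        nearby_pairs(tokens_a, window), nearby_pairs(tokens_b, window)
    )
    return 0.5 * word_score + 0.5 * pair_score
```

In this sketch, two abstracts that use the same words in a different arrangement score lower than two abstracts that keep the same words side by side. That is the rough idea behind looking past blocks of matching text.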
As powerful as eTBLAST is, Wren emphasized that it is not a judge and jury. “Any papers the program identifies still need to be examined by human eyes to determine if the plagiarism is real or just a coincidence,” he said.