A Fresh Look at Self-Plagiarism

Michele S. Garfinkel is Manager of the Science Policy Programme at EMBO, the European Molecular Biology Organization

[The opinions expressed in this article are those of the author and not necessarily of EMBO.]

As contributors to the scholarly literature, researchers are aware of the care that must be taken in describing not only their experimental results but also prior work, both their own and that of others. In the peer-reviewed literature, a critical mechanism for ensuring that the reader has a full understanding of the work under discussion is the proper use of citations.

Plagiarism, specifically the use of others’ words without proper citation, is probably the most recognizable lapse in publication ethics, and certainly fits any scholarly definition of theft. Journal editors now have access to sophisticated software for scanning articles for plagiarized text, and thus such theft can usually be easily recognized. But one type of text re-use that is easy to anticipate yet much more difficult to deal with is self-plagiarism. What happens when words are, at least nominally, our own? Can we re-use them, and how? (See generally Akst, 2010; COPE, ND.)

It is frequently argued that self-plagiarism is definitionally impossible, as one cannot steal from oneself. This may be partially an issue of word choice. Approaching the problem as “recycling” rather than “plagiarism” (Silverman 2012), while helpful from a policy perspective, does not address the immediate concerns. Some argue that the concerns are greatly overblown, and perhaps wrapped up in agendas unrelated to advancing knowledge (Callahan, 2014). Perhaps this is true, but while concerns about plagiarism may be primarily about stealing, in self-plagiarism the real concern would appear to be about misrepresentation. This can be complicated by an obvious fact of modern science: very few research projects are carried out by a single individual, and thus very few papers are single-author. If I am one author on a multi-author paper, could I possibly re-use any of it without crediting all the other authors? This does point toward one of the problems with the idea of self-plagiarism as definitionally impossible: who is the “self” here?

Practically, because the definition of self-plagiarism remains unclear, many researchers are simply unaware of its implications (this seems to be particularly pronounced in the writing of review articles; see, e.g., COPE 2009). In fields such as molecular biology and biochemistry, it is not unreasonable that whole sections of papers (particularly the Materials & Methods, but also the Introductions) could be repeated verbatim with no intent to deceive. If the previous work was mine and this work is mine, then obviously I take responsibility for it. Some journals approach this in a clear way in their instructions to authors: re-use whatever you want, but cite it clearly. Some go a bit further and ask for any replication to be minimized, while the passage is still cited (Culley 2014).

Another way to ask whether self-plagiarism is theft, or otherwise damaging, is to look at it from the perspective of novelty, which is among the most important attributes of published research (though there are alternatives; for example, the open access journal PLoS ONE will, following peer review, publish any research article that meets the journal’s technical and ethical standards irrespective of novelty). Re-using one’s text without citation in essence marks it as novel, and thus credits it an additional time to the author as a contribution to the field. Again, for a biochemical Methods section, this might not be a pressing concern. But the interpretation of data (the Results and Discussion sections) should, in most cases, indicate novelty. It is difficult to reconcile self-plagiarism with that expectation.

Even if researchers wanted to reject the notion that they need to cite themselves properly, ultimately the law (or more specifically, rules around copyright) might have a say in the necessity. The simplest case would be the one where a journal holds copyright over an article. In that situation, extensive re-use of any text beyond that permitted under fair use rules, even one’s own text, would be a copyright violation and subject to any jurisdictional penalties, if the journals cared to enforce it. Even this situation is changing along with other aspects of access to journal articles. Many authors now are choosing to publish under copyrights that allow almost any re-use of articles, as long as the article is cited. It is imaginable that this type of copyright could actually be used to enforce self-citation where it otherwise might not occur. It is important to note that this is an area of particular concern to librarians (Rosenzweig and Schnitzer, 2013).

Researchers and publishers are now also grappling with the concept of self-plagiarism beyond re-use of words. As researchers, we may study the same problem for decades. This may lead to an accumulation not only of text but of data. Interestingly, until recently, the re-use of someone else’s data (and perhaps one’s own) fell into a general category of plagiarism, or at least of questionable conduct of research. Now, in many fields, the publication of open datasets specifically for the purpose of re-using them in novel ways is encouraged, and in fact practice has gone in the exact opposite direction from that for text: rather than insisting on more citation, many journals and funders now insist that data be published with a so-called “CC-0” license, meaning that the data are in the public domain. No accusation of (self-)plagiarism can be attached to the use of such data. To be clear, there is also a somewhat complex relationship between data and figures. The re-use of figures must be properly credited.

For text then, we can certainly understand, and mandate, the need for citation of the original source. For data, many fields are moving toward licensing data in such a way that they are usable by anyone, and frequently without attribution. But for everything else, there are ongoing discussions as to what constitutes outright self-plagiarism, and, more urgently, what policy approaches the community might take to resolve the tensions in different views in order to benefit research. A particularly contentious area is that of “thought plagiarism,” which manifests itself in self-plagiarism as “salami slicing.” The grouping of research data into “smallest publishable units” is done constantly, but also critiqued constantly. A main argument against it is that it is bad science. But that would not necessarily be clear by looking at the literature. The more specific argument is that salami slicing is in fact a kind of self-plagiarism, where the idea, rather than the text necessarily, is repeated. It is clear that many stakeholders (the researchers themselves, and their peers; funders; journal editors; librarians) have very strong views on plagiarizing thought, and on salami slicing particularly. Interestingly, there has been little policy analysis of the benefits and harms of (in the more positive view) “thought sharing,” which would be a useful complement to general discussions of openness in science. If self-plagiarism is a problem to be solved, the low-hanging fruit is for the relevant stakeholders (journal editors and probably funders) to clarify that it is fine to use text (and even in some cases, ideas) from previous papers, but that these must be cited properly.

More generally though, plagiarism, and especially self-plagiarism, is frequently covered only briefly, if at all, in at least some research ethics training programs. There are some sources of information regarding responsible conduct of research, including for defining and preventing plagiarism and self-plagiarism (perhaps the most useful of these being an annotated list of guidelines, Roig, ND). In general, in thinking about training, are these materials clear? Do they address research and publication as they are evolving? More important, who is responsible for conveying this information? Journals (both through their instructions to authors and the review process) frequently become the last chance to catch misunderstandings about self-plagiarism. Is this where the responsibility should be situated? There are many possible approaches to this type of training and oversight, and discussions about them are certainly underway, but as yet with few clearly defined options.

For example, as other types of self-publishing (especially blogs) have become more widespread, it has also become complicated to define what is a “published unit” and thus what constitutes self-plagiarism. Is a blog post more like a pre-print server, or is it more like a record copy of thoughts at that time? This problem, though, has been raised even for that most academic of works, the dissertation (Spinak, 2013). Given the ambiguity and lack of agreement in research communities as to how to treat such re-uses, at least some publishers, rather than retracting, are flagging with corrigenda articles that they later find to have emerged from earlier online posts (see, e.g., EMBO reports 2011). These approaches are neither right nor wrong, but would benefit from more serious analysis. Thus, there is a significant amount of work that could be done in the policy community with scientists, librarians, funders, journals, and lawyers to understand further the full scope of self-plagiarism; to define the real harms (and, in some cases, benefits) of self-plagiarism; and, not least, to understand fully the responsibilities of all parties in the research system with respect to mitigating any of those harms.

References

Akst J. 2010. When is self-plagiarism ok? The Scientist, 9 September (online entry). http://www.the-scientist.com/?articles.view/articleNo/29245/title/When-is-self-plagiarism-ok-/ (accessed 28 November 2014).

Callahan JL. 2014. Creation of a moral panic? Self-plagiarism and the academy. Human Resource Development Review 13: 3-10.

COPE (Committee on Publication Ethics). 2009. Case 09-21: Self plagiarism. http://publicationethics.org/case/self-plagiarism (accessed 27 November 2014).

COPE (Committee on Publication Ethics). ND. Text Recycling Guidelines: for commentary from BioMedCentral editors. http://publicationethics.org/text-recycling-guidelines (accessed 28 November 2014).

Culley TM. 2014. APPS’s stance on self-plagiarism: Just say no. Applications in Plant Sciences 2(7): 1400055 (doi: http://dx.doi.org/10.3732/apps.1400055).

EMBO reports. 2011. Corrigenda: What about ‘information’, and ‘On science and philosophy’, p. 283.

Roig M. ND (text-only site updated 2013). Avoiding plagiarism, self-plagiarism, and other questionable writing practices: A guide to ethical writing. U.S. Department of Health and Human Services, Office of Research Integrity. http://ori.hhs.gov/sites/default/files/plagiarism.pdf (accessed 27 November 2014).

Rosenzweig M, Schnitzer AE. 2013. Self-plagiarism: Perspectives for librarians. C&RL (College & Research Libraries) News 74: 492-494.

Silverman C. 2012. I think we should all stop using the term “self-plagiarism.” Poynter (on-line), http://www.poynter.org/news/mediawire/177959/corbett-i-think-we-should-all-stop-using-the-term-self-plagiarism/ (accessed 28 November 2014).

Spinak E. 2013. Ethical editing practices and the problem of self-plagiarism. SciELO in Perspective. http://blog.scielo.org/en/2013/11/11/ethical-editing-practices-and-the-problem-of-self-plagiarism/#.VH185kvAk0M (accessed 28 November 2014).


This article is part of the Fall 2014 issue of Professional Ethics Report (PER). PER, which has been in publication since 1988, reports on news and events, programs and activities, and resources related to professional ethics issues, with a particular focus on those professions whose members are engaged in scientific research and its applications.
