Wednesday, August 09, 2006

Literate Machines?

Literate Machines? Plagiarism Detection Software

Presented August 3, 2006 (Repeated August 4) at the SIDLIT Conference
KU Edwards Campus, Overland Park, KS

Perhaps one of the most well-known technologies that some use to interact with language is plagiarism detection software. I want to look at the benefits of this software for literacy, its limitations, and the pedagogical and ethical issues that this literacy raises.

BENEFITS

We're all familiar with names such as Turnitin.com, MyDropbox.com, and Eve. Renoir Gather of the University of Michigan is one of several folks who have created a bibliography of these services. Gather's Resources for Instructors: Plagiarism Detection Services includes several types of these "literate machines." The page was last updated in 2004, so the prices may not be current, but the page is still instructive for its information about the range of approaches that these services use.

Individual instructors and institutions can benefit from the publicity attached to these commercial products. I think of this as the "ADT" sign syndrome. The sign itself will deter some thieves. In fact, a statement of use of a high-profile tool such as Turnitin will deter not only student academic thieves, but also critical patrons. Claiming use of a plagiarism detection service is a sort of inoculation in itself.

A second benefit of these commercial services is that they tap into multiple databases. Turnitin, for example, states that its databases consist of three sources: 4.5 billion entries from the Internet, extensive sources from Proquest, and 10 million student papers that had previously been submitted to Turnitin.

A third benefit, and the most compelling one to me, is that this attention to source-use in writing can help students learn to incorporate sources reliably. Nick Carbone, a writing consultant at Bedford/St.Martin's and a vocal opponent of Turnitin's use of student documents, concedes that he thinks this use of plagiarism detection software is a good one: "The strategy of using [plagiarism detection software] as a tool [my emphasis] to make sources visible is a good one...." His reasoning gets to the nexus between this technology and student literacy: "Sometimes students do...lose track of what words are theirs..., especially when they are moving beyond [patchwriting], and they try to find their own voice, and thus use more summary and paraphrasing" (16 May 2006 posting). "Patchwriting," by the way, is a term that is usually credited to Rebecca Moore Howard. This is writing that, while not quoted precisely, is too close to the original because, instead of re-wording, novice writers merely delete some words, alter a few sentence structures, and perhaps substitute words on a piecemeal basis rather than working to understand and re-state the concept. Students in the process of learning how to incorporate external sources often have difficulty managing all the tasks surrounding that activity. Carbone is suggesting that the visual aid that some of these services offer, especially when instructors and writing-center tutors use these screens as teaching opportunities, can help students think about the process of incorporating these sources into their own writing.

LIMITATIONS

While these strengths of plagiarism detection services are valuable, these services also have limitations. Often these are more a factor of the users of these services than of the services themselves.

For example, these services provide reports, not evaluations. Unfortunately, the ill-informed and/or the lackadaisical may interpret raw data as a death sentence.

What is in the report is limited to what is in the databases. Those who rely on these services as their sole detection device may get a false negative. Moreover, many other aspects of student writing aren't distinguishable as original or, to use Turnitin's euphemism, "unoriginal." For example, machines can't distinguish re-use of a writer's own language. I've talked about plagiarism before, and I wouldn't be surprised if some of my phrases from materials that are on the web are very similar to topics and patterns that I'm using here. And speaking of patterns, students learning to write in some disciplines need to follow a standard template for some types of papers. I'm thinking of scientific reports or reviews of literature, for example. I would be concerned that this technology would flag papers that are successfully following conventions. And, of course, these services cannot identify intention--why plagiarism occurred. Plagiarism as a result of deception or laziness seems to be far different from that caused by a serious effort confounded by lack of savvy at managing a documentation system. Neither can these services identify what some call "social plagiarism"--"unoriginal" documents generated originally by Mommies, or friends, or siblings.

And, finally, these services cannot identify original documents written by commercial services. One of these services is ProfEssays.com. As ProfEssays.com proudly proclaims on its gateway page, "Plagiarism, copy/paste or paraphrasing is not tolerated at ProfEssays·com in any form....Each completed custom essay automatically goes through the anti-plagiarism software. After passing this first security level, the essay goes to the Expert Service Department to be manually checked for plagiarism by our Expert Team. As a result of such scrutiny, all custom essays you receive from ProfEssay.com are 100% authentic."

Authentic. In other words, the plagiarizers are using the detection software to establish their credibility. That behavior reminds us that these services are limited to detection. As Kelly Ritter, Associate Professor of English at Southern Connecticut State University, commented, "Isn't this how we commonly talk about plagiarism--as a detection game, aided by software and other trickery--and nothing more?" In the July 28, 2006 posting on WPA-L, Ritter continues, "...[I]f we keep relying on these secondary mechanisms to do after-the-fact work, we should expect that sites like the paper mills will use those mechanisms against us."

ANALOGY

The situation with plagiarism detection systems, then, is somewhat analogous to our use of antibiotics. I think of the two in relation to each other because both have the potential to do good, when in expert hands. But in the wrong--or corrupt, or impatient, or ill-informed--hands, the damage done can be greater than the good. We all know of the damage that has been done to public health by over-prescription, or under-use, or mis-use. What are comparable challenges to literacy prompted by these detection services?

MAJOR ISSUES

I'm concerned about pedagogical issues and even more so about ethical matters that affect students.

My pedagogical concerns relate to both faculty and students. I'm concerned that exasperated and ill-informed faculty will use the reports as evaluations and take action without informing themselves of the limitations of such programs. I'm also concerned that faculty will have a false sense of security from such services, opening the door for wealthy students to purchase original papers at a premium.

I am troubled that these services send the wrong messages to students. In fact, I've seen this happen. A high school student proudly commented that at his school all papers went through Turnitin, so "you just cannot cheat." In other words, he was relying on the technology instead of learning the process himself. At JCCC, an outstanding writer and extremely intelligent student submitted a major research paper without a single quotation. She said that she had been challenged before when she used quoted material, so she "didn't want to take the risk anymore." Two students commented this last semester that they would like to have drawn in more quoted material, but they knew they "shouldn't have more than 10% quoted material." In fact, for their critiques of a piece of literature, they could easily have had much more. I was baffled at how they arrived at the arbitrary 10% number until I heard a high school instructor who was presenting a session on his use of Turnitin in all his classes. He zeroed in on percentages, giving that arbitrary 10% as the maximum acceptable. And most seriously, I'm concerned that the fear of detection will discourage students from testing their skills, taking risks, and learning.

Of even greater concern are ethical issues. I am bothered that detection services assume guilt, and I am even more troubled that that does not seem to bother students. Equally troubling is the fact that Turnitin, and perhaps other services, usurp the copyright of student papers. I'm not sure what the current situation is, but when I contacted Turnitin two years ago, I was told that teachers--not students--could opt out. That is contrary to standards in my field, at least, regarding ethical treatment of students and their texts as stated in the Conference on College Composition and Communication Guidelines for the Ethical Treatment of Students and Student Writing in Composition Studies. And I am doubly troubled in this situation when I realize that those papers are being used to enhance at least one service's capital. As one of my students said after studying Turnitin for a research project, "How different is what they are doing from what they are accusing us of doing? Aren't they taking our papers without permission and using them for their gain?" And aren't we, as instructors and administrators, being complicit if we don't harness these machines to facilitate instruction?

WHAT TO DO

What can we do then?

First, we can develop institution-wide responses to issues surrounding academic integrity. Notice that I'm choosing to address the positive--academic integrity. And when I call for this, I am not encouraging rigid, blanket "no-tolerance" policies toward violations. Rather, there needs to be evidence of institutional commitment and support with departmental autonomy respected. Institutions need to establish principles; departments establish policies; and teachers execute practices. That way, the various discipline-specific issues surrounding instruction and writing genres can be addressed by the experts from the field of study. One of those considerations is curricular: Have students had the prerequisite learning to write researched papers in classes in a particular department?

Second, if a school adopts institution-wide use of a detection service, is there appropriate notification to students of this practice, and do they have a respected, viable alternative if they elect not to participate?

And all throughout the process, cost effectiveness needs to be a consideration. Given that the Turnitin web site indicates that current plagiarism rates are 30-40%, is the money spent going to yield appropriate benefits, or could the money be spent for instruction and support rather than detection and punishment?

Those are a few institutional considerations, but the instruction staff also need support.

Have instructors learned to incorporate writing into their coursework, and to assess student attempts at academic writing? Are they aware of the prerequisite skills required for the writing assignments they give? Do courses list the necessary prerequisites? Have instructors been given time and opportunity to learn to use the plagiarism detection software?

Have the writing-center specialists been made an integral part of the use of this software? Have they been allowed to use it for instruction?

Above all, we can keep our focus on instruction and recognize plagiarism detection services for what they should be--tools, like a spell checker or a grammar checker, that require interpretation and that are but a small component of the writing effort.
