Archive for January, 2009

Method and Instruction: Treating Teachers As Professionals

January 27, 2009

My interest was sparked during last week’s reading class, which ended with a letter exercise that was part of the Saint-Aubin and Klein (2008) study. My experience with identifying both “the” and various “t” usages in a passage that could only be read once, and that also had to be read for comprehension, was interesting, to say the least. With the basic instructions to read for both identification and comprehension, as soon as the exercise began, my mind went to a place where I perceived myself to be in a competition – to see who could finish the passage most efficiently, although these were not the instructions. Nevertheless, I tried to process for both the letter and comprehension as quickly as I could and was shocked that I identified zero of the “the” words and only two of the “t” words listed on the response sheet we were given after the test. Much of my focus was on finding the t in words where the letter wasn’t readily apparent, like in the middle or at the end of the word. I never thought to consider a written numeral as containing the letter t, because I interpreted the instructions very literally – I was hunting for the letter t, like some adult-run-amok on Sesame Street.

When I checked my answers, I was surprised that not every word containing a t was included in the calculation – and that many of the words containing t that I identified were not considered at all, with the focus being on the “the” words and eight additional “t” words. I wonder whether the students in the study were as aware of their results as I was. How would they interpret the instructions? After reading their study of 180 1st through 5th grade students, as well as the conclusion that good readers had a greater missing-letter effect than poor readers in the first four grades, my results did not seem so surprising. It was also not surprising that the majority of good readers will read over function words, effectively skipping them, because they are common. In fact, my results show zero function word identification and ¼ content word identification (two of the eight “t” words), confirming, I believe, their results.

Though many other studies have been done on the missing-letter effect, as shown in Saint-Aubin and Klein’s research, they obviously thought they needed to do their own study in light of what they perceived to be flaws in previous research that neglected to formally gauge reading ability by using a consistent measurement instrument, in their case the WRAT-3.  I did find it odd that the majority of their students were labeled as poor readers, a fact the authors attribute to their geographical area of study, New Brunswick, Canada.  Would the study replicated in higher SES schools show similar results?  Would those students have a greater missing-letter effect because high SES schools tend to have higher test scores?  Would the study have yielded different results if the students in the study were more varied, not from poor urban or rural areas?

Actually, my reading for this week began with the Juel and Minden-Cupp (2000) study looking at how students learn to read and how teachers teach them. I thought it would be a good place to begin as it seemed to offer a more extensive background, especially given the gap in my own education concerning reading pedagogy (my background is in writing). What caught my attention immediately was their acknowledgement of the complexity of the classroom, a factor that many fail to consider. In our age of standardization, politicians attempt to solve the ills of the education system with blanket reforms. My school district, as a result, has decided that every 9th, 10th, and 11th grade student must take the same year-long classes taught by teachers who must follow pacing guides, at times teaching the same things on the same days, their autonomy and creativity as professionals undermined in many ways. After I read Juel and Minden-Cupp’s background section, which included several informative pages about linguistic units – the ways we can break down how students acquire language skills – the following section on instructional strategies drew me in.

It seems that every education-related article, study, or book I have read in the last several weeks has focused on the importance of the art of teaching. Kozol (2005) talks about the “withitness” factor, which points to teachers as making the biggest difference in student learning, regardless of curriculum or environment. A 33-page study released this month by Gary Orfield, “Reviving the Goal of an Integrated Society,” discusses not only the lack of qualified teachers in many poor schools, but also the fact that the best teachers – the ones most needed in struggling schools – flock to successful suburban schools, where, oftentimes, the pay is higher and the working conditions are better than in their urban counterparts. In a recent piece in The New Yorker, Malcolm Gladwell ponders the difficulty of the process of finding the right teacher for the job, citing Stanford’s Eric Hanushek, who asserts that students who have successful teachers learn three times as much as those with poor ones. Even Bill Gates’ 2009 Annual Letter, released on Monday, addresses the state of education in the United States and calls for researching effective teaching methods, realizing that expert teachers are doing things in their classrooms that are driving student success better than recent efforts at establishing smaller learning communities or standardized curricula.

Somewhere along the way, someone decided that teachers couldn’t be trusted to teach.  I like to call administrative efforts “teacher proofing,” because those in charge try to prepackage learning without regard to the essential role a teacher plays in the classroom.  No wonder so many young teachers are dropping out of the profession before they’ve taught for five years.  Who wants to work hard to earn not only a degree, but also a teaching certificate, only to find that you are not afforded the status of esteemed professional and that your employer does not think you have the mental capacity to actually teach?  When I read the Final Thoughts section of the Juel & Minden-Cupp study, after looking at their study that spanned the classrooms of four different teachers teaching the same concepts very differently at times, and all showing success over time, I was elated to see their observation that effective teaching instruction cannot be pigeonholed.  There is no one good way to teach. Imagine that! They end by agreeing with Duffy and Hoffman (1999) that “improved reading is linked to teachers who use methods thoughtfully, not methods alone” (p. 15).  Well, being in complete agreement with that, I had to pull and print the Duffy and Hoffman piece, which served to reaffirm my belief that yes, indeed, there is an art to teaching well.

In “In pursuit of an illusion: The flawed search for a perfect method,” Duffy and Hoffman maintain that teaching reading successfully, though frequently reduced to a debate over method – the how-to – is actually contingent upon the teacher’s style.  They address prescriptive, pre-packaged curricula – and the pursuit of finding that one magic method of instruction – and how such a mentality is preventing us from really working to improve reading instruction.  While their argument is specifically tied to reading, I argue that we can easily apply it to writing, or any discipline outside of the English Language Arts.  They’re right.  So many researchers are focused on one particular method, and even go so far as to create a competitive environment by researching the effectiveness of one method over another, that they’re losing sight of what really matters – teaching kids successfully.  Their assumption is that those prescribing curricula assume low teacher intelligence and that any warm body should be able to come into a classroom, read from the script, and teach the children. 

But that isn’t the case.  Why not?  Education is political.  Those who dictate its function and structure know the power a good teacher can have.  Usurping that power by prescription and standardization attempts to keep the power in their hands, while they and others continue to blame teachers for the state of education in the country.  It’s a convenient ploy.  When I think about countries in turmoil, some of the first radical responses include attacking the education system.  The Nazis got rid of the teachers and professors.  In some Middle Eastern countries, teachers are murdered, schools are bombed, and girls are prevented from learning.  Why?  Because knowledge is power.  We call our schools equal, yet our inner city schools look nothing like our suburban ones.  Is this equality?  Do we have equal education and opportunity when students in poor schools have a lower success rate in our schools?  Absolutely not. Schools are microcosms of our society – not every strategy works with every student.  Students come to us from so many different backgrounds and experiences that one cannot reasonably expect that they will all learn the same way.  Similarly, teachers are not uniform.  What is important is that we acknowledge this complexity not only in reading acquisition, but in our educational philosophies and practices.

 

References 

Duffy, G.G. & Hoffman, J.V. (1999). In pursuit of an illusion: The flawed search for a perfect method. The Reading Teacher, 53(1), 10-16.  Retrieved January 26, 2009, from Research Library database. (Document ID: 44718847).

Gates, B. (2009).  2009 Annual Letter. The Bill and Melinda Gates Foundation. Retrieved January 26, 2009, from www.gatesfoundation.org. 

Gladwell, M. (2008). Most likely to succeed. The New Yorker, December 5, 2008. 36-42.

Juel, C. & Minden-Cupp, C. (2000). Learning to Read Words:  Linguistic Units and Instructional Strategies. Reading Research Quarterly, 35, 458-492.

Kozol, J. (2005). The Shame of the Nation: The Restoration of Apartheid Schooling in America. New York, NY: Three Rivers Press.

Orfield, G. (2009). Reviving the Goal of an Integrated Society: A 21st Century Challenge. Los Angeles, CA: The Civil Rights Project/Proyecto Derechos Civiles at UCLA.

Saint-Aubin, J. & Klein, R. (2008). The influence of reading skills on the missing-letter effect among elementary school students. Reading Research Quarterly, 43 (2), 148-164.

On Social Justice and Education

January 25, 2009

Every morning, diverse groups of American students – rich and poor, black and white, rural and urban, gay and straight — begin the school day by rising, facing the flag, and pledging allegiance to a country that claims to be indivisible, ensuring liberty and justice for all. Students learn about the core democratic value of equality, which dictates that Americans have the basic right of equal treatment regardless of background, belief, economic status, race, religion, or sex. In addition, they learn about the core democratic value of justice, a fundamental belief that American society offers the same benefits and has the same obligations to all of its citizens. While both of these values teach students that individuals and groups are not favored over other individuals or groups, we need not look further than the very system that champions these tenets of social justice, the American education system, to recognize that disparate inequalities not only exist, but continue to be perpetuated.

In 21st century America, there is an ever-widening gap between the haves and the have-nots, a situation exacerbated by a combination of the deregulation and trickle-down economic policies instituted by the Reagan administration, the anti-worker legislation enacted during the Clinton administration, as well as the continued support of  big business and tax breaks for the rich to the detriment of the middle and lower classes by the Bush II administration. Where America should have been moving toward an enlightened society, its policies have ushered in a New Feudalism where other industrialized nations have instituted much more progressive policies. Social justice in education implies that all students have equal education opportunities. My experience as a classroom teacher tells me this is not the case. In a school that has a distinct socioeconomic disparity, where a contingent arrives at school each morning driving new cars – it’s not uncommon to see a BMW or new SUV in the student lot – and another arrives by bus, students are offered the same curricular and extra-curricular choices, but socioeconomic conditions – students’ access to money and their ability to pay for, among other things, Advanced Placement tests, ACT or SAT exams, private tutoring, pay-to-play sports, field trips, and yearbooks – highlight a very real distinction in opportunity. Very often, it is the low SES students who populate remedial and special education classes, while their counterparts schedule college preparatory classes. To say that every child has the same educational opportunities may be correct on paper, but this is not a reality.

But the inequality that exists in our education system should not surprise anyone.  For years, research has indicated that socioeconomics are a significant factor in educating. Cultural Psychologist Jerome Bruner, in The Culture of Education (1996), explores “the impact of poverty, racism, and alienation on the mental life and growth of [children]” (p. xiii) explaining that “effective education is always in jeopardy either in the culture at large or with constituencies more dedicated to maintaining a status quo than to fostering flexibility” (p. 15). And the status quo has always been driven by political motivations to perpetuate an underclass to support industry.

Throughout the 20th century, when the American economy relied on manufacturing, industry required workers who could perform simple, repetitive tasks requiring little or no education; it was not in manufacturers’ best interest to cultivate an educated work force. In contrast, the 21st century’s shift to an information-based economy necessitates a work force that is highly literate, technologically adept, and able to think critically to problem solve, goals that are not readily achieved by all students given the schools’ rampant inequities in funding, teacher preparedness, school environment, and a host of other issues. While the most successful schools spend more money per student than the least successful, a function of economic and geographical segregation, in the last eight years the federal government decided that it would rather mandate what it deemed to be a solution of higher standards for all students through No Child Left Behind’s unfunded legislation than address systemic issues like poverty and racism, which are at the heart of the problem.

In order to address these issues, we must ask ourselves if we are ready to change the status quo. Bruner points out, “Education is risky, for it fuels the sense of possibility. But a failure to equip minds with the skills for understanding and feeling and acting in the cultural world … risks creating alienation, defiance, and practical incompetence” (pp. 42-43).  

For much too long, the American education system has functioned as a divisive tool where the wealthy received a liberal education to perpetuate the ruling class, while the workers received enough education to make them functional components of industrial productivity. Inherent inequalities in schooling have persisted and have been maintained, because equal education for all Americans is a dangerous proposition that could very well upset the status quo.  Dewey believed that education was crucial to shaping a society, because intelligence, behavior, and knowledge can change (Fishman, 1998).  It is time to facilitate change in our schools to afford equality and justice for all.

 

References

Bruner, J. (1996). The Culture of Education. Cambridge, Massachusetts: Harvard University Press.

Fishman, S. M. (1998). Dewey’s Ideology and His Classroom Critics. In S.M. Fishman, & L. McCarthy, John Dewey and the Challenge of Classroom Practice (pp. 57 – 67). New York: Teachers College Press.

On Reading: Exploring Tensions in Theory and Method in the Culture of NCLB

January 20, 2009

As we approach the dawn of a new administration with hope for a brighter future given the shortcomings of the 2001 No Child Left Behind legislation, the connections across the texts I’ve read this week seem eerily appropriate, as the history of both the teaching of reading (Allington & McGill-Franzen, 2000) and reading research and practice (Alexander & Fox, 2004) is illustrative of how socio-political factors influence both educational ideologies and methods. Though Paris (2005) cautions that such political dictates have resulted in “a greater than ever reliance on scientific evidence to guide educational policies for assessment and instruction” (p. 184), Allington & McGill-Franzen write of the influence of the scientific method on educational practice as early as the turn of the century (p. 6). That influence continued through the mid-century’s cold war battle for superiority, which saw the emergence of behaviorist approaches to reading instruction (Alexander & Fox, pp. 34-37), followed by Chomsky’s linguistics-based language acquisition research in the mid-1960s to early 1970s (Alexander & Fox, pp. 37-38), and schema theory rooted in cognitive psychology from the mid-1970s to mid-1980s, revealing a century-long trend that has had both positive and negative effects on the ways we educate our young people.

Research trends and change in instructional methods over time indicate a constant drive to improve the skills and experiences of students.  There is no doubt that good intentions persist, that educators and researchers have students’ best interests at heart, but, at the same time, one must acknowledge the “enormous political pressures and commercial interests [that have] made learning to read a contentious issue in the United States” (Paris, p. 184).  NCLB’s focus on assessment, according to Paris, is not only changing reading instruction by testing “constrained” areas such as “alphabet knowledge, phonemic awareness, and oral reading skills” (p. 187, pp. 200-201), but is resulting in an instructional shift to the detriment of other methodological foci, “creat[ing] a minimum competency approach to reading assessment that does not adequately assess children’s emerging use and control of literacy” (p. 201).  Current teachers are all too familiar with recent curricular mandates that, more often than not, drive them to teach to the test.  

As instruction becomes more automated, establishing totalitarian classrooms centered on rote memorization and devoid of innovation, there is no doubt that criticism of the educational system will continue. Despite Ron Wolk’s (1998) observation that “education policy is on a ‘collision course with reality’” (qtd. in Allington & McGill-Franzen, p. 22), Allington & McGill-Franzen still argue that “expert teachers” have more influence on the quality and extent of learning than pre-fabricated curricula, citing Linda Darling-Hammond’s remarks from the National Commission on Teaching and America’s Future (p. 24). What is important to remember, however, is that the state of our schools in the 21st century “will depend largely on whether policymakers decide to invest in fostering commitment to reform as opposed to trying to simply mandate it” (Allington & McGill-Franzen, p. 24).

Central to arguments about education and, more specifically, reading instruction, is what W.E.B. DuBois identified as the fundamental “right to learn” and “the right to have examined in our schools not only what we believe, but what we do not believe”  (Darling-Hammond, 1996, p. 5), something that does not happen in schools that teach to the test – and more teaching to the test seems to happen in urban and poor schools who have the most to lose if their test scores do not skyrocket, a result of NCLB’s punitive measures.  In The Shame of the Nation (2005), Jonathan Kozol writes of administrative praise of automaton teachers who exhibit “managerial proficiency” (p. 72) in urban schools where reading teachers are told that “any digression from printed [lesson] plans could cause them problems” (p. 71).  It is as if there is a dichotomous education system in this country where upper class students receive a much different education than others, pointing to the political function of establishing and maintaining very different schools.  Darling-Hammond (1996) writes of “[f]actory model schools with highly developed tracking systems that stressed rote learning and unwavering compliance for the children of the poor … counterposed against small elite schools … that offered a stimulating curriculum, personalized attention, high-quality teaching, and a wealth of intellectual resources for an advantaged few” (p. 6) as representative of a past American system, but that many, like Kozol, argue still exists.  One need look no further than Charles Murray’s Real Education (2008), in which he argues that most Americans do not have the intellectual capacity to succeed in higher education, to understand the political implications of a segment of our society not believing in all students having the possibility to achieve – whether that is in reading, writing, mathematics, or in their collective education.  
And this tension in our society is one that drives the dramatic shifts seen in the last century in the foundation of our theoretical and methodological emphases.  

Charles Murray and Alfie Kohn: Two Visions for Education in America

January 11, 2009

Charles Murray Revisited

About a week ago, the New York Times published seven letters to the editor in response to Charles Murray’s December 27th Op-Ed piece, “Should the Obama Generation Drop Out?” Above all, I was interested to see how America would react to his assertion that a large portion of the population should not be enrolled in our colleges and universities because they either don’t have the intellectual capacity or consider education merely a means to obtain skills in order to find a job, both issues he identifies as contributing to a decline in the caliber of students who are enrolled in his classes.

The main concern expressed by readers is that the changes Murray suggests would result in “institut[ing] a class system in the United States,” invariably restricting degrees to the wealthy and, to a smaller extent, token commoners “who possess[] exceptional intellectual endowments.” Although I agree with the letter writer, one would be remiss not to acknowledge that bachelor’s degrees have become the new high school diploma, and that a glass ceiling remains that serves to restrict the non-wealthy from graduate studies. For example, many graduate programs require a full-time commitment, offering stipends that are little more than a pittance – and definitely not enough to live on. My choices for graduate school were severely limited because I am a single mother who must retain employment as a full-time teacher in order to pay the bills and put food on the table. Am I less intelligent than those who come from wealthy families and are able to commit to full-time status to pursue scholarship? No. Economic realities force me to spread myself thin, take out student loans, and persevere toward my goal, ever slowly, while those of means are able to advance more quickly, unencumbered by financial issues.

Alfie Kohn and Standardized Testing

In a December 18th article in USA Today, Alfie Kohn comments on education reform, providing suggestions for the incoming Obama administration. A well-known critic of standardized testing, Kohn comments in his opposing-view article, “Too much ‘reform’,” on the politicization of the reform movement, and is particularly concerned that the push for more rigor and higher standards has resulted in misguided policies that have helped neither students learn nor teachers teach.

Kohn maintains that “[the] accountability movement turns schools into test-prep factories,” noting that the systemic reforms instituted by the No Child Left Behind legislation have been disastrous for inner city schools and students as their focus has changed from learning to doing well on the standardized tests used to measure their progress – or failure.

Furthermore, he criticizes so-called education reformers “with the sensibility of corporate managers rather than educators” who assume that anything “rigorous” must be better, and who “talk about ‘achievement’ and ‘world-class standards’ when all they mean are higher scores on fill-in-the-bubble exams.” Instead, what NCLB has done in the name of accountability is to “create[] pressure to ratchet up the least valuable forms of instruction.” Teaching test-taking skills and strategies only teaches students to become good test takers, not critical thinkers. Similar arguments appear in an article he wrote for the Nation, “Beware of School ‘Reformers’.”

As we prepare for changes with the incoming Obama administration, and with Obama’s choice of Arne Duncan, a man who served as head of Chicago Public Schools and was responsible for successful early education initiatives, for Secretary of Education, I wonder how we will step up to addressing deficits in our system.

The Past, Present, and Future of Automated Essay Scoring

January 10, 2009

“No sensible decision can be made any longer without taking into account not only the world as it is, but the world as it will be …” – Isaac Asimov (5)

Introduction

Although some realities of the classroom remain constant – they wouldn’t exist without the presence, whether actual or virtual, of students and teachers – the technology age is changing not only the way that we teach, but also how students learn. While the implications of this affect all disciplines, it is acutely evident in the teaching of writing. In the last twenty years, we have seen a rapid change in how we read, write, and process text. Compositionist Carl Whithaus maintains that “… writing is becoming an increasingly multimodal and multimedia activity” (xxvi). It is no surprise, then, that there are currently 100 million blogs in existence worldwide and 171 billion email messages sent daily (Olson 23), and the trend toward digitally-based writing is also moving into the classroom. The typical student today writes “almost exclusively on a computer, typically one equipped with automated tools to help them spell, check grammar, and even choose the right words” (Cavanaugh 10). Furthermore, CCC notes that “[i]ncreasingly, classes and programs in writing require that students compose digitally” (785).

Given the effect of technology on writing and the current culture of high stakes testing ushered in by the mandates of the No Child Left Behind Act of 2001, a seemingly natural product of the combination of the two is computer-based assessment of writing. An idea still in its infancy, the process of technological change in combination with federal testing mandates has resulted in several states incorporating “computer-based testing into their writing assessments, … not only because of students’ widespread familiarity with computers, but also because of the demands of college and the workplace, where word-processing skills are a must” (Cavanaugh 10). Although it makes sense to have students accustomed to composing on computer write in the same mode for high-stakes tests, does it make sense to score their writing by computer as well? This is a controversial question that has both supporters and detractors. Supporters like Stan Jones, Indiana’s Commissioner of Higher Education, believe that computerized essay grading is inevitable (Hurwitz n.p.), while detractors, primarily pedagogues, assert that such assessment defies what we know about writing and its assessment, because “[r]egardless of the medium … all writing is social; accordingly, response to and evaluation of writing are human activities” (CCC 786).

Even so, the reality is that the law requires testing nationwide, and in all probability that mandate is not going to change anytime soon. With NCLB up for revision this year, even politicians like Sen. Edward Kennedy of Massachusetts agree that standards are a good idea and that testing is one way to ensure that they are met. At some point, we need to pull away from all-or-none polarization and create a new paradigm. The sooner we realize that “… computer technology will subsume assessment technology in some way” (Penrod 157), the sooner we will be able to address how we, as teachers of writing, can use technology effectively for assessment. Brian Huot notes that, in the past, teachers’ responses have been reactionary, “cobbled together at the last minute in response to an outside call …” (150). Teachers need to be proactive in addressing “… technological convergence in the composition classroom, [because if we don’t], others can [and] will impose certain technologies on our teaching” (Penrod 156). Instead of passively leaving the development of assessment software solely to programmers, teachers need to be actively involved with the process in order to ensure the application of sound pedagogy in its creation and application.

This essay will argue that automated essay scoring (AES) is an inevitability that provides many more positive possibilities than negative ones. While the research presented here spans K-16 education, this essay will primarily address its application in secondary environments, primarily focusing on high school juniors, a group currently consisting of approximately 4 million students in the United States, because this group represents the targeted population for secondary school high stakes testing in this country (U.S. Census Bureau). It will first present a brief history of AES, then explore the current state of AES, and finally consider the implications of AES for writing instruction and assessment in the future.

A Brief History of Computers and Assessment

Standardized objective testing in writing first occurred in 1916 at the University of Missouri as part of a Carnegie Foundation sponsored study (Savage 284). As the 20th century continued, these tests began to grow in popularity because of their efficiency and perceived reliability, and they are the cornerstone of what Kathleen Blake Yancey describes as the “first wave” of writing assessment (484). To articulate the progression of composition assessment, Yancey identifies three distinct, yet overlapping, waves (483). The first wave, occurring approximately from 1950-1970, primarily focused on using objective (multiple choice) tests to assess writing simply because, as she quotes Michael Williams, they were the best response that could be “… tied to testing theory, to institutional need, to cost, and ultimately to efficiency” (Yancey 489).

During Yancey’s first wave of composition assessment, another wave was forming in the parallel universe of computer software design, where developers began to address the possibilities of not only programming computers to mimic the process of human reading, but “… to emulate the value judgments that human readers make when they read student writing in the context of large scale assessment” (Herrington and Moran 482). Herrington and Moran identify The Analysis of Essays by Computer, a 1968 book by Ellis Page and Dieter Paulus, as one of the first composition studies books to address AES. Their goal was to “evaluate student writing as reliably as human readers, … [and] they attempted to identify computer-measurable text features that would correlate with the kinds of intrinsic features … that are the basis for human judgments …, [settling on] thirty quantifiable features, … [which included] essay length in words, average word length, amount and kind of punctuation, number of common words, and number of spelling errors” (Herrington and Moran 482). In their study, they found a high enough statistical correlation, .71, to support the use of the computer to score student writing. The authors note that the response of the composition community in 1968 to Page and Paulus’s book was one of indignation and uproar.
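
As a rough illustration only, and not a reconstruction of Page and Paulus’s actual program, the kind of surface features quoted above could be counted along these lines in Python. The function names, the tiny common-word list, and the optional dictionary used for spelling are all hypothetical stand-ins; the correlation helper simply shows how agreement between machine and human scores (such as the reported .71) might be measured.

```python
import math
import re
import string

# Hypothetical stand-in for a common-word list; not taken from Page and Paulus.
COMMON_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}


def surface_features(essay, dictionary=None):
    """Count a few computer-measurable text features of an essay."""
    words = re.findall(r"[A-Za-z']+", essay)
    lowered = [w.lower() for w in words]
    n = len(words)
    features = {
        "length_in_words": n,
        "avg_word_length": sum(len(w) for w in words) / n if n else 0.0,
        "punctuation_count": sum(ch in string.punctuation for ch in essay),
        "common_word_count": sum(w in COMMON_WORDS for w in lowered),
    }
    # Spelling is approximated as "word not found in the supplied dictionary".
    if dictionary is not None:
        features["misspelled_count"] = sum(w not in dictionary for w in lowered)
    return features


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return cov / (spread_x * spread_y)
```

Calling `surface_features(essay_text)` returns a small dictionary of counts, and `pearson(machine_scores, human_scores)` compares two lists of scores for the same essays. Nothing in this sketch reads for meaning, which is precisely the point of contention in the debate described here.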

In 2007, not much has changed in terms of the composition community’s position regarding computer-based assessment of student writing. To many, it is an unknown, mystifying Orwellian entity waiting in the shadows for the perfect moment to jump out and usurp teachers’ autonomy in the classroom. Nancy Patterson describes computerized writing assessment as “a horror story that may come sooner than we realize” (56). Furthermore, P.L. Thomas offers the following question and response: “How can a computer determine accuracy, originality, valuable elaboration, empty language, language maturity, and a long list of similar qualities that are central to assessing writing? Computers can’t. WE must ensure that the human element remains the dominant factor in the assessing of student writing” (29). Herrington and Moran make the issue a central one in the teaching of writing and have “… serious concerns about the potential effects of machine reading of student writing on our teaching, on our students’ learning, and therefore on the profession of English” (495). Finally, CCC definitively writes, “We oppose the use of machine-scored writing in the assessment of writing” (789). While the argument against AES is clear here, the responses appear to be based on a lack of understanding of the technology and an unwillingness to change. Instead of taking a reactionary position, it might be more constructive for teachers to assume the inevitability of computerized assessment technology – it is not going away – and to use that assumption as the basis for taking a proactive role in its implementation.

The Current Culture of High-Stakes Testing

At any given time in the United States, there are approximately 16 million 15-18 year-olds, the majority of whom receive a high school education (U.S. Census). Even when factoring in a maximum of 10 percent (1.6 million) who may drop out or otherwise not receive a diploma, there is a significant number of students, 14-15 million, attending high school. The majority of these students are members of the public school system and as such must be tested annually according to NCLB, though the most significant focus group for high-stakes testing is 11th grade students.

Currently in Michigan, 95% of any given public high school’s junior population must sit for the MME (Michigan Merit Exam) in order for the school to qualify for AYP (Adequate Yearly Progress)[1]. Interestingly, those students do not all have to pass at present, though by 2014 the government mandates a 100% passing rate, a number that most admit is an impossibility and that will probably be addressed as the NCLB Act is up for review this year. In the past, the previous 11th grade examination, the MEAP (Michigan Educational Assessment Program), required students to complete an essay response, which was assessed by a variety of people, mostly college students and retired teachers, for a minimal amount of money, usually in the $7.50 – $10.00 per hour range. As a side note, neighboring Ohio sends its writing test to North Carolina to be scored by workers receiving $9.50 per hour (Patterson 57), a wage that fast food employees make in some states. Because of this, it was consistently difficult for the state to assess these writings in a short period of time, causing huge delays in returning exam results to the school districts. This posed a serious problem: schools could not use the testing information to address the educational shortfalls of their students or programs in a timely manner, which is one of the purposes of prompt feedback.

This year (2007), as a result of increased graduation requirements and testing mandates driven by NCLB, the Michigan Department of Education began administering a new examination to 11th graders, the MME, an ACT-fueled assessment, as ACT was awarded the testing contract. The MME comprises several sections and requires most high schools to administer it over a period of 2-3 days. Day one consists of the ACT + Writing, a 3.5 hour test that includes an argumentative essay. Days two and three (depending on district implementation) consist of the ACT WorkKeys, a basic work skills test of math and English; further mathematics testing (to address curricular content not covered by the ACT + Writing); and a social studies test, which incorporates another essay that the state combines with the argumentative essay in the ACT + Writing in order to determine an overall writing score. Miraculously, under the auspices of ACT, students received their ACT + Writing scores in the mail approximately three weeks after testing, unlike the MEAP, where some schools did not receive test scores for six months. In 2005, a MEAP official admitted that the cost of scoring the writing assessment was forcing the state to go another route (Patterson 57), and now it has.

So how is this related to automated essay scoring? My hypothesis is that as states are required to test writing as part of NCLB, there will be too few qualified people to read and assess student essays and determine results within a reasonable amount of time to purposefully inform necessary curricular and instructional change, which is supposed to be the point of testing in the first place. Four million plus essays to evaluate each year (sometimes more if more writing is required, as in Michigan, which requires two essays) on a national level is a huge amount. Michigan Virtual University’s Jamey Fitzpatrick says, “Let’s face it. It’s a very labor-intensive task to sit down and read essays” (Stover n.p.). Furthermore, it only makes sense that instead of states working on their own test management, they will contract state-wide testing to larger testing agencies, as Michigan and Illinois have with ACT, to reduce costs and improve efficiency. Because of the move to contract ACT, my guess is that we are moving in the direction of having all of these writings scored by computer. In email correspondence with me in early 2007, Harry Barfoot of Vantage Learning, a company that creates and markets AES software, said, “Ed Roeber has been to visit us and he is the high stakes assessment guru in Michigan, and who was part of the MEAP 11th grade becoming an ACT test, which [Vantage] will end up being part of under the covers of ACT.” This indicates the inevitability of AES as part of high-stakes testing. In spite of the fact that there are no states that rely on computer assessment of writing yet, “… state education officials are looking at the potential of this technology to limit the need for costly human scorers – and reduce the time needed to grade tests and get them back in the hands of classroom teachers” (Stover n.p.). Because we live in an age where the budget axe frequently cuts funding to public education, it is in the interest of states to save money any way they can, and “[s]tates stand to save millions of dollars by adopting computerized writing assessment” (Patterson 56).

Although AES is not a reality yet, every indication is that we are moving toward it as a solution to the cost and efficiency issues of standardized testing. Herrington and Moran observe that “[p]ressures for common assessments across state public K-12 systems and higher education – both for placement and for proficiency testing – make attractive a machine that promises to assess the writing of large numbers of students in a fast and reliable way” (481). To date, one of the two readers (the other is still human) for the GMAT is e-Rater, an AES software program, and some universities are using Vantage’s WritePlacerPlus software in order to place first year university students (Herrington and Moran 480). However, one of the largest obstacles in bringing AES to K-12 is access. In order for students’ writing to be assessed electronically, it must be entered electronically, meaning that every student will have to compose essays on a computer. Sean Cavanagh’s article of two months ago maintains that ACT has already suggested delivering computers to districts that do not have sufficient technology in order to accommodate technology differences (10). As of last month, March 2007, Indiana is the only state that relies on computer scoring of 11th grade essays for the state-mandated English examination (Stover n.p.) for 80 percent of its 60,000 11th graders (Associated Press), though its Assistant Superintendent for Assessment, Research, and Information, West Bruce, says that the state’s computer software assigns a confidence rating to each essay, where low confidence essays are referred to a human scorer (Stover n.p.). In addition, in 2005 West Virginia began using an AES program to grade 44,000 middle and high school writing samples from the state’s writing assessment (Stover n.p.). At present, only ten percent of states “… currently incorporate computers into their writing assessments, and two more [are] piloting such exams” (Cavanagh 10). As technology becomes more accessible for all public education students, the possibilities for not only computer-based assessment but also AES become very real.
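
The confidence-based referral described for Indiana can be pictured with a short sketch. This is only an illustration under my own assumptions: the `ScoredEssay` record, the `route_essays` function, and the 0.8 threshold are hypothetical and are not taken from any state's or vendor's actual system.

```python
from dataclasses import dataclass


@dataclass
class ScoredEssay:
    """Hypothetical record: one essay with a machine score and a confidence rating."""
    essay_id: str
    machine_score: float
    confidence: float  # assumed to range from 0.0 (no confidence) to 1.0 (full confidence)


# Assumed cutoff for illustration only; a real program would set its own.
CONFIDENCE_THRESHOLD = 0.8


def route_essays(essays, threshold=CONFIDENCE_THRESHOLD):
    """Split essays into those whose machine score stands and those referred to a human."""
    accepted = [e for e in essays if e.confidence >= threshold]
    referred = [e for e in essays if e.confidence < threshold]
    return accepted, referred


if __name__ == "__main__":
    batch = [
        ScoredEssay("essay-001", 4.0, 0.95),
        ScoredEssay("essay-002", 3.0, 0.40),
    ]
    accepted, referred = route_essays(batch)
    print("machine-scored:", [e.essay_id for e in accepted])
    print("sent to human scorer:", [e.essay_id for e in referred])
```

The point of the sketch is that the human reader does not disappear under such a scheme; the threshold simply decides how much of the reading load stays with people.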

Automated Essay Scoring

Weighing the technological possibilities against logistical considerations, however, when might we expect to see full-scale implementation of AES? Semire Dikli, a Ph.D. candidate from Florida State University, writes that “… for practical reasons the transition of large-scale writing assessment from paper to computer delivery will be a gradual one” (2). Similarly, Russell and Haney “… suspect that it will be some years before schools generally … develop the capacity to administer wide-ranging assessments via computer” (16 of 20). The natural extension of this, then, is that AES cannot happen on a large scale until we are able to provide conditions that allow each student to compose essays via computer with Internet access to upload files. At issue as well is the reliability of the company contracted to do the assessing. A March 24, 2007 article by Steven Carter in The Oregonian reports that access issues resulted in the state of Oregon canceling its contract with Vantage and signing a long-term contract with American Institutes for Research, the long-standing company that does NAEP testing. Even though the state tests only reading, science, and math this way (not writing), the episode nevertheless indicates that reliable access is an ongoing issue that must be resolved.

Presently, there are four commercially available AES systems: Project Essay Grade (Measurement, Inc.), Intelligent Essay Assessor (Pearson), IntelliMetric (Vantage), and e-Rater (ETS) (Dikli 5). All of these incorporate the same process in the software, where “First, the developers identify relevant text features that can be extracted by computer (e.g., the similarity of the words used in an essay to the words used in high-scoring essays, the average word length, the frequency of grammatical errors, the number of words in the response). Next, they create a program to extract those features. Third, they combine the extracted features to form a score. And finally, they evaluate the machine scores empirically” (Dikli 5).
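
To make those four steps concrete, here is a minimal sketch of the general approach. It is not how any of the commercial systems actually work: the two features, the weights in `WEIGHTS`, and the function names are assumptions chosen for illustration, and a real system would derive its weights statistically from essays already scored by human readers.

```python
from math import sqrt


def extract_features(essay: str) -> dict[str, float]:
    """Steps 1 and 2: pull simple, computer-measurable features from the text."""
    words = essay.split()
    count = len(words)
    # Strip common punctuation so that "sat." counts as three letters, not four.
    total_letters = sum(len(w.strip(".,;:!?\"'")) for w in words)
    avg_length = total_letters / count if count else 0.0
    return {"word_count": float(count), "avg_word_length": avg_length}


# Hypothetical weights, made up for this sketch.
WEIGHTS = {"word_count": 0.02, "avg_word_length": 0.5}


def machine_score(essay: str) -> float:
    """Step 3: combine the extracted features into a single score (weighted sum)."""
    features = extract_features(essay)
    return sum(WEIGHTS[name] * value for name, value in features.items())


def pearson(xs: list[float], ys: list[float]) -> float:
    """Step 4: measure agreement between machine and human scores (Pearson's r)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # correlation is undefined when one set of scores never varies
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    essays = [
        "The cat sat.",
        "Testing shapes instruction in measurable ways.",
        "Standardized assessment influences how teachers approach writing instruction every year.",
    ]
    human_scores = [1.0, 3.0, 5.0]  # invented human ratings for the demo
    machine_scores = [machine_score(e) for e in essays]
    print(machine_scores, pearson(machine_scores, human_scores))
```

Notice what the sketch rewards: length and long words. That is exactly why the choice and weighting of features matters so much to writing teachers.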

At issue with the programming, however, is that “[t]he weighting of text features derived by an automated scoring system may not be the same as the one that would result from the judgments of writing experts” (Dikli 6). There is still a significant difference between “statistically optimal approaches” to measurement and scientific or educational approaches to measurement, where the aspects of writing that students need to focus on to improve their scores “are not the ones that writing experts most value” (Dikli 6). This is the tension that Diane Penrod addresses in Composition in Convergence, mentioned earlier, in which she recommends that teachers and compositionists become proactive by getting involved in the creation of the software instead of leaving it exclusively to programmers. And this makes sense. Currently, there are 50-60 features of writing that can be extracted from text, but current programs use only about 8-12 of the most predictive features of writing to determine scores (Powers et al. 413). Moreover, Thomas writes that “[c]omposition experts must determine what students learn about writing; if that is left to the programmers and the testing experts, we have failed” (29). If compositionists and teachers can enmesh themselves in the creation of software, working with programmers, then the product would likely be more palatable and better suited to what we know good writing is. While the aura of mystery behind the creation of AES software is of concern to educators, it could be easily addressed by education and involvement. CCC reasons that “… since we can not know the criteria by which the computer scores the writing, we can not know whether particular kinds of bias may have been built into the scoring” (789). It stands to reason, then, that if we take an active role in the development of the software, we will have more control over issues such as bias.

Another point of contention with moving toward computer-based writing and assessment is the concern that high-stakes testing will result in students having a narrow view of good writing, particularly those moving to the college level, where writing skill is expected to be more comprehensive than a prompt-based five-paragraph essay written in 30 minutes. Grand Valley State University’s Nancy Patterson opposes computer scoring of high-stakes testing, saying that no computer can evaluate subtle or creative styles of writing, nor can it judge the quality of an essay’s intellectual content (Stover n.p.). She also writes that “… standardized writing assessment is already having an adverse effect on the teaching of writing, luring many teachers into more formulaic approaches and an over-emphasis on surface features” (Patterson 57). Again, education is key here, specifically teacher education. Yes, we live in a culture of high-stakes testing, and students must be prepared to write successfully for this genre. But test-writing is just that, a genre, and should be taught as such – just not to the detriment of the rest of a writing program – something that the authors of Writing on Demand assert when they write: “We believe it is possible to integrate writing on demand into a plan for teaching based on best practices” (5). AES is not an attack on best practices, but a tool for cost-effective and efficient scoring. Even though Thomas warns against “the demands of standards and high stakes testing” becoming the entire writing program, we still must realize that computers for composition and assessment can have positive results, and “[m]any of the roadblocks to more effective writing instruction – the paper load, the time involved in writing instruction and assessment, the need to address surface features individually – can be lessened by using computer programs” (29).

In addition to pedagogical concerns, skeptics of AES are leery of the companies themselves, particularly the aggressive marketing tactics that teachers perceive as threats not only to their autonomy but also to their jobs. To begin, companies market aggressively because we live in a capitalist society and they are out to make money. But, to cite Penrod, “both computers and assessment are by-products of capitalist thinking applied to education, in that the two reflect speed and efficiency in textual production” (157). This is no different from the first standardized testing experiments by the Carnegie Foundation at the beginning of the 20th century, and it is definitely nothing new. Furthermore, Herrington and Moran admit that “computer power has increased exponentially, text- and content- analysis programs have become more plausible as replacements for human readers, and our administrators are now the targets of heavy marketing from companies that offer to read and evaluate student writing quickly and cheaply” (480). In addition, they see a threat in companies marketing programs that “define the task of reading, evaluating, and responding to student writing not as a complex, demanding, and rewarding aspect of our teaching, but as a ‘burden’ that should be lifted from our shoulders” (480). In response to their first concern, teachers becoming involved in the process of creating assessment software will help to define the task the computers perform. Also, teachers will always read, evaluate, and respond, but probably differently. Not all writing is for high-stakes testing. Secondly, and maybe I’m alone in this (though I think not), I’d love to have the tedious task of assessing student writing lifted from my plate, especially on sunny weekends when I’m stuck inside for most of the daylight hours assessing student work. To be a dedicated writing teacher does not necessarily involve martyrdom, and if some of the tedious work is removed, it can give us more time to actually teach writing. Imagine that!

The Future of Automated Essay Scoring

On March 14th, 2007, an article appeared in Education Week reporting that beginning in 2011, the National Assessment of Educational Progress will test the writing of 8th and 12th grade students by having them compose on computers, a decision unanimously approved as part of its new writing assessment framework. This new assessment will require students to write two 30-minute essays and will evaluate students’ ability to write to persuade, to explain, and to convey experience, tasks typically deemed necessary both in school and in the workplace (Olson 23). Currently, NAEP testing is assessed by AIR (mentioned above), which will no doubt incorporate AES for assessing these writings. In response, Kathleen Blake Yancey, Florida State University professor and president-elect of NCTE, said the framework “[p]rovides for a more rhetorical view of writing, where purpose and audience are at the center of writing tasks,” while also requiring students to write at the keyboard, providing “a direct link to the kind of composing writers do in college and in the workplace, thus bringing assessment in line with lifelong composing practices” (Olson 23). We are on the cusp of a new era.

With the excitement of new possibilities, though, we must remember, as P.L. Thomas reminds us, that while “technology can be a wonderful thing, it has never been and never will be a panacea” (29). At the same time, we must also discard our tendency to avoid change and embrace the overwhelming possibilities of incorporating computers and technology with writing instruction. Thomas also says that “[w]riting teachers need to see the inevitability of computer-assisted writing instruction and assessment as a great opportunity. We should work to see that this influx of technology can help increase the time students spend actually composing in our classrooms and increase the amount of writing students produce” (29). Moreover, we must consider that the methods used to program AES software are not very different from the rubrics that classroom teachers use in holistic scoring, something Penrod identifies as having “numerous subsets and criteria that do indeed divide the students’ work into pieces” (93). I argue that our time is better spent working within the system to ensure that its inevitable changes reflect sound pedagogy, because the trend that we’re seeing is not substantially different from previous ones. The issue is in how we choose to address it. Instead of eschewing change, we should embrace it and make the most of its possibilities.

 

Works Cited

Asimov, Isaac. “My Own View.” The Encyclopedia of Science Fiction. Ed. Robert Holdstock. Octopus Books: London, 1978. 5.

Associated Press. “Computers Grade Student Writing.” Wired News. 8 May 2005. 18 Jan 2007 <http://www.wired.com/news/technology/1,67458-0.html>.

Barfoot, Harry. “Re: Electronic scoring of writing.” Email to Brigitte Knudson. 21 Feb 2007.

Carter, Steven. “State will bring back online testing next school year.” The Oregonian Online. 24 Mar 2007. 27 Mar 2007. http://www.oregonlive.com/.

Cavanagh, Sean. “On Writing Tests, Computers Slowly Making Mark.” Education Week. 14 February 2007: 10.

“CCC Position Statement on Teaching, Learning, and Assessing Writing in Digital Environments.” CCC 55.4 (June 2004): 785-790.

Dikli, Semire. “An Overview of Automated Scoring of Essays.” Journal of Technology, Learning, and Assessment. 5.3 (August 2006). 3 April 2007 http://www.jtla.org/.

Gere, Anne Ruggles, Leila Christenbury, and Kelly Sassi. Writing on Demand: Best Practices and Strategies for Success. Portsmouth, NH: Heinemann, 2005.

Herrington, Anne, and Charles Moran. “What Happens When Machines Read Our Students’ Writing?” College English 63.4 (March 2001): 480-499.

Huot, Brian. (Re)Articulating Writing Assessment for Teaching and Learning. Logan, Utah: Utah State University Press, 2002.

Hurwitz, Sol. “Indiana Essays Being Graded By Computer.” The New York Times. 19 May 2004. 27 February 2007. http://www.nytimes.com/.

Olson, Lynn. “NAEP Writing Exams Going Digital in 2011.” Education Week. 14 March 2007: 23.

Patterson, Nancy. “Computerized Writing Assessment: Technology Gone Wrong.” Voices from the Middle 13.2 (Dec 2005): 56-57.

Penrod, Diane. Composition in Convergence: The Impact of New Media on Writing Assessment. Mahwah, New Jersey: Lawrence Erlbaum Associates, 2005.

Powers, Donald E.; Burstein, Jill C.; Chodorow, Martin S.; Fowles, Mary E.; and Kukich, Karen. “Comparing the Validity of Automated and Human Scoring of Essays.” Journal of Educational Computing Research. 26:4. 2002. 407-425.

Russell, Michael, and Walt Haney. “Testing Writing on Computers: An Experiment Comparing Student Performance on Tests Conducted via Computer and via Paper-and-Pencil.” Education Policy Analysis Archives. Ed. Gene V. Glass. 5.3 (Jan 15, 1997). 5 April 2007 http://epaa.asu.edu/epaa/v5n3/board/petrie.html.

Savage, Howard J. Fruit of an Impulse: Forty-five Years of the Carnegie Foundation. New York: Harcourt, Brace, and Company, 1953.

Stover, Dell. “Computer-grading of essays gaining a foothold.” National School Boards Association. 17 May 2005. 6 March 2007. http://www.nsba.org/.

U.S. Census Bureau. “Annual Estimates of the Population by Sex, Age and Race for the United States: April 1, 2000 to July 1, 2005” (NC-EST2005-04). 10 May 2006. http://www.census.gov/popest/national/asrh/NE-EST2005-asrh.html.

Whithaus, Carl. Teaching and Evaluating Writing in the Age of Computers and High-Stakes Testing. Mahwah, New Jersey: Lawrence Erlbaum Associates, 2005.

Yancey, Kathleen Blake. “Looking Back as We Look Forward: Historicizing Writing Assessment.” CCC 50:3 (Feb 1999): 483-503.

[1] The information about the testing in these two paragraphs is based on my experiences in a Michigan high school, where I served on a testing logistics committee and was responsible for proctoring the MME.

NOTE: Essay Copyright 2008 by Brigitte Knudson.

If you wish to use any parts of it, or have any comments, please email bknudson@wayne.edu.