Thanks to Donna Garner for posting this on the EducationLoop Yahoo group. She
reveals that Texas scorers were not allowed to give out any "A" level
scores, no matter how good the papers were.
This was the same problem with the North Carolina writing project that I
scored for. They would move the scoring definitions up and down day by day,
depending on whether they had too many or too few of each score. The top
score was a 4, but you had to have permission to give out a 4. Published
figures on their website after the fact showed that fewer than 1% were given a 4.
Top 1% is good enough to get into Harvard or Yale, but not good enough
for an "A" on this test. If you said "integrity is a good thing" and
produced a New York Times article that re-worded a press release, but
said nothing new, that was a 1. If you said that "Martin Luther King
had integrity," that was a 2. If you cited a real person like GW Bush, a
character from the novel "Night," and a story about your mother, that
was a 3. If you wrote a Wall Street Journal op-ed piece, that MIGHT be
good enough for a 4, and in this TAKS example, would not be good
enough if no high scores were permitted.
Contrast that to my son's 10th grade writing WASL - 20% got a PERFECT
score, which would mean perfect conventions and perfect content. This
parallels the California CLAS, where there were ZERO "4" math
scores statewide, and the woman in charge told Washington state
legislators that this was deliberate to "give room for improvement".
The entire process is designed to produce low scores at the start and
inflate them to produce high pass rates at the end. Washington
Republicans in my state are still blaming the killing of the WASL on
liberals who don't have standards. We have to stay on message that
"standards" is just another word for "outcomes" and that "outcomes-based education"
was and still is a disaster. It's also how businesses are being run
when you have to rank people and pay them by "performance-based
evaluation" and produce "continuous improvement" or else.
TAKS SCORING BY PEER PRESSURE -- BY DONNA GARNER -- 3.23.09
"TAKS Scoring by Peer Pressure"
by Donna Garner
March 24, 2009
From time to time, I receive anonymous e-mails from people who read my
reports that are posted on the Internet. Last week I received one such
e-mail. This person said he had seen my 4.15.08 article entitled "An
Exposé of the TAKS Tests." (Please see my three attached reports on
Texas' state-mandated TAKS tests.) He said my concerns echoed his own
personal concerns about TAKS scoring because he had worked as an
experienced scorer for Pearson, the company that has the contract with
the Texas Education Agency (TEA) to develop and score students' TAKS
tests. My anonymous source (i.e., "John Doe") confirms my long-held
concerns that having subjectively scored sections on high-stakes tests
is an open invitation for manipulation of scores.
John had been an educator for many years and had decided to work as a
scorer for Pearson on the English / Language Arts (ELA) TAKS scoring
project. He became very uncomfortable about the harsh and inconsistent
manner in which the scoring was done and finally quit the job, as did
many others. John said the scorers were forced to give low scores to
students who demonstrated exemplary writing skills but higher scores
to those students who were less deserving.
He said, "There was what I call an 'unspoken no 3 rule' on the
expository portion of a reading comprehension question [open-ended
response questions]. By unspoken, I mean that we weren't explicitly
told in so many words not to give a 3, but that we should obtain the
express approval of our supervisor before so doing. Whenever a scorer
would request permission to give a 3 on a particular paper, the
supervisors would not give their consent. In due course, many scorers
began to stop giving 3s altogether. I failed to see the logic in
this." [On the ELA-TAKS, Grade 11, Spring 2008 administration, 0% of
students in Texas made 3's on the open-response questions.]
John went on to say that an adaptation of a Readers Digest article
instructed the students to explain why they thought a particular
person was a hero or was not a hero. John said that the prompt along
with scoring materials contained major problems which should have been
resolved prior to reaching the scorers' desks. "The anchor papers
(i.e., papers used as examples) and rubric all contained several
errors and inconsistencies. In some cases, the annotations under the
student-written portion were not illustrative or supportive of the
examples given. Some of the examples provided did not even reflect the
goals stated in our manual."
John told me that during the training session for the job, various
people questioned the Texas representatives about the problems with
the anchor papers. The scorers were told to ignore the problems and
score them anyway. One of these Texas Education Agency representatives
was Victoria Young. "We were told not to rely upon our anchor papers
but to use the rubric more. The problem was that the language used in
the rubric was very general and over-broad. This created too many
loopholes and led scorers to drift, either giving too many high scores
or too many low scores. This undermined the integrity of the scoring
altogether. Anchor papers were far more precise than the rubric and
were essential to accurate scoring, especially on the open-ended questions."
John explained that many of the scorers expressed their concerns to
Pearson management about the inequities, but the managers simply
shifted the blame back to Texas and said they were powerless to do
anything about it. If the scorers did not "go along to get along" at
Pearson, they were considered to be renegades and were treated with
group disapproval. The scorers were heavily criticized if their scores
were too high or too low; and when they tried to explain their
rationale for giving the scores they did, they were treated like
"naughty children who refused to obey."
In addition, a spreadsheet was circulated around the scoring room
periodically so that John and the other scorers could compare their
scores with the rest of the scorers (e.g., to see if they were giving
too many 1's or 2's). If so, they were told they had to change; but
they did not know how to begin because no specific work samples were
provided for reference. In fact, John said the Pearson scoring system
was not based on accuracy but on general agreement and consensus of
opinions. Only about 20% of the papers scored were backread by
supervisors and/or directors. "It's basically a majority-rules system,
where conformity with your fellow scorers is of the essence, and
quality control plays little if any part at all."
One of the most disturbing statements made by John is that last year,
Pearson began a policy whereby bonuses were given to scorers based on
several criteria such as high validity scores, productivity levels
(number of papers scored per hour), and other subjective factors
determined by the supervisor and scoring director. John explained that
validity is one of the criteria used to monitor and gauge a scorer's
consistency in performance and/or application of the rules, but the
problem was there was no way to measure accuracy. He said it was
feasible that a scorer could be consistently wrong but still wind up
with a bonus. If a scorer made a mistake at the beginning of the
project and tried to correct it later on, he would receive a lower
validity score that affected his bonus. One supervisor actually told
one of the scorers that if he wanted a bonus, he should keep on
repeating what he had done at the beginning even if it was wrong!
I continue to say that subjectively assessed sections (including
portfolio assessments, open-response questions, and essays) on
high-stakes tests are ludicrous. People's opinions, peer pressure, and
manipulation take over; and the final scores then become meaningless
as measures of student academic performance. At least 80% of any
high-stakes test should be based upon objectively tested,
right-or-wrong answers. This is only common sense.
Donna Garner wgarner1 at hot.rr.com
Writer/Consultant for MyStudyHall.com
English Success Standards (K-12)