After a long stretch of silence I return with a newsletter that is
going to land in your inboxes with a thud. It results from two really
excellent Brown Bags. In fact, one of the Brown Bags, featuring Cathy
Taylor on the WASL, inspired me to write the longest column I have ever
produced for the AWM Newsletter, duly included here. Despite its being
long, though, I couldn't stand to ignore the other Brown Bag, featuring
John Palmieri and Jack Lee on Writing Courses, because that one was
also a classic. I will start with the latter, albeit briefly.
The issue of teaching students to write mathematics
and science well is hardly a new one, here or elsewhere. A number of
years ago an attempt was made to address it by designating certain
courses to be writing courses. Unfortunately, the designation was
rather a shapeless one, and eventually it became evident that the
writing aspect was having very little impact on students taking the
courses. Recently the university's administration decided that it was
time to admit that the old system was broken and work towards a new
one. Grants were made available to faculty members to try out new
ideas. John Palmieri won such a grant and, joined by Jack Lee, set
about trying out his plan. The plan was based on the theory that no
amount of commentary on a paper was going to penetrate deeply into a
student's understanding unless the student also had to act on that
commentary. Correspondingly, a certain number of problems were
designated "Portfolio Problems". These went back and forth through
several iterations, with the student responsible for creating a new and
improved draft each time. Along with this came a very clear exposition
to the student of what the goal was -- what kind of clarity and
conciseness constitute good writing in mathematics.
It doesn't take much reading between the lines to
register that that format puts a huge burden on the professor. Jack and
John acknowledged that, and had some mechanisms for offsetting it. On
the other hand, both came out with quite positive feelings about the
effort, based on student response and student progress. They are even
planning to offer a workshop on how to run such a class -- there's
conviction for you!
Onward to the following week's Brown Bag, as written up for the AWM:
It was in 1990 that I began to put into action a
plan to expand my love of teaching mathematics into some knowledge
about Mathematics Education as a field. My opening salvo was attending
an MER (Mathematicians and Educational Reform) workshop. Most of the
sessions were fascinating, but one of the optional ones caused me to
give a delicate shudder: Assessment. How could any respectable person
occupy their time with such a grungy topic?
I've come a long way since then. I've even become
intrigued with, and played around with, a number of forms of classroom
assessment, some of them modifications of the classic
sit-down-and-shut-up test, some rather farther into left field.
Simultaneously I have been aware of the assessment effects of the
reauthorization of Title I of the Elementary and Secondary Education
Act, which resulted in almost every state producing its own standards
and assessments, and the cataclysmic impact of No Child Left Behind.
These two have given me a constantly increasing awareness of the
complexity and importance of large scale assessment of the learning and
teaching going on in schools statewide.
Recently the last remnants of that original reaction
were erased, and I came to realize that there are people occupying
themselves with assessment who are not merely respectable but stellar.
Furthermore the rest of us owe them a great debt of gratitude. In the
process of learning that, I also found out a number of details and
connections that had hitherto eluded me. For me, the context is the
state of Washington, but the issues involved are present in all 50
states. My impression (small attack of chauvinism) is that Washington's
procedures were particularly exacting, and the number of people
involved and degree of follow-through were also outstanding. This I
leave to the reader to figure out by checking on his or her home state.
My source of all this information was a pair of
talks by my colleague Catherine Taylor, who is a professor in the
University of Washington's College of Education. Her field of specialty
is Assessment, and she has recently returned to campus after a
three-year stint as adviser to the Office of the Superintendent of Public
Instruction. She spoke first to a bunch of members of the mathematics
department, and then to a bunch of graduate students who have been
working with K-12 teachers. Each group came in armed with many negative
reactions to our state's current test, and in each the mood change was
palpable. As one of my colleagues in the Math Department put it: "I'm a
convert!"
So what was it we learned? It started with some
prehistory: the original Title I Act. It was passed by Congress in the
mid-sixties with the admirable intention of improving the education of
underachieving poor students. Unfortunately it had some fatal flaws,
such as a provision requiring each school to keep improving its
students' test scores while cutting off its funding abruptly once the
scores improved beyond a certain point. Eventually a study by John
Cannell unearthed some dramatic findings – for instance, that test
manipulations were managing to make the average performance in nearly
all states appear to be above the national average – and some
unpleasant consequences of the format. The response to this was a 1994
reauthorization of Title I that mandated that states create their own
academic standards and allowed them to choose or create their own
assessment systems. The Washington legislature then set up a Commission
on Student Learning (CSL) to address the task of producing both the
standards and the assessment system. That's where things start getting
impressive. The CSL didn't simply sit down and start writing. They
assembled committees of educators and community members from throughout
the state, and used their input. From that they produced the Essential
Academic Learning Requirements (EALRs – pronounced as if they had
something to do with long, thin fish). Then they sent the proposed
EALRs out for review by an even larger community and revised them based
on the reviews. The EALRs form a careful, thoughtful set. In
mathematics they strongly reflect the NCTM Standards, with an emphasis
on understanding and using mathematics, and with computational fluency
grounded in an understanding of the operations being performed. They
also include problem-solving, mathematical reasoning, mathematical
communication, and connections as part of the content standards.
And that, with all of its community consultation and
review and multiple re-writings, was the easy part. After it came the
construction and management of the WASL (Washington Assessment of
Student Learning). Catherine gave us a full-page diagram of the steps
and stages of that, and filled in further details. I lost track of the
number of iterations of writing, reviewing, and re-writing that went
into it, but I do know that well over a year of work preceded the first
field test, and that's less than halfway down her page. Then came pilot
testing and a huge job of figuring out the scoring. The test is
criterion-referenced rather than norm-referenced,
which means that instead of being designed to produce a bell-shaped
curve of scores, it aims basically to establish whether students have
reached a level of proficiency appropriate to their grade level. Given
that the EALRs had established that proficiency to include reasoning
and ability to communicate, pure multiple-choice testing was clearly
out of the question. There are some multiple-choice items (I liked
Catherine's example of "Which of the following pieces of information do
you have to have in order to solve the problem you just read?"), but
also short-answer questions, where the answer must include some form of
justification, in words, pictures, graphs, diagrams or whatever else
the student chooses to use, and extended-response questions that open
out in many directions. Next a consistent scoring system was
established, then data for items were analyzed to select those that
would be used on future tests.
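To make the criterion-referenced/norm-referenced distinction concrete, here is a minimal sketch in Python -- entirely my own illustration, with invented names, scores, and cut score, and emphatically not the WASL's actual scoring machinery. A norm-referenced report places each student relative to the group; a criterion-referenced report places each student relative to a fixed bar.

```python
# Illustrative sketch only: how a criterion-referenced report differs from a
# norm-referenced one. Names, scores, and the cut score are invented; this is
# not the WASL's actual scoring procedure.

def norm_referenced_report(scores):
    """Rank each student against the group (a rough percentile rank)."""
    all_scores = list(scores.values())
    return {
        name: 100.0 * sum(s < score for s in all_scores) / len(all_scores)
        for name, score in scores.items()
    }

def criterion_referenced_report(scores, cut_score):
    """Compare each student to a fixed proficiency bar, ignoring the group."""
    return {
        name: "proficient" if score >= cut_score else "not yet proficient"
        for name, score in scores.items()
    }

scores = {"Ana": 72, "Ben": 55, "Chris": 88, "Dana": 61}

print(norm_referenced_report(scores))           # positions relative to peers
print(criterion_referenced_report(scores, 65))  # positions relative to a fixed bar
```

Notice that in the first report a student's label shifts whenever the rest of the group changes, while in the second it depends only on the student and the bar -- which is exactly why setting the bar (as described next) mattered so much.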
After the test was administered for the first time, a collection
of people closely in touch with children of the relevant age took the
test themselves and estimated where they would put the bar. Parents and
teachers put it high, administrators put it low. Information about what
percentage of the students who took the pilot test would be rated
proficient given each of the bars was eventually released into the
conversation, after which a suitable compromise was reached.
Meanwhile the test items were field-tested in a
large number of school districts and then examined by experts
(including Catherine) for all manner of biases. The check for cultural
bias ran beyond academic expertise – folks from OSPI held fora
within various ethnic communities and learned yet more. For instance, a
Native American elder pointed out that children of his nation would not
get as far as the mathematics of a problem based on a survey, because
it is not in their culture to ask questions of a stranger. Catherine
ran multitudinous statistical tests for bias and found, for instance,
that on the short-answer questions girls and minorities were at a
slight advantage, and on the multiple-choice questions boys and whites
were at a slight advantage, but the advantages balanced out.
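Catherine did not walk us through her actual statistical machinery, but the flavor of an item-level comparison can be conveyed by a toy sketch like the one below. The data and the crude gap statistic are mine alone; operational analyses use proper differential item functioning (DIF) methods rather than raw gaps.

```python
# Toy sketch of an item-level bias check: for each test item, compare the
# success rates of two groups of students. The counts are invented and the
# simple "gap" statistic is only illustrative; real analyses use differential
# item functioning (DIF) procedures, not raw percentage differences.

responses = {
    # item_id: {group: (number correct, number attempted)}
    "item_1": {"group_a": (412, 520), "group_b": (398, 515)},
    "item_2": {"group_a": (301, 520), "group_b": (355, 515)},
}

def success_rate(correct, attempted):
    return correct / attempted

for item, by_group in responses.items():
    rate_a = success_rate(*by_group["group_a"])
    rate_b = success_rate(*by_group["group_b"])
    gap = rate_a - rate_b
    flag = "review" if abs(gap) > 0.05 else "ok"  # arbitrary 5-point threshold
    print(f"{item}: group_a {rate_a:.2f}, group_b {rate_b:.2f}, "
          f"gap {gap:+.2f} -> {flag}")
```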
The writing assessment specialist at Washington's OSPI worked with the
scoring contractor to set up a
rigorous training system for scoring the tests, which is done by
teachers hired for the purpose. Tests run on the resulting scores
indicate an extremely high rate of consistency in grading. In short,
this is a really classy assessment.
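I do not know exactly which consistency statistics the scoring contractor reported, but the flavor of a rater-agreement check can be conveyed by something as simple as the following sketch, which computes an exact-agreement rate and Cohen's kappa for two raters on invented scores.

```python
# Minimal sketch of an inter-rater consistency check on hand-scored items.
# The scores are invented, and this is not the contractor's actual analysis;
# it simply computes exact agreement and Cohen's kappa for two raters.

from collections import Counter

rater_1 = [2, 3, 1, 4, 2, 3, 3, 0, 2, 4]   # scores from one trained rater
rater_2 = [2, 3, 1, 4, 2, 2, 3, 0, 2, 4]   # scores from a second rater

n = len(rater_1)
observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n

# Agreement expected by chance if the two raters scored independently,
# each keeping their own distribution of scores.
counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
expected = sum(counts_1[s] * counts_2[s] for s in counts_1) / (n * n)

kappa = (observed - expected) / (1 - expected)
print(f"exact agreement: {observed:.2f}, Cohen's kappa: {kappa:.2f}")
```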
Then we get to the issue of public reaction. That's
where the egg hits the fan. Partly, of course, that's because on any
issue the noise tends to come from the negative. Beyond that, though,
are some deeper issues. A fundamental one is sheer unfamiliarity.
Teachers inevitably teach to the test – the system pretty much
demands it – and for many years most tests have been geared to
speedy production of calculations. The WASL is designed to change the
whole slant of the assessment, and not only is that disorienting, but
it is also very demanding of teachers. On the other hand, to whatever
extent teaching to the test changes the slant of the teaching towards
achieving the EALRs, the pain is offset by some genuine benefits. Less
easy to offset is the incredible pressure put on schools, and thence
teachers, by the high stakes introduced by the No Child Left Behind
Act. Of course teachers shouldn't pass the stress along to their
students, but it's very hard not to. And what parent likes to see a
child quaking at the thought of a test?
With all this information rattling around in my
head, I've been pondering what we as mathematicians, Washingtonian or
not, can do. So far all I have been able to come up with is "Find out
more". I propose this not simply as an intellectual exercise, but so
that we who might actually be listened to have an answer to questions
like "My fourth grade son hasn't learned long division yet and I
learned it in third – doesn't that mean he is getting less math?"
[Answer: not if the time that would have been spent on the mechanics of
division goes into its conceptual underpinnings] or "They used to put
addition of fractions with unlike denominators on the fourth grade test
and this one doesn't have anything nearly that advanced – isn't
that a dumbing down?" [Answer: a norm-based test is designed to spread
scores out along a curve, so it puts in questions that are way above
and way below expectations in order to make distinctions among students
who are far away from the norm.]
And, of course, my recurrent response to educational
issues: stand by to support K-12 teachers in any way you can –
they are a beleaguered population if ever there was one!
Addendum: Catherine very kindly proof-read my original draft of this
column and corrected the more egregious of my errors. She then produced
a comment on my final paragraph that I liked so much that I shall now
reproduce it, thus converting hers to the column's concluding paragraph:
Mathematicians should look at what is in the tests because what is
tested is what will be taught. If they think that kids should learn to
think mathematically or attack ill structured problems with some
confidence, be able to apply math concepts and procedures in real world
situations, graph, diagram, etc., then they should be looking to see if
that is what is being 'valued' on their state's test. What is tested
tells kids what is valued AND what is tested tells them what it means
to be a mathematician or to use mathematics. That's why our culture is
so math phobic - we have a very skewed idea about what it means to DO
mathematics.