Wednesday, December 24, 2014

The civilized veneer of Florida's school accountability system.

One hypothesis is that Jeb Bush's A-F system is a simple and easy to understand method with easy to interpret results.  But this is Florida mixed in with some educrats, sooo....

The system isn't as complicated as I thought (see for yourself here), but there are some oddities.  For example, most people probably think "90+%" when they hear "A."  For Florida, it's 70% (1120/1600) or better.  So a large block for "A" schools.  It gets weirder:  a B has a range of 65%-70% (1040-1119).  A 5% range for B's? Here's the whole shebang:

 (All scores out of 1600)
A:  1120+  (70%, range of 30%)
B:  1040-1119 (65%-70%, range of 5%)
C:  880-1039 (55%-65%, range of 10%)
D:  800-879 (50%-55%, range of 5%)
F:  <800 (below 50%, range of 50%)
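If you want to check the band widths yourself, here is a minimal Python sketch.  It assumes nothing beyond the cutoffs listed above; the function and variable names are mine for illustration, not anything from the state's actual formula.

```python
# Cutoffs copied from the list above (points out of 1600).
# The names here are illustrative, not taken from Florida's formula.
MAX_POINTS = 1600
CUTOFFS = [("A", 1120), ("B", 1040), ("C", 880), ("D", 800), ("F", 0)]


def letter_grade(points):
    """Return the letter grade for a school's point total."""
    for letter, floor in CUTOFFS:
        if points >= floor:
            return letter
    raise ValueError("points must be non-negative")


def band_widths():
    """Width of each grade band as a percentage of the 1600-point scale."""
    widths = {}
    ceiling = MAX_POINTS
    for letter, floor in CUTOFFS:
        widths[letter] = 100 * (ceiling - floor) / MAX_POINTS
        ceiling = floor  # the next band ends where this one starts
    return widths


if __name__ == "__main__":
    for letter, width in band_widths().items():
        print(f"{letter}: {width:.0f}% of the scale")
```

The A band comes out six times as wide as the B band, which is the oddity in a nutshell.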

It seems like they're trying to use a standard curve (1600 a la the SAT) and the traditional A-F, but not surprisingly, failing at both.  But this isn't even the worst part--it's how they arbitrarily shift the numbers they use in the calculations around.  For example, a week ago:

"Education Commissioner Pam Stewart said a key reason for the drop in A-rated schools was that the grading formula was changed to make it more difficult to earn a top grade."

Now rewind two years:

"State education officials panicked, and at an emergency meeting last week, the Florida Board of Education decided in a 4-3 vote that the best thing to do was to lower the passing score on this exam.
Let me repeat that: In order to make sure that students succeeded on the test, the passing grade was lowered."

So, when it comes down to it, all the math, all the calculations mean nothing--the numbers are fudged to whatever looks or feels "right" to the powers that be.  Once again, the math is just there to provide a false sense of credibility.

Friday, November 14, 2014

Could grade inflation have "helped" lead us to this testing mania?

 *  Note:  I changed the title to be more clear.  There is no one simple reason for how we got where we are.

While I can't stand the amount of time, tax dollars, and opportunity costs of all these standardized tests, there is one thing that has always gnawed at me--they are, sadly, somewhat relevant.  Granted, they are way overdone, but as I have told my students in the past, with grade inflation, some--nowhere near as many as what we have*--standardized tests are needed.  This article from Angelina Massoia puts it quite elegantly:

"By submitting to the culture of grade inflation, we empower the standardized test to 'accurately' represent us."

Part of the reason I believe that it's going to be hard to stop this testing madness is that to some small degree, the field of education itself is to blame.  The article above notes that "about 43 percent" of all grades given in college are now an A.  More importantly, it's far worse in education (image source):

This is hardly a surprise, I would hope; I'm sure there's plenty of other, older research (such as this, indicating "Education majors enjoyed grade point averages that were .5 to .8 grade points higher than students in the other college majors.").

Again, I am in no way advocating the testing mania.  What I do believe/wonder is if the "reformers" and other test advocates may have been able to get their grip on our system due to grade inflation and if the Colleges of Education don't share some blame* (whoops, sorry again--I meant "accountability") for this.  This article (again from Huffington) goes over the full study above in more detail.  I take issue with it in that it should have made the distinction between "easy" and "academically rigorous."  I can't say whether they are easy, having not been an ed. major but I am finishing up my 5th ed. course in <1 year.  From this experience--and from what colleagues have told me of their own--I can say education majors are far from academically rigorous.  (I dare say that even my regular chemistry class is far more academically challenging than they have been.  Given the past 10+ years of our leaders' use of "rigor" as a buzzword, I find it rather hypocritical that they probably wouldn't recognize it if they saw it.  Or be able to handle it.  But those are subjects for another blog.)

*  And let's be honest, this is being done to blame teachers and schools.  Oh, wait, I'm sorry, I meant "Hold them accountable [if things go bad, otherwise, the reformers will take the credit]."

Tuesday, November 11, 2014

Opting Out May be Education's Only Option

Hopefully the opt-out movement is (finally...) reaching a critical mass.  Large-scale (200 students) walkouts in Colorado and a recent article in the New York Times highlighting the grotesquely wasteful (ab)use of standardized tests are starting to make waves.  (I would have liked to see the NYT article go into more detail on the opportunity costs--60 to 80 days spent on testing per year!?!--of these tests, as I attempted to here.)  I see opting-out as the only real solution at this point, since both Republicans and Democrats have proven themselves to be utterly worthless (see:  No Child Left Behind, Race to the Top).

1.  Opting out devastates the school's ranking, especially if the good kids opt out.  (I hope/assume that this was encouraged by these high-achieving students and/or their parents.)  This in turn pressures the school board to do something--preferably/hopefully dropping these asinine ranking systems and in turn, their tests.

2.  The more students opt out, the easier it is to see that the data being collected is not good data.  Hopefully people start to realize they're spending billions of dollars collecting data that's somewhere between dubious and worthless, again pressuring school boards--already strapped for cash--to jettison these tests.

Monday, September 15, 2014

Newsweek and "Cause and Effect"

Oh, Newsweek and your "Best High Schools" list...if only you'd point out the obvious:  The "best" schools are those with the best students.

I imagine that "the list" has become something of a cash cow for Newsweek; there's simply no way they can't realize the list is of no value.

Numbers for Household income from Wikipedia pages.  Chicago was split into several sections so I wasn't sure which MHI to use.  (Not that it matters since the school is a selective admission, as are most, I assume; I didn't bother to check the ones that were well above the US average MHI.)

Monday, August 25, 2014

VAM for Kindergartners? I say VAM for legislators.

The insanity of "big data" has finally hit the mainstream media, as the Washington Post notes that in Florida, kindergartners are technically-sorta-maybe required to take end of course (EOC) exams.  That's right--six-year-olds.  Already the kindergartners politicians are working to cover up their mistake by claiming the law was misinterpreted.  That's much better--because everyone knows you should wait until kids are seven years old before they start taking final exams/EOCs.  Eight at the latest.  (/sarcasm)  I mean, this is common sense here--you can't put people's livelihoods in the hands of kids.  Especially when teacher effects are just a small part of the picture.  (The numbers appear to vary widely--which makes sense, given difficulties in measuring--but a quick search shows non-school effects range from ~67% to as much as 89% of learning/student success.)

So here is my solution to the problem:  VAMs for our legislators.  If they don't improve our education system, they will be ineligible to run for at least one election cycle.  NO exceptions.  A few key inputs for the model (please feel free to add/modify):
1.  The laws must be shown to have improved education.
2.  The laws must be shown to be cost-effective.
3.  Teacher morale will be factored in.
4.  Graduation rates must improve. This includes elementary and middle school.  (This too will require a cost-benefit analysis.)
5a.  Flexibility in education is key, so how quickly bills can be passed will be taken into account.
5b.  How much support the bill gets in the legislature will also be factored into the VAM.

Who's willing to bet that the instant the politicians realize they can be fired because of things nearly completely out of their control they suddenly have a very different opinion of VAM?  Anyone?  Anyone?

Update:  Gov. Rick Scott calls for review of Florida's standardized tests.  Combined with Arne "Duh" Duncan's latest, there's hope yet...better several-years-late than never to realize that testing mania has gotten out of hand.

Tuesday, April 22, 2014

Was proctoring the FCAT worth 50+ million of your tax dollars every year?

It currently costs $12,700 a year to educate a student in US public schools.  That works out to about $70/day.  Here in Florida, that number shrinks to $8,887/student, or about $49.40/day (let's just call it $50/day).  Today's FCAT glitch probably just cost a half-day for each student; "thousands" of students were affected.  That's ~$25/student/test that was probably completely wasted.  So are we talking $50,000 of tax dollars down the drain?  $100,000 (4,000 students)?

Is all of this time spent collecting data (often of dubious quality, to be polite...) a waste, with every half-day of testing amounting to $25/student thrown away?  (And I'm guessing between VAM tests, progress monitoring tests, and FCAT, the average Florida student spends close to a full week of school testing.)   The total opportunity cost of lost class time and proctoring these standardized tests must be staggering.

Or, put another way--is proctoring the FCAT worth $50 million?*  Oh, and don't forget Pearson's contract with the state--that costs you another ~$51 million a year.  I have a hard time believing this is worth it; it's probably enough to run a small school district for a year. 

*  There are 2,587,000 students in Florida's public schools.  If that distributes evenly amongst grades K-12 (13 grades), that works out to 199,000 students/grade.  It looks like nearly every grade above 2nd tests (at least once), so that's 10 grades, or ~2,000,000 students tested every year.  So two million tests multiplied by $25/student/test yields ~$50,000,000.  (This is admittedly a pretty rough estimate with some mutant statistics.)
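For anyone who wants to poke at the rough estimate above, here is a quick Python sketch.  The 180-day school year is my assumption (the post doesn't state it, but it's what makes $8,887/year come out to roughly $49/day), and all the names are made up for illustration.

```python
# Figures quoted in the post; names are illustrative only.
SCHOOL_DAYS_PER_YEAR = 180     # ASSUMPTION: not stated above
COST_PER_STUDENT_YEAR = 8_887  # Florida per-student spending
TOTAL_STUDENTS = 2_587_000     # Florida public school enrollment
GRADES_K12 = 13                # K plus grades 1-12
GRADES_TESTED = 10             # roughly every grade above 2nd


def cost_per_day():
    """Per-student cost of one school day under the 180-day assumption."""
    return COST_PER_STUDENT_YEAR / SCHOOL_DAYS_PER_YEAR


def proctoring_cost(half_day_cost=25.0):
    """Rough yearly cost if every tested student loses one half-day."""
    students_per_grade = TOTAL_STUDENTS / GRADES_K12
    students_tested = students_per_grade * GRADES_TESTED
    return students_tested * half_day_cost


if __name__ == "__main__":
    print(f"Cost per student-day: ${cost_per_day():.2f}")
    print(f"Estimated proctoring cost: ${proctoring_cost():,.0f}")
```

Change `half_day_cost` or `GRADES_TESTED` to see how sensitive the total is; it stays in the tens of millions under any reasonable inputs.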

Wednesday, November 20, 2013

"I'm not even going to read the question."

...So said one of my students, prior to my handing out the wonderful test for my VAM score today.  And while this student wound up actually reading (at least) some of the questions, others wanted to start filling out the bubble-sheets before they even got the test.  Others were done with the 40-problem test within ~5-10 minutes. So please tell me, highly-paid "educational experts":  why is it a good idea to judge me on these test scores?

How many millions of dollars are being spent collecting junk data for a statistically flawed analysis that is shown to be inappropriate and has essentially no basis as far as effectiveness is concerned?  It's stupid bogus sophistry (BS) like this that makes me think the schools probably would have enough money if the people at the top (Federal, State, District) knew what they were doing.

Wednesday, May 29, 2013

Visible Learning, Invisible Evidence

So I'm "done" with (returning) Hattie's Visible Learning tomorrow.  I read over the first two chapters; didn't really focus on the actual "meat" of the book as I don't think the numbers mean squat.  They are at best extremely unreliable; I'd love to see someone try to test some of these numbers.  (i.e., focus on one strategy, test it repeatedly, and see if the results come back anywhere near the average Hattie presents.  Or even take a few [large] random samples of older research and see if the same number comes back up.)

A few of my questions/comments/concerns:

1.  If these effect sizes are accurate, why can a teacher not focus on 2-3 things and thus be more-or-less a "great" teacher?  If these evaluations' [e.g., Marzano] checklists aren't checklists as claimed, that is, "It's stuff [we're] already doing in class"...well...with all these great effects, why isn't virtually every teacher great?  I see three possibilities (not mutually exclusive):
i.    Virtually every teacher is not doing them (and there are a LOT of them) enough.
ii.   Virtually every teacher sucks at virtually every one of them.
iii.  The numbers suck.

(Technically I can think of a fourth but I excluded it; there is the--illogical--possibility that the numbers are somehow not cumulative.  But if that's the case, it destroys the whole argument for implementing these strategies.)

2.  Hattie states that a d=0.4+ is the "zone of desired effects".  Yet he also states, "Further, there are many examples that show small effects may be important" and goes on to mention a study with a d = 0.07 wherein "34 out of every 1,000 people would be saved from a heart attack if they used low dose aspirin on a regular basis".  Well, if it affects 34 out of 1,000 people, it would save 1.9 million out of ~55 million.  I use this latter number because that's how many K-12 students there are in the US.  Obviously this wouldn't be as significant as a life-or-death situation, but if it's going to help (rather than save) that many kids, is it worth looking into?  To quote Hattie, "This sounds worth it to me." (pg 9)  Hattie's "hinge point" seems purely arbitrary.  This also highlights the difference between the (pseudo)scientific approach of meta-analysis in the medical field and education, which leads to...

3.  Applying a scientific approach to unscientific data results in unscientific results.  And seeing as how this whole book strikes me as just yet another attempt to latch onto science's credibility (something educational research, generally speaking, does not have), that's a big deal.  In fact, there's something absurd about even having to discuss whether the quality of the data matters (pg 11).  Case in point:  He cites Torgerson et al. (2004), who used 29 out of 4,555 potential studies on a subject area.  These were chosen as "quality" (Torgerson's definition) studies because they used randomized controlled trials.  That helps improve the quality of your data, alright, but...what about the other 4,526?  99.4% of the research didn't use random trials?  The best education can typically do (not faulting education, it's just the nature of the beast) is "quasi-experimental" studies.

4.  Another problem with the data that puts a big red flag on all these numbers (again, GIGO):  There are no real (scientific) controls in educational research.  A control is a "yes/no" situation; Group A gets the experimental treatment (e.g., a drug) and Group B does not (e.g., a placebo). Obviously you can't do this without doing something tantamount to child abuse (i.e., standing there and doing absolutely nothing)...but frighteningly, that's the only meaningful "control" there could be.  (And that's one reason why education data will never be scientific in nature.)

5.  Barring a strictly regimented routine (one that could probably be automated via presentation software), it's highly unlikely two teachers using the same "technique" will apply it identically.  (And the same goes for the "controls" above; what teachers replace the experimental technique with will differ, rendering comparisons dicey.)  This leads to another "apples and oranges" scenario for meta-analysis (albeit admittedly a relatively weak one).

6.  More apples and oranges:  One technique may be effective at one grade level but not another.  I have no problem accepting that having a learning goal may help first graders.  They may need the focal point and they are (I believe...) general subjects/topics.  I have a hard time accepting that having written on the board "student is going to factor trinomials" is going to have a significant impact on seniors in algebra.  (Anecdote:  My students have repeatedly mocked/made derisive comments when they see me changing the learning goals.  For example:  "You know we never look at those, right?"  "Yes, I know, it's just something I have to do."  Very empowering, let me tell you...)  Mushing multiple grades together into one statistic is just a bad idea.  Ditto for different subjects (at higher grades).
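For the record, the back-of-the-envelope numbers in points 2 and 3 above can be checked with a few lines of Python.  This is only a sketch of the arithmetic; it uses the figures already quoted (34 per 1,000, ~55 million students, 29 of 4,555 studies), and the function names are mine.

```python
def scaled_count(rate_per_1000, population):
    """Scale a per-1,000 rate up to a whole population (integer math)."""
    return rate_per_1000 * population // 1000


def share_excluded(total, kept):
    """Percentage of items left out when only `kept` of `total` are used."""
    return 100 * (total - kept) / total


if __name__ == "__main__":
    # Point 2: the aspirin rate applied to ~55 million K-12 students.
    helped = scaled_count(34, 55_000_000)
    print(f"Students helped at 34 per 1,000: {helped:,}")

    # Point 3: studies left out when only 29 of 4,555 are kept.
    print(f"Studies excluded: {share_excluded(4555, 29):.1f}%")
```

The first figure is where the "1.9 million" comes from (rounded up), and the second is the 99.4%.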