Tuesday, February 17, 2015

Why Haven't Florida's Test Scores Shot Through the Roof?

We're now in year three of Marzano's "causal" model here in Florida.  With all these effect sizes (where increases of 10-20 percentile points are common) AND with 98 percent of Florida's teachers getting effective or better ratings, shouldn't Florida's test scores be through the roof by now?  They're not.  In 2013 they were stagnant from the year before.  And this year?

"The average scores for Florida's class of 2014 were 491 in reading, 485 in math and 472 in writing, all below the national averages and lower than last year's scores, when a smaller group of state students took the test.  Across the country, SAT scores were stagnant..."  (Source.)

So what's the problem?  The model or its implementation?  My guess is both (plus the research behind the model).
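Before digging in, it's worth a sense of scale.  Here's a rough sketch of what those 10-20 percentile-point gains mean in effect-size terms, assuming the standard normal-curve conversion used in meta-analyses (the function name and the sample d values are mine, for illustration):

```python
from scipy.stats import norm

def percentile_gain(d):
    """Percentile-point gain for an average student, assuming
    normally distributed scores: (Phi(d) - 0.5) * 100."""
    return (norm.cdf(d) - 0.5) * 100

for d in (0.1, 0.25, 0.55, 1.0):
    print(f"effect size d = {d:.2f} -> {percentile_gain(d):4.1f} percentile points")
```

Gains of 10-20 percentile points correspond to effect sizes of roughly 0.25-0.55.  Apply effects that size across nearly every classroom in the state, and the statewide averages should move visibly.  They haven't.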

Needless to say, its implementation has been a nightmare--the fact that we're still trying to implement it three years later (at God only knows what expense) should tell you everything you need to know.  Or you can Google articles like this.

The model itself, despite the claims on Marzano's website, appears to have zero research supporting it (the model itself, that is, as opposed to the strategies it bundles).  Check out his website's claims:

[Screenshot: the five research claims listed on Marzano's website]

Going in order (1-5):
1.  Where have I heard this 5,000-studies-over-50-years claim before?  Here, in a journal article co-authored (second author, it seems) by Marzano.  Turns out only 70 (1.4%) of those 5,000 studies "met the researchers' criteria for design, control, data analysis, and rigor."  That would appear to be, at best, intentionally misleading.  And consider: there's no way these 5,000+ studies, done decades before the model existed, were actually testing the model.  Strike one.

2.  This is completely irrelevant to verifying the model.  Once again, "It's good because there's a lot of it--and a number!"  If anything, the fact that this much material is out there and has had no visible effect works better as a counterpoint to the model's effectiveness.  Strike two.

3.  I won't rehash the hazards of the analyses, but it's important to note that this is not evidence for the model itself. We're now 0 for 3.  (Are these studies double-dipping from #1 above?  I bet they are.)

4.  Finally, something directly studying the model!  Except there's nothing listed.  One would think, with this being the most important thing listed here, there would be a link, or a title, SOMETHING.  Zero for four...

5.  This paper is basically a "why you should use our model" pitch; I see nothing in it verifying the model's effectiveness.

And we're 0 for 5 now.  (Which, if some people get their way, isn't a zero, but a 50%; no wonder "educational experts" approve of this baloney.  Regardless, 50% or 0%, this is a failure.)

So we have a convoluted and horribly expensive evaluation system that doesn't seem to be producing significant (if any) results after three years.  Solution?  Let's drop this until a third party verifies its effectiveness and go back to a cheap, simple solution--which will likely cause no discernible changes to our education system (besides saving money).

Friday, February 6, 2015

Insanity of the Marzano Evaluation Redux...

My "Insanity" post re: the Marzano evaluation system seems to be the only "popular" post I have so I thought it might be time to revisit it--this time in fewer words.  I encourage everyone to sign up at his website to access the database themselves.  Then go through and take a look.  Before we begin, I urge you to keep in mind the GIGO paradigm:



Sorting by p, you quickly see a lot of rows whose reported values can't be right; I'm fairly certain that with a sample size of 8, a p of <0.0000 is not going to happen.

[Screenshot: the database sorted by p, row after row reading 0.0000]

And, eyeballing the scrollbar, it would seem that ~66-75% of Marzano's database is statistically INsignificant.  ("...in Statistics 'significant' means probably true (not due to chance).")  Combined with the portion above (all those zeros...), I personally find this to be, well, damning, for lack of a better word.
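For a sense of how implausible those zeros are: a displayed p of 0.0000 means p < 0.00005 after rounding.  Here's a back-of-envelope check of what that would require, assuming a two-sample t-test with 8 students per group (the database doesn't say what analysis was actually run, so treat this as a plausibility sketch, not a reconstruction):

```python
from math import sqrt
from scipy import stats

def min_d_for_p(p_two_sided, n_per_group):
    """Smallest Cohen's d at which a two-sample t-test
    reaches the given two-sided p-value."""
    df = 2 * n_per_group - 2
    t_crit = stats.t.ppf(1 - p_two_sided / 2, df)
    return t_crit / sqrt(n_per_group / 2)

# A displayed "0.0000" implies p < 0.00005.
print(min_d_for_p(0.00005, 8))  # ~2.6 -- a gigantic effect
```

An effect size around 2.6 would dwarf essentially everything in the education literature; even the most celebrated interventions rarely clear 1.0.  Rows like these are either typos, rounding artifacts, or garbage.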



Now, before putting in these last two pictures, I would like to once again quote Marzano's description of the p values:  "Basically, if the value in this column is less than .05, the effect size reported for the study can be considered statistically significant at a significance level of 5% (that is, the probability of observing such a result by chance is less than 1 in 20). In other words, a reasonable inference can be made that ... the reported effect size represents a real change in student learning."

Behold the "better than a coin flip" point:

[Screenshot: database rows sitting at exactly p = 0.5]
And this is definitely my favorite: seven results that are 100% likely to be random fluctuations.

[Screenshot: seven database rows with p = 1.0000]

It's the epitome of hypocrisy that in this day and age of "data-driven decision making" we're relying on research of this quality to make decisions.  It's like our politicians and educrats have never even heard of the phrase "peer review."  Again:

[Image: garbage in, garbage out]


To Marzano's credit, he's stated his research is being misused.  But I don't see him making a big fuss about telling states like Florida to stop basing their evaluations on his work, i.e., to stop buying his books and materials.



BONUS:  This is one of my favorite pictures, from sorting the database by N (C)--that is, the size of the control group.  You used a study that had ONE kid as a "control"?  Really?  I mean, REALLY?  You don't need to be a statistician to realize that public policy should not even be influenced by, let alone based on, this analysis.

[Screenshot: the database sorted by control-group size, showing N (C) = 1]


Wednesday, December 24, 2014

The civilized veneer of Florida's school accountability system.

One hypothesis is that Jeb Bush's A-F system is a simple, easy-to-understand method with easy-to-interpret results.  But this is Florida mixed in with some educrats, sooo....

The system isn't as complicated as I thought (see for yourself here), but there are some oddities.  For example, most people probably think "90+%" when they hear "A."  For Florida, it's 70% (1120/1600) or better.  So a huge block for "A" schools.  It gets weirder: a B covers only 1040-1119, or 65%-70%.  A 5-point range for B's?  Here's the whole shebang:


(All scores out of 1600)
A: 1120+ (70% and up, range of 30%)
B: 1040-1119 (65%-70%, range of 5%)
C: 880-1039 (55%-65%, range of 10%)
D: 800-879 (50%-55%, range of 5%)
F: <800 (below 50%, range of 50%)
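As code, the whole system fits in a handful of lines, which makes the lopsided ranges hard to miss (a minimal sketch; the cutoffs are from the table above, the function name is mine):

```python
def florida_school_grade(points):
    """Florida's A-F school grade from a point total out of 1600."""
    if points >= 1120:  # 70% and up
        return "A"
    if points >= 1040:  # 65%-70%
        return "B"
    if points >= 880:   # 55%-65%
        return "C"
    if points >= 800:   # 50%-55%
        return "D"
    return "F"          # everything below 50%
```

Five branches, and the asymmetry jumps out: A and F each span 30 and 50 percentage points, while B and D get 5 apiece.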

It seems like they're trying to use a standard scale (1600, a la the SAT) and the traditional A-F, but, unsurprisingly, failing at both.  But this isn't even the worst part--it's how they arbitrarily shift the numbers they use in the calculations around.  For example, a week ago:

"Education Commissioner Pam Stewart said a key reason for the drop in A-rated schools was that the grading formula was changed to make it more difficult to earn a top grade."

Now rewind two years:

"State education officials panicked, and at an emergency meeting last week, the Florida Board of Education decided in a 4-3 vote that the best thing to do was to lower the passing score on this exam.
Let me repeat that: In order to make sure that students succeeded on the test, the passing grade was lowered."

So, when it comes down to it, all the math, all the calculations mean nothing--the numbers are fudged to whatever looks or feels "right" to the powers that be.  Once again, the math is just there to provide a false sense of credibility.

Friday, November 14, 2014

Could grade inflation have "helped" lead us to this testing mania?

 *  Note:  I changed the title to be clearer.  There is no one simple reason for how we got where we are.

While I can't stand the time, tax dollars, and opportunity costs of all these standardized tests, there is one thing that has always gnawed at me--they are, sadly, somewhat relevant.  Granted, they are way overdone, but as I have told my students in the past, with grade inflation, some standardized tests--nowhere near as many as we have*--are needed.  This article from Angelina Massoia puts it quite elegantly:

"By submitting to the culture of grade inflation, we empower the standardized test to 'accurately' represent us."

Part of the reason I believe it's going to be hard to stop this testing madness is that, to some small degree, the field of education itself is to blame.  The article above notes that "about 43 percent" of all grades given in college are now an A.  More importantly, it's far worse in education (image source):

[Chart: grade distributions by college major, with education majors receiving the most A's]
This is hardly a surprise, I would hope; I'm sure there's plenty of other, older research (such as this, indicating "Education majors enjoyed grade point averages that were .5 to .8 grade points higher than students in the other college majors.").

Again, I am in no way advocating the testing mania.  What I do believe/wonder is whether the "reformers" and other test advocates were able to get their grip on our system due to grade inflation, and whether the Colleges of Education don't share some blame* (whoops, sorry again--I meant "accountability") for this.  This article (again from Huffington) goes over the full study above in more detail.  I take issue with it in that it should have made the distinction between "easy" and "academically rigorous."  I can't say whether education courses are easy, not having been an ed. major, but I am finishing up my 5th ed. course in under a year.  From this experience--and from what colleagues have told me of their own--I can say education courses are far from academically rigorous.  (I dare say that even my regular chemistry class is far more academically challenging than they have been.  Given our leaders' use of "rigor" as a buzzword over the past 10+ years, I find it rather hypocritical that they probably wouldn't recognize it if they saw it.  Or be able to handle it.  But those are subjects for another blog.)


*  And let's be honest, this is being done to blame teachers and schools.  Oh, wait, I'm sorry, I meant "Hold them accountable [if things go bad, otherwise, the reformers will take the credit]."

Tuesday, November 11, 2014

Opting Out May be Education's Only Option

Hopefully the opt-out movement is (finally...) reaching critical mass.  Large-scale walkouts in Colorado (200 students) and a recent article in the New York Times highlighting the grotesquely wasteful (ab)use of standardized tests are starting to make waves.  (I would have liked to see the NYT article go into more detail on the opportunity costs--60 to 80 days spent on testing per year!?!--of these tests, as I attempted to here.)  I see opting out as the only real solution at this point, since both Republicans and Democrats have proven themselves utterly worthless (see:  No Child Left Behind, Race to the Top).  Two reasons:

1.  Opting out devastates the school's ranking, especially if the good kids opt out.  (I hope/assume the walkouts were encouraged by these high-achieving students and/or their parents.)  This in turn pressures the school board to do something--preferably/hopefully dropping these asinine ranking systems and, in turn, their tests.

2.  The more students opt out, the easier it is to see that the data being collected is not good data.  Hopefully people will start to realize they're spending billions of dollars collecting data that's somewhere between dubious and worthless, again pressuring school boards--already strapped for cash--to jettison these tests.

Monday, September 15, 2014

Newsweek and "Cause and Effect"

Oh, Newsweek and your "Best High Schools" list...if only you'd point out the obvious:  the "best" schools are those with the best students.

[Table: Newsweek's top-ranked high schools alongside the median household incomes of their communities]

I imagine that "the list" has become something of a cash cow for Newsweek; there's simply no way they don't realize the list is of no value.

Numbers for household income are from Wikipedia pages.  Chicago was split into several sections, so I wasn't sure which median household income (MHI) to use.  (Not that it matters, since the school is selective-admission, as are most, I assume; I didn't bother to check the ones that were well above the US average MHI.)

Monday, August 25, 2014

VAM for Kindergartners? I say VAM for legislators.

The insanity of "big data" has finally hit the mainstream media, as the Washington Post notes that in Florida, kindergartners are technically-sorta-maybe required to take end-of-course (EOC) exams.  That's right--six-year-olds.  Already the politicians are working to cover up their mistake by claiming the law was misinterpreted.  That's much better--because everyone knows you should wait until kids are seven years old before they start taking final exams/EOCs.  Eight at the latest.  (/sarcasm)  I mean, this is common sense here--you can't put peoples' livelihoods in the hands of kids.  Especially when teacher effects are just a small part of the picture.  (The numbers appear to vary widely--which makes sense, given the difficulties in measuring--but a quick search shows non-school effects range from ~67% to as much as 89% of learning/student success.)

So here is my solution to the problem:  VAMs for our legislators.  If they don't improve our education system, they will be ineligible to run for at least one election cycle.  NO exceptions.  A few key inputs for the model (feel free to add/modify; a toy version in code follows the list):
1.  The laws must be shown to have improved education.
2.  The laws must be shown to be cost-effective.
3.  Teacher morale will be factored in.
4.  Graduation rates must improve. This includes elementary and middle school.  (This too will require a cost-benefit analysis.)
5a.  Flexibility in education is key, so how quickly bills can be passed will be taken into account.
5b.  How much support the bill gets in the legislature will also be factored into the VAM.
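For the full data-driven experience, here's what that might look like as a model.  Every weight below is invented--which is, of course, exactly the point:

```python
# Tongue-in-cheek legislator VAM. All weights are arbitrary --
# no more or less defensible than the ones in the teacher models.
LEGISLATOR_VAM_WEIGHTS = {
    "improved_education": 0.30,  # input 1
    "cost_effective":     0.20,  # input 2
    "teacher_morale":     0.15,  # input 3
    "graduation_rates":   0.15,  # input 4
    "bill_speed":         0.10,  # input 5a
    "bill_support":       0.10,  # input 5b
}

def legislator_vam(ratings):
    """ratings: dict mapping each input name to a 0-1 score.
    Returns a 0-1 'value-added' rating."""
    return sum(LEGISLATOR_VAM_WEIGHTS[k] * v for k, v in ratings.items())

# Score under 0.5? Ineligible to run next cycle. NO exceptions.
```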

Who's willing to bet that the instant the politicians realize they can be fired over things nearly completely out of their control, they'll suddenly have a very different opinion of VAM?  Anyone?  Anyone?

Update:  Gov. Rick Scott calls for review of Florida's standardized tests.  Combined with Arne "Duh" Duncan's latest, there's hope yet...better several years late than never to realize that testing mania has gotten out of hand.

Tuesday, April 22, 2014

Was proctoring the FCAT worth $50+ million of your tax dollars every year?

It currently costs $12,700 a year to educate a student in US public schools.  Over a roughly 180-day school year, that works out to about $70/day.  Here in Florida, that number shrinks to $8,887/student, or about $49.40/day (let's just call it $50/day).  Today's FCAT glitch probably just cost a half-day for each student; "thousands" of students were affected.  That's ~$25/student/test that was probably completely wasted.  So are we talking $50,000 of tax dollars down the drain?  $100,000 (4,000 students)?


Is all of this time being spent to collect data (often of dubious quality, to be polite) a waste--is every half-day spent testing $25/student thrown away?  (And I'm guessing that between VAM tests, progress-monitoring tests, and the FCAT, the average Florida student spends close to a full week of school each year testing.)  The total opportunity cost of the lost class time and of proctoring these standardized tests must be staggering.

Or, put another way--is proctoring the FCAT worth $50 million?*  Oh, and don't forget Pearson's contract with the state--that costs you another ~$51 million a year.  I have a hard time believing this is worth it; it's probably enough to run a small school district for a year. 



*  There are 2,587,000 students in Florida's public schools.  If that distributes evenly amongst grades K-12, that works out to ~199,000 students/grade.  It looks like nearly every grade above 2nd tests (at least once), so that's 10 grades, or ~2,000,000 students tested every year.  Two million tests multiplied by $25/student/test yields ~$50,000,000.  (This is admittedly a pretty rough estimate with some mutant statistics.)
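For anyone who wants to poke at the estimate, here it is as a script--the 180-day school year and the even spread of students across grades are my assumptions, not official figures:

```python
# Back-of-envelope cost of one statewide half-day testing session.
per_pupil_yearly = 8_887   # Florida per-student spending, USD/year
school_days = 180          # assumed length of the school year
per_day = per_pupil_yearly / school_days  # ~$49.40 -> call it $50
half_day = per_day / 2                    # ~$25 per student per test

students = 2_587_000       # Florida public school enrollment
per_grade = students / 13  # K-12, assumed evenly spread (~199,000)
tested = per_grade * 10    # grades 3-12 tested at least once

print(f"~${half_day * tested:,.0f} per half-day of statewide testing")
# -> roughly $49,000,000 -- call it $50 million
```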