Thursday, February 18, 2010

You can't be serious


Readers of this site know that not too long ago Baseball Prospectus released its PECOTA projections for the 2010 end-of-season standings. While the CAIRO and CHONE projections had the Yankees finishing as the top team in all of baseball, BP originally forecast the Bombers to miss the playoffs entirely.

The Yankee blogosphere didn't let this slide for even one second, and SG from RLYW immediately found a flaw in BP's methodology and wrote about it. Over here at Yankeeist, we pored over BP's numbers and explained the flaws we saw here, here and here. (Many other blogs would have found this to be an onerous use of resources, but Yankeeist employs a crack team of statisticians.)

BP responded to the criticism and called a mulligan, updating PECOTA not once, but twice. On the first update the Yankees wound up tied with the Red Sox atop the AL East. On the second update BP got with the program and put our boys back on top -- until recently.

It went by with little fanfare, but PECOTA has been updated for a 4th time! (FOURTH! We haven't even played a game or made a trade!) And, as if you couldn't see this coming, the Red Sox are now predicted to finish two games ahead of the Yankees, who are seen just squeaking by the Rays.

The BP projections bother me because no games are being played yet. As the season draws closer I'm becoming more anxious to see actual baseball, but I'm also becoming more and more hungry for a 1998-style defense of the 2009 title. BP is raining on my parade.

That's why I took a fine-tooth comb to their projections earlier. (For those who don't know, my first-ever post on Yankeeist took BP to task for missing badly on its projection of the 2009 Yankee offense.) Never mind that I smell a rat anytime a single projection system projects something different from all the others. I want to be supremely confident that the Yankees are going to win 120 games this season, at least until we go 14-17 in April. BP is a chink in my psychological armor.

BP's new standings can be found here, and with a subscription you can see their forecasts for the Yankees here. My biggest criticism of the projections at this point is that they've been updated 4 times. Which prediction is correct? Why so many updates?

Beyond that, from a quick look at the numbers, my previous comments stand. BP is bad at predicting superstars and aging players, and the Yankees have both. As a result, even though the 2009 team scored 915 runs, BP projects the 2010 team to score 821 runs, nearly 100 fewer. That number seems low, but PECOTA is conservative (bad?) when it comes to offensive projections.
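To put that gap in perspective, here is a minimal Python sketch of the arithmetic, using only the two run totals quoted above. The variable names are mine, purely for illustration, and have nothing to do with how PECOTA itself works.

```python
# Run totals quoted in the post: actual 2009 runs and BP's 2010 projection.
runs_2009_actual = 915
runs_2010_projected = 821

# Absolute drop, and the drop as a share of the 2009 total.
drop = runs_2009_actual - runs_2010_projected
pct_drop = 100 * drop / runs_2009_actual

print(f"Projected drop: {drop} runs ({pct_drop:.1f}% of the 2009 total)")
```

In other words, "nearly 100 fewer" works out to 94 runs, or roughly a tenth of last year's offense.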

The real difference between the Yankees and Red Sox projections is in their pitching staffs. PECOTA simply does not like some of our pitchers. Mariano Rivera is now projected to have a decent season (earlier projections had Mo putting up his worst-ever numbers), but PECOTA is down on Andy Pettitte, A.J. Burnett and Phil Hughes. All three are projected to regress from last season; Pettitte is seen posting a 4.71 ERA. Joba Chamberlain, meanwhile, is seen having a solid season, but only pitching 138 innings. Javier Vazquez, on the other hand, is projected to be nearly as dominant as CC.

For my part, I believe the CHONE, CAIRO and MARCEL projections. Not only do I find BP's general status as an outlier hard to believe, but I also don't buy the level of regression they are predicting for so many Yankees. They forecast such regressions last year and got it wrong. When they finally do get it right, this team is pretty much over, because the entire core four will have lost all its value. I'm not there yet. That said, all of this is just posturing until they start playing the games.

6 comments:

  1. Don't get me wrong, I'm really enjoying advanced stats, but sometimes it's just an exercise in futility. if somebody covered a blackboard with calculations that explained how an elephant could hang by its tail tied to a daisy over a cliff, i'd have to dismiss their theory as impractical and theoretical. i don't have to be a physicist to understand how nuts that is. I'm using my eyes, my common sense. The Yankees and Red Sox both improved their rotations. They both improved their defense. The Sox let their offense slide a bit, but the Yankees' offensive difference between 09/10 is looking insignificant right now. i don't see any significant regressions on either roster (like Damon's defense from 08 to 09), so I don't see an issue on that front. i can't for the life of me understand why they have Hughes taking such a dive. is it bc they project him in the rotation? their numbers just don't add up for me - but then, i don't really take the trouble to do the arithmetic bc i don't have to - i use my common sense!
    1. Yankees about 100 wins
    2. Red Sox about 95 wins
    3. Rays about 93 wins

  2. OMG WE LOST THE SPREADSHEET AL EAST WHY EVEN PLAY THE GAMES?!?!?!

    Seriously, the latest PECOTA predictions are incredibly screwy. No team outside of the AL East has 90 wins, but there are three teams in the AL East that do. No team in the AL Central has a freakin' winning record. IIRC, the Angels have a pretty severe losing record. Yeah, they got worse. It's no longer "the Angels, and everyone else" in the AL West. But they didn't go from being a nearly-100-win team to a below .500 team.

    I sure as hell wouldn't mind Javy being about as good as CC, though.

  3. I did not look at their predictions for the AL Central. If they have no team in the AL Central with a winning record then their projection system is seriously flawed.

    BP is predicting that many Yankees players will get worse with age - subtraction from inaction, if you will.

    Until there is evidence of that, I find it hard to believe. More to the point, I struggle to put faith in a system they've updated 4 times and wonder why they are doing it.

  4. Yeppers: http://www.baseballprospectus.com/fantasy/dc/

    As you can see, the Twins, White Sox, and Tigers will all go 80-82... and be tied for first in the division.

    And I'm sorry, if people are healthy (PECOTA doesn't take health into account, right?), the Mets aren't going to be THAT bad. They should be a .500 team. Not a whole lot better, but they should be at .500.

    There's also only a 23-game difference between the best record in baseball and the worst (the Blue Jays, BTW. I do agree they'll finish last in the division, but I don't think they'll have the worst record in baseball.). In the Wild Card era, I did the math, and the average difference between the best record in baseball and the worst is over 44 games. The lowest it's ever been is 30, and that was a pretty big outlier. Yeah.

  5. FWIW, I've found yet another flaw in the latest re-run of their PECOTA standings. You can read the whole thing at the bottom of this post, but long story short, the runs scored/allowed on their standings page don't line up with the AVG/OBP/SLG listed on the exact same page. Until they fix those to line up, these are essentially worthless.

  6. I'm really surprised to see all of the problems with PECOTA. This is a well-known, respected tool, but updating the projections 4 times, making mistakes consistently, and putting out shoddy projections such as everyone in the AL Central having a losing record is just sloppy work.
