Monday, July 31, 2006

THT Projection Roundtable

Excellent read at THT today. David Gassko gathers the thoughts of several important people in the arena of baseball player projection.

I suspect that Part 2 will include a discussion of whether player projection systems should include "comparable players" and how to make use of that data. It's my take that generating a list of individual comparable players is not in itself useful. It probably does tend to improve the accuracy of a less sophisticated algorithm, but only because it approximates what better and more statistically sound research would provide. The problem is that, as far as I can tell, the list is built arbitrarily, and it confuses the samples of a handful of player-seasons for true talent level. Hopefully, a more illuminating discussion than what I've just offered awaits us.

The discussion in the article about the interplay of player role and projection is very important. I don't see why any projection should be pre-programmed as a compromise that splits the difference between different player usages. Projections should cover as broad a spectrum of player usages as possible, assuming the differences are significant (that is, assuming that different players will have different values in different roles). While I understand that if you're a fantasy player you'd want to decide on just one number for a player's projection, the best way to do that is to decide how you think that player will be used, assign probabilities to those usages, and combine them with the role-specific projections.

Just this weekend I demonstrated my irritation over the platoon numbers of two Rangers, Mench and Blalock. In projecting either one, you would want to include not only their past results but also the context of those results; most serious baseball folks have no problem acknowledging the need to park-adjust, but the equally real need to platoon-adjust is often overlooked. Blalock, used properly, can easily beat almost any projection out there, since they are all based in large part on his awful L-L hitting. Mench can likely do the same, to a lesser extent. (And consider that using a list of 'comparable players' would group together a hodgepodge of players whose future platoon usage could range from Earl Weaver's Orioles to the contemporary Rangers.)

A projection should therefore give us not one number but separate numbers for RHP and LHP, though the projection for either would be based on the data against all pitchers. Obviously, in most circumstances one needs to combine those figures to generate a projection of value, but projecting the value first and then breaking it down into roles is pointless.
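To make the order of operations concrete, here is a minimal sketch in Python of building a single number from role-specific projections. Everything in it is an assumption for illustration: the function names, the generic per-plate-appearance rate stat, the two role labels, and all of the numbers are made up, and none of it comes from any actual projection system.

```python
# All numbers below are invented for illustration; they are not real projections.

def role_value(rate_vs_rhp, rate_vs_lhp, share_vs_rhp):
    """Per-PA rate for one usage pattern, weighting each split by its share of PA."""
    if not 0.0 <= share_vs_rhp <= 1.0:
        raise ValueError("share_vs_rhp must be between 0 and 1")
    return share_vs_rhp * rate_vs_rhp + (1.0 - share_vs_rhp) * rate_vs_lhp


def blended_value(role_values, role_probabilities):
    """Single-number projection: probability-weighted mix of role-specific values."""
    if set(role_values) != set(role_probabilities):
        raise ValueError("roles and probabilities must cover the same roles")
    if abs(sum(role_probabilities.values()) - 1.0) > 1e-9:
        raise ValueError("role probabilities must sum to 1")
    return sum(role_probabilities[r] * role_values[r] for r in role_values)


# Hypothetical split projections for a left-handed hitter (some per-PA rate stat).
VS_RHP = 0.360
VS_LHP = 0.290

# Step 1: one value per usage pattern, built from the same two split projections.
roles = {
    "everyday": role_value(VS_RHP, VS_LHP, share_vs_rhp=0.75),
    "strict_platoon": role_value(VS_RHP, VS_LHP, share_vs_rhp=0.95),
}

# Step 2: the reader's own guess at how likely each usage is (hypothetical).
probabilities = {"everyday": 0.6, "strict_platoon": 0.4}

# Step 3: collapse to one number only at the very end.
single_number = blended_value(roles, probabilities)

for name, value in roles.items():
    print(f"{name}: {value:.4f}")
print(f"blended: {single_number:.4f}")
```

The point of the sketch is the sequence: the RHP and LHP projections come first, each usage pattern is a different weighting of those same two numbers, and the single figure a fantasy player wants falls out last. Changing the usage guess changes only the weights, never the underlying split projections.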

I'd also like to see if the art/science of projecting batter-pitcher matchups can be fleshed out more thoroughly. I realize this is an overwhelming endeavor. But it could probably go a long way toward better evaluating the viability of, say, a Barmes/Carroll platoon in Colorado (Barmes against groundballers, Carroll against flyballers). I would expect that pitch-by-pitch data and TLV data would not have a lot of utility in improving the projections for major leaguers (though I'd suspect there's a good chance they can improve projection accuracy for prospects), but using this data to help in-game decisions (and hence roster construction) could be pretty important.

