Sunday, July 06, 2008

Let's Compete with BP

Arguably continuing the theme from my previous post, the Baseball Prospectus home page proudly displays a photo of Michael Young along with the caption, "Texas shortstop Michael Young is just one of many Rangers who've earned an All-Star selection." This is but one of the many reasons I have no plans to sink another dime into their operation, Silver's fine work notwithstanding. I will concede that there simply aren't any AL shortstops that have earned an All-Star selection this season, but Young has had his worst hitting season so far, is no longer a very good hitter (he is still good relative to his position), and at least by UZR continues to flash some relatively awful range. Why would BP showcase one of the least accomplished and least deserving players on the All-Star team? He's clearly less deserving than the other of the "many Rangers" (Bradley, Kinsler, Hamilton). I'm sure the article has some specious argument for him, probably related to DT fielding runs. BP is a news gathering outlet now, but as far as I can tell they aren't doing that particularly well (I guess I wouldn't really know, but among other things Kevin Goldstein published a mock draft the day before the draft where he successfully predicted only two of the first round picks), they are no longer anywhere near the cutting edge in statistical analysis and don't have an especially good setup for stats presentation, and their prestige pieces are utter garbage.

And yet, none of the major outlets have tried to bankroll something to truly compete with Baseball Prospectus. They've all done it piecemeal by hiring one or two proven sabermetrically-oriented writers. Meanwhile, THT and, as the great and universally-affiliated Eric Seidman calls it, "The Blog" blow BP out of the water. But because of its chosen path of institutionalization, BP wields much greater influence in the industry. It very obviously doesn't have to be this way.

So here is my idea, which obviously suffers from me not having any business background or caring about such things. BP has learned that news gathering is an essential part of their financial framework, and that makes perfect sense. BP's major problem (as far as people like me are concerned) is that they try to get all of their writers to cram BP numbers into their articles (or they have been hiring writers that are simply inclined to do so).

So I propose a website that develops its own framework for baseball research. That is, the researchers are trained to use the same toolbox and basic presentation. Uniformity in numbers and presentation across writers is good, and BP doesn't coordinate its writing or research enough for its site to even come close on this measure. Moreover, BP (IMO) hurts itself by keeping its formulae proprietary; nobody at home wants to follow up on their claims since we can't replicate the studies. Hence, I propose a flip: this new site would not attempt to publish stats (this is a money saver) but would make all of its methods open source. That way, the individual articles do not need to dwell so much on methodology. In addition, the history of methodological changes would be well maintained, and people could keep track of the changes in rationale over time.

So my idea is to get a handful of outstanding baseball researchers together to establish two frameworks. The first is a framework for projecting minor leaguers based on their minor league numbers. In addition to finding the best way to project near-term performance, peak performance, and player value, you would want to find the best metrics to concisely convey information about a player's numbers (think of Pizza Cutter's work on Speed Scores and Power Scores, for instance). That is, you would find the best ways to slice the numbers to make a variety of points and then familiarize the readership with them. FIP and other stats have eventually taken off because of THT, but the market is still limited by its reliance on numbers that are easily compatible with well-known formulae (that is, they replicate ERA or batting average, etc.). I would want a site that has the freedom to design new ways to conceptualize and interpret baseball statistics without being beholden to the old forms, but one that works hard to clarify the scales and the hermeneutics. BP most certainly does not do this.

The second framework is to study the correspondence of scouting reports and statistical performance. Ideally, it can be established what the normative correspondence between qualitative scouting assessments and quantitative performance is, although obviously the error bars are unlikely to ever shrink much. In other words, we'd like to be able to get a scouting report (and we want a framework that can interpret a detailed scouting report from an actual scout in addition to a throwaway Baseball America comment) and show what players with similar scouting reports have actually done. I'm tired of junk like this where player comparisons are bandied about based on the eventual results rather than the skills that are observed. What we want is to be able to say, "In general, when scouts say x about a player, the player is this good, and the better players broke out because of x, y, and z, the lesser players failed because of o, p, and q, and typical players like G, H, J, and K were assessed quite accurately."
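To make the "players with similar scouting reports" idea a little more concrete, here is a minimal sketch in Python. It assumes each scouting report has already been boiled down to tool grades on the usual 20-80 scale, and it uses a plain nearest-neighbor lookup, which is just one possible way to define "similar." The player names, grades, outcome numbers, and function names are all hypothetical placeholders, not real data or an established method.

```python
from math import sqrt
from statistics import mean

# Hypothetical history: scouting grades (20-80 scale) plus a later outcome number.
# Every name, grade, and outcome here is made up purely for illustration.
HISTORY = [
    {"name": "Player A", "grades": {"hit": 60, "power": 50, "speed": 40}, "outcome": 3.0},
    {"name": "Player B", "grades": {"hit": 55, "power": 50, "speed": 45}, "outcome": 1.0},
    {"name": "Player C", "grades": {"hit": 40, "power": 70, "speed": 30}, "outcome": 2.0},
    {"name": "Player D", "grades": {"hit": 60, "power": 45, "speed": 40}, "outcome": 2.0},
]


def distance(a, b):
    """Euclidean distance between two grade dicts over the tools they share."""
    shared = a.keys() & b.keys()
    return sqrt(sum((a[tool] - b[tool]) ** 2 for tool in shared))


def comparables(report, history, k=3):
    """Return the k past players whose grades are closest to the new report."""
    return sorted(history, key=lambda p: distance(report, p["grades"]))[:k]


def implied_outcome(report, history, k=3):
    """Average outcome of the k closest comparables (a crude implied projection)."""
    return mean(p["outcome"] for p in comparables(report, history, k))


if __name__ == "__main__":
    new_report = {"hit": 60, "power": 50, "speed": 40}
    for comp in comparables(new_report, HISTORY):
        print(comp["name"], comp["outcome"])
    print("implied outcome:", implied_outcome(new_report, HISTORY))
```

A real version would need far richer inputs than three tool grades, and the spread of outcomes among the comparables matters as much as the average, since that spread is the error bar.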

This would be a daunting task, and the obvious counter-argument is that this is what the teams themselves are trying to do and anyone able to do it well should get a job with a team. However, teams compensate their researchers more through prestige than cash, and if such a site were successful it could well be much more lucrative and would very likely put its research team in a position to get better jobs from MLB than they could otherwise manage.

With these frameworks in place, the site itself can function with a team of 3-4 research directors leading some assistant researchers, writers, and editors, and could seriously pump out the content. Any time a minor leaguer gets called up a level, demoted, traded, moved to a different position, gets a PR blast, etc., the site would be on top of it and ready to go. A player report could be relatively easily compiled showing a quantitative projection, the player's implied quantitative projection based on qualitative assessments, interesting data from players with similar numbers and from players with similar scouting reports, and a comparison to peers or other relevant players. So such a site could combine top-notch and intensive research with an RSS-friendly approach and frequency. At present, baseball sites go for volume in available statistics and quality in articles; I think a site that keeps the article quality while providing a volume of articles can become a pretty big phenomenon.

Furthermore, such a site would do well in maintaining broad interest since there would be features on prospects from each organization, so it would be relatively easy for fans to subscribe to posts on just their team's prospects (and rumored trade targets, etc.). Additionally, the level of fan/reader involvement should be high, since the method is open source but not all of the numbers are: enterprising readers can conduct the research on their own and post it in the comments, and in the process a farm system for bringing in new researchers can emerge; conversely, average readers will always want to know more about the players they are interested in and will frequently be suggesting posts (see Sickels, John).

Because of the focus of the site, I think it would have the crucial angle of appealing to baseball writers and to the major league teams (and minor league teams, for that matter), so its inside influence could expand rapidly. A new site about major leaguers would need to be exceptionally good to gain traction. A new scouting site about minor leaguers would similarly need to be outstanding to compete. But while the site I'm proposing would have to be outstanding, the distance from outstanding work to recognition and influence seems fairly short in comparison. And this means that journalistic access would develop very quickly, leading to a positive feedback loop (for a while, anyway).

I can't say I know that the site is doable, but I think it is. The major obstacle is simply the start-up cost, as getting enough data together to do it right is cost-prohibitive and getting enough research work together is time/cost-prohibitive. Simply put, the payoff is too far away for it to work out without someone to bankroll it; the key is to establish the frameworks before going live so that after that you just need to put in the work/grind. But I do believe that such a site could quickly become a juggernaut. It would require patience and money. Does Google want to own a baseball site?

I am certainly not sure that I would want to spend my time on this endeavor, but I believe it is a well-considered framework and would be interested to hear from you if you're interested in it.

