Mirage of Data and Analytics–Baseball Again.

Fangraphs has a fascinating piece that echoes some of my ideas from a post a little while ago.

Dave Cameron starts by pointing to the problems of securing “intellectual property” in baseball: most people who do analytics are, essentially, mercenaries, who are hired on short-term contracts and move between organizations frequently. You cannot keep them from bringing ideas with them when they change jobs. So ideas spread rapidly from organization to organization, and the opportunity to arbitrage previously underappreciated ideas is reduced. But he also alludes, without being explicit, to the fact that the ideas and concepts themselves are pretty simple, or at least are given to being interpreted very simply. In other words, ideas are viewed as commodities that have constant values, rather than as things that fit better with a particular philosophy or organizational strategy. To use Cameron’s example, an uppercut swing plane is considered a “better” approach for a batter. To use an example that I often find annoying, FIP is considered a “better” measure of a pitcher’s effectiveness than simple ERA.

Are these in fact “better” measures? People often don’t seem to realize that FIP does not measure the same thing as ERA at the technical level: FIP incorporates only the three “true” outcomes, i.e. home runs, walks (together with hit batters, in the standard formula), and strikeouts, scaled by innings pitched. It is probable that a pitcher who gives up many home runs and/or walks many batters is not very good. But, conversely, there is something to be said for a pitcher who gives up many home runs and walks many batters yet doesn’t give up many runs. Indeed, the same might be said for pitchers who give up many “unimportant” runs (i.e. who give up runs only when it doesn’t count and somehow manage to persistently keep leads, even small leads). It could be that FIP, on average, captures the “value” of a pitcher better than ERA, which, in turn, does a better job than simple wins and losses, but I don’t think the value of a player is a simple unidimensional quantity that always translates readily to a single real number. The conditional value of a pitcher varies depending on an organization’s strategy and philosophy, and these are more difficult to change, but they also offer the potential of finding more lasting value than the easier, commodifiable statistics. The optimum strategy, in a high-variance matching game, is to know your own characteristics (i.e. philosophy, approach, endowments in budget and talent pool, etc.), optimize conditional on those characteristics, and sign especially those players who don’t fit other organizations’ characteristics neatly. Universally good traits are easily identified, and their value is competed away fast now that the technology is readily available.
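To make the difference between the two measures concrete, here is a minimal Python sketch. It uses the commonly published FIP formula; the league constant (3.10 here) changes from season to season and is an assumption on my part, as is the entirely made-up stat line for the hypothetical pitcher.

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def fip(hr, bb, hbp, k, innings, constant=3.10):
    """Fielding independent pitching, per the commonly published formula.

    `constant` is a league- and season-specific adjustment that puts FIP
    on the ERA scale; 3.10 is an assumed, illustrative value.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# Hypothetical pitcher (invented numbers): many homers and walks,
# modest strikeouts, but few earned runs actually allowed.
pitcher_era = era(earned_runs=70, innings=200)
pitcher_fip = fip(hr=30, bb=80, hbp=5, k=120, innings=200)

print(f"ERA: {pitcher_era:.2f}")
print(f"FIP: {pitcher_fip:.2f}")
```

With these invented numbers the two stats disagree sharply: FIP calls the pitcher poor, while ERA calls him good. Neither number says whether the gap is luck or something about how this pitcher fits his team, which is exactly the conditional question.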

Much has been made of the Royals’ success in seemingly going against the grain with regard to “analytics.” Several authors, in turn, have claimed that the Royals were in fact making good use of moneyball concepts, focusing on traditional but still valuable ideas that had been neglected due to the sabermetric fetish. I think both views are somewhat mistaken: I suspect that the Royals began with a philosophy first and tried to incorporate statistics to fit the philosophy, rather than bouncing around “analytics” chasing after the fool’s gold of commodified “good” stats whose value dissipates rapidly. Copying the Royals’ approach, without having a similar basic philosophy and similar organizational strengths and weaknesses, probably will not pan out. Building the philosophy and style, and assembling personnel who appreciate them, is a long-term process that requires, ironically, a deeper appreciation of what analytics do and don’t offer: specifically, the subtle differences between the many seemingly similar stats and how they mesh with the particulars of the team in order to find better “matches.”

This is hardly a new idea in business management: in the 1980s, as per this TAL story, GM execs were puzzled that Toyota was so willing to reveal the particulars of its management strategy to its competitor in the course of their joint venture. It turns out that Toyota’s management strategy was effective given the organizational philosophy of the firm, and it proved very difficult to implement at GM without upending GM’s fundamental characteristics. It does seem that Toyota overestimated the import of “Asian culture” as a component of its corporate philosophy, as GM was reasonably successful, over the (very) long term, in implementing many of the lessons it learned from Toyota. But most of these successes came in overseas subsidiaries, far from the heart of the GM corporate culture that had impeded their implementation. Perhaps this provides a better explanation of the much ballyhooed feud between Mike Scioscia and Jerry Dipoto that eventually led to the latter’s departure. I don’t think Scioscia and the Angels organization have necessarily been all that hostile to the idea of “analytics” per se: they seemed to have interesting, quirky, and often statistically tenuous ideas about bullpen use and batting with runners in scoring position dating back at least to their championship year in 2002. So a peculiar organizational culture already existed that could absorb certain strains of the analytical approach but was potentially hostile to others, and I wonder if what showed up was this, rather than “traditional” vs. “analytical” as commonly portrayed.

Here, I speak from personal experience: I looked enough like a formal modeler to be mistaken for one by non-formal modelers, but I usually started from sufficiently different and unorthodox assumptions that I did not mesh with a lot of formal modelers, who either did not understand that their assumptions need not be universal or were hostile to different ideas in the first place. I will concede that the usual assumptions are probably right most of the time. But when they are wrong, they are really wrong, and a great deal of value resides in identifying the unusual circumstances in which usually crazy ideas might not be so crazy. That is why people, not just baseball teams, should take statistics and probability theory more seriously when they delve into “analytics.” Never mind whether stat X is “better” unconditionally. Is stat X more valuable than stat Y given such and such conditions, and do those conditions apply to us more than to those other guys?

PS. This is a repeat of the story on beer markets and microbreweries, in a sense. Bud Light is a commodity beer that seeks to fit “everyone” universally. Its fit to any one market is imperfect, but, given the technology on hand, it can be produced much more cheaply than most beers that fit a smallish market better. Only beer snobs are willing to trade off a much higher price for a better fit in taste. This is independent of the methodological problem of identifying the fits: the question is, once you have identified the better fit, how many people are willing to pay the price for the better taste? But technological change forces a reconsideration of this business model: the microbrewery revolution was preceded by technological change that made production of smaller batches of beer much cheaper. Producing massive quantities of the same taste is still cheaper, but the gap is much narrower. Less snobby beer drinkers will pay a smaller premium for a better taste fit. So the problem is much more two-dimensional (at least) than before: you find the better taste fit and, conditional on the taste fit (and the associated elasticities), try to identify the profit-maximizing price. This requires a subtler, more sophisticated strategy and analytical approach, and it is liable to produce a much more complex market outcome. As noted before, people who are more sensitive to price than to taste will still gravitate towards Bud Light, even if there is a taste that they prefer, as long as the price gap is large enough.

With baseball (and indeed, all other forms of “analytics”), the problem is the same. FIP, SIERA, and other advanced statistics are still in the realm of commodity stats, things that are supposed to offer a measure of “universal” value. If you will, these are the means to produce a better Bud Light. But in the end, a better Bud Light is still Bud Light. It is not easy to find something that suits everyone that much better. So you trade off: you give up the segments of the market that have a certain taste for another segment that you can cater to more easily. Or, in the baseball context, you grab the players who may not be so good in the overall sense, but whose strengths and weaknesses, whether quantifiable or not, complement your organizational goals and characteristics better. The caveat is that, even if those strengths and weaknesses are quantifiable, the measures will be more complex than simple commodity stats like ERA or FIP, in that their usefulness would be conditional. Perhaps one could come up with some sort of “fitness” or “correspondence” stats. Incidentally, online dating services use this sort of stat, and the idea has a long history of its own: the “stable marriage problem” is one of my favorites and is foundationally linked to the logic of equilibrium in game theory. My research interest for years was in “measuring” the stability or fragility of equilibria, which, in a sense, is a paradoxical notion: if it’s not stable, how can it be an equilibrium? But the catch is that most things are in equilibrium only conditionally. This is the core of the perfect Bayesian equilibrium (PBE) notion: an outcome is stable conditional on beliefs that are justified by the outcome, i.e. a tautology. If people, for whatever reason, don’t buy into the belief system, it may fall apart, depending on how many unbelievers there are.
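For readers who have not met the stable marriage problem, here is a minimal Python sketch of the classic Gale-Shapley deferred-acceptance procedure, recast as teams “proposing” to players. The team and player names and their preference lists are invented for illustration, and the sketch assumes equal numbers on each side with complete rankings; real roster building is obviously not this tidy.

```python
def stable_matching(proposer_prefs, receiver_prefs):
    """Gale-Shapley deferred acceptance.

    Both arguments map a name to a full preference list (best first).
    Assumes equally many proposers and receivers, each ranking everyone
    on the other side. Returns a dict mapping proposer -> receiver.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next index each proposer tries
    engaged = {}                                  # receiver -> current proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p                # receiver was unmatched: accept
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p                # receiver trades up
            free.append(current)          # displaced proposer tries again
        else:
            free.append(p)                # rejected: proposer tries next choice

    return {p: r for r, p in engaged.items()}


# Hypothetical teams (A, B, C) and players (x, y, z).
teams = {"A": ["x", "y", "z"], "B": ["y", "x", "z"], "C": ["x", "y", "z"]}
players = {"x": ["B", "A", "C"], "y": ["A", "B", "C"], "z": ["A", "B", "C"]}

matching = stable_matching(teams, players)
print(matching)
```

In this toy example, players x and y would each rather be on the other’s team, yet neither team wants to swap, so no team and player both prefer each other to their assigned partners. The outcome is stable only given these particular preferences, which is the conditional flavor of equilibrium described above.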

