Baseball and Big Data: How Statistics and Analytics Are Changing the Game

by Diana Drake

Spring training is underway in Florida and Arizona, and the first games start in March. But rosters weren’t fully set until recently, with a couple of big-name free agents still looking for a team. (A free agent can sign with any club or franchise, typically because their earlier contract has expired or they have yet to be drafted.)

Until recently, statistics and analytics were being blamed for top-ranking third baseman Manny Machado and right fielder Bryce Harper not landing deals until late February. As it turned out, Machado signed a 10-year, $300-million deal with the San Diego Padres, a record for a free agent. Machado didn’t fare badly; his latest deal is the third highest for any individual sporting contract, behind Giancarlo Stanton’s $325 million deal with the Miami Marlins in 2014 and boxer Canelo Alvarez’s $365 million arrangement with sports broadcaster DAZN, according to a CNN report. Harper, too, signed a 13-year megadeal with the Philadelphia Phillies.

Advantage Analytics

As in most other industries, data analytics is becoming the litmus test for big deals in professional baseball. “The analytics group has made its mark,” said Wharton statistics professor Abraham (Adi) Wyner, who chairs the undergraduate program in statistics and is a host of the Wharton Moneyball program on Wharton Business Radio on SiriusXM.

Wyner drew a parallel with how valuation is done for corporate M&A deals, which are priced on the net present value of the future cash flows of acquisition targets. “What we assume that the teams should know, but never seem to get, is that you’re paying for the future, not the past,” he said. “Historically that seemed to be what people did, because statistically, people would look at the past and they would project the future by just dragging out the past. That is just not the right way to do it. The data available today has made it better and easier to forecast the future.”
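
Wyner’s net-present-value comparison can be made concrete with a minimal Python sketch. The yearly value figures and the 5% discount rate below are hypothetical, chosen only for illustration; they do not come from the article or from any team’s model.

```python
def net_present_value(cash_flows, discount_rate):
    """Discount a list of future yearly values back to today.

    cash_flows[0] is the value expected one year from now,
    cash_flows[1] the value two years from now, and so on.
    """
    return sum(
        value / (1 + discount_rate) ** year
        for year, value in enumerate(cash_flows, start=1)
    )


# Hypothetical figures (in $ millions): the projected yearly value of a
# player whose production declines over a five-year deal.
projected_value = [40.0, 38.0, 34.0, 28.0, 20.0]

# Assumed 5% discount rate; a real valuation would have to justify its own.
npv = net_present_value(projected_value, discount_rate=0.05)
print(f"Undiscounted total: {sum(projected_value):.1f}")
print(f"Net present value:  {npv:.1f}")
```

Because later years are both smaller and discounted more heavily, the present value comes out below the simple sum of the yearly figures, which is the sense in which a buyer pays for the future rather than the past.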

Major League Baseball has had “two years in a row of an extremely slow market,” said Brendan Harris, a retired professional baseball infielder who played for teams including the Los Angeles Angels and Minnesota Twins. He now works in player development for the Angels. “There are many reasons for the slow free agency and the lack of signings, specifically analytics. Smarter teams do not want to commit to these long-term deals. And the players are starting to get pretty frustrated.”

Wyner and Harris discussed the influence of analytics in professional baseball on the Knowledge@Wharton radio show on SiriusXM.

Analytics could help prevent bad deals, Wyner suggested. “Historically, we in the statistics departments have been banging our heads against the wall as fans as we watch our favorite teams make ridiculous deals,” he said. “They tend to be very long-term contracts that tie the team up in money for players who aren’t going to perform.”

The use of analytics in baseball star-spotting will be “the new norm,” according to Wyner. “Analysts have made their mark” in cautioning team owners against entering into “bad contracts” that tie up their budget in one player “who just isn’t going to be valuable in the way you expect.” He said that while players like Harper and Machado “are very good players, it’s not clear that they’re worth $400 million over 12 years.”

Focus on Productivity

Another factor that owners of teams have to consider is the advancing age of players. Machado and Harper, for example, are 26 now, and it may not make the best sense to bank on them for more than five or six years, by which time they will be 31 or 32 years old, Wyner said. “We’re seeing less production from the middle agers, which is [age] 33 and up, relative to the younger guys.”

Harris said analytics supports the case that players wane in their performance as they age. “Once [the players are hitting 32 years], they’re starting to regress in their performance,” he added. “The premium price is being baked into the previous production, and it’s not guaranteed that you’re going to get that performance in the future. Teams are just not going to pay for such an unknown once they regress into those later years.”
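
The gap between pricing a contract on past production and pricing it on an age-adjusted forecast can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the production figure, the age at which decline starts (30) and the 8% yearly decline are not estimates from Wyner, Harris or any published aging curve.

```python
def project_production(current_value, current_age, years,
                       decline_start_age=30, yearly_decline=0.08):
    """Project yearly production over a contract.

    Production is held flat through decline_start_age, then shrinks by
    yearly_decline each season after that (both are assumed values).
    """
    projections = []
    value = current_value
    for offset in range(1, years + 1):
        age = current_age + offset
        if age > decline_start_age:
            value *= 1 - yearly_decline
        projections.append(value)
    return projections


# Hypothetical 26-year-old producing 6.0 units of value per season,
# offered a 10-year contract.
naive_total = 6.0 * 10  # "dragging out the past": assume no decline at all
aged = project_production(6.0, current_age=26, years=10)
aged_total = sum(aged)

print(f"Naive 10-year total:        {naive_total:.1f}")
print(f"Age-adjusted 10-year total: {aged_total:.1f}")
```

Under these assumptions the age-adjusted total is lower than the naive one, and the shortfall is concentrated in the final years of the deal, which is where Harris says teams no longer want to pay.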

The valuations of baseball players have also undergone changes with shifts in the importance attached to specific skills. For example, “teams are seeing enormous value with pitchers in particular that they never really gave much thought to [earlier],” said Wyner. “So, they’ve redirected their effort in that area. The other effort is in things that we used to not be able to quantify and didn’t care about, and those in particular are fielding, base running and stealing.”

Fielding especially is a major skill, Wyner continued. He pointed to Mike Trout, center fielder with the Los Angeles Angels, and said he’s a favorite “because he’s an exceptional fielder and an exceptional base runner and an exceptionally intelligent base player.” Those qualities raise his appeal to team owners even if he doesn’t have stellar statistics in home runs, he noted. On the other hand, players like Harper and Machado “are not particularly good fielders and not particularly good base runners,” Wyner said. “By the time they’re 32, they’re going to be albatrosses in your lineup. They’re going to be less productive at the plate.”

Team owners have begun adopting “a portfolio view,” where instead of putting all or most of their budget on one or two players, they spread it across three or four players, said Harris. “That mitigates the risk.”
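
The risk reduction Harris describes follows from basic portfolio arithmetic, sketched below in Python. The sketch assumes that each player’s return per dollar is independent of the others and equally uncertain, which is a simplification; the budget and the uncertainty figure are hypothetical.

```python
import math


def payroll_value_std(budget, n_players, return_std):
    """Standard deviation of the total value a team gets back when the
    budget is split equally across n_players.

    Assumes independent players with the same per-dollar return_std,
    so the variances of the individual bets simply add up.
    """
    per_player = budget / n_players
    return math.sqrt(n_players * (per_player * return_std) ** 2)


# Hypothetical $100 million budget, with each dollar's return uncertain
# by 50% (one standard deviation).
one_star = payroll_value_std(100.0, n_players=1, return_std=0.5)
four_players = payroll_value_std(100.0, n_players=4, return_std=0.5)

print(f"All on one player:   +/- {one_star:.1f}")
print(f"Split across four:   +/- {four_players:.1f}")
```

Under these assumptions, spreading the same budget across four players halves the uncertainty in total value, since the spread shrinks with the square root of the number of independent bets. Correlated outcomes, or cheaper players who are individually riskier, would weaken that effect.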

The portfolio approach and caution with budgets are evident in the spring training season. “[Teams are] going to try to tread water and then maybe make a deal, and they’re not going to spend that money” on a single high-performing player, said Harris. “You’re seeing probably eight to 10 teams with really low payrolls that are building for the future.”

Changed Business Model

Harris also saw what he called “a big doughnut hole” in the professional baseball market. “Teams are not tied to attendance [at game venues] anymore; they’re tied to their TV deal,” he said. “So, there’s no onus on them to spend that money to get to 80 [wins] when they could get those hypothetical high first-round picks coming next year.” As it happens, attendance at Major League Baseball games has been declining. Attendance in the 2018 season fell 4% from 2017, dropping below 70 million for the first time since 2003, according to a Forbes report on official statistics.

Wyner agreed that the MLB business model has changed with teams giving more importance to TV deals rather than to attendance numbers. But he didn’t appreciate the logic of that shift. “I don’t quite get it,” he said. “I love to go to the ballpark. Bryce Harper to me is a draw. I’d love to go see him play. Machado also particularly does it from the infield. These guys have got to bring people in and I don’t know what the business calculation is. It’s hard to imagine that bringing people to the ballpark doesn’t matter. I mean it does bring some revenue.”

As MLB teams introduce more analytics in their valuation of players, they also have to revisit how they structure the tenure of contracts. Wyner noted a “lengthening of the prime” with top-rated players in tennis and some other sports. However, in baseball, he saw “a regression” where players burn out earlier. Better nutrition and training don’t seem to have lengthened the careers of baseball superstars, he noted.

Meanwhile, younger players are performing better and moving up the career ladder faster than in earlier years. “There’s been enormous amounts of more production by the youngsters than there ever was,” Wyner said. Harris agreed and pointed to “a greater focus on player development.” Teams are moving away from the old-school wisdom that a player needs to account for a minimum number of innings before being promoted to the major leagues. “They’re now saying that if a guy’s successful for two months, move him. You’re seeing some guys get promoted when they are 20 or 21 years old,” he said.

Another way team owners try to spread their bets is by signing players for shorter tenures than before, some as short as one year. “If owners can get away with one-year deals, they’re going to do it,” said Harris. Players who would earlier get three- or four-year deals now have to settle for one-, two- or three-year deals, he added. “Hopefully it’s just baseball’s basic supply and demand. I think these players can get those deals back, but as of now I don’t see it coming.”

The supply-and-demand mechanism is at work in other ways, as well. The market now has fewer teams bidding for free agents than in earlier years, said Harris, implying that players have reduced bargaining power. Also, unlike in years past, few owners bring flamboyance to their deal signings. “Years ago, we used to see owners [wanting] to make a splash [in the] offseason and throw some money around,” he added. “You’re just not seeing it anymore. These teams aren’t valuing those intangible things like they used to.”

Wyner noted the absence these days of “big personalities in ownership” like the late George Steinbrenner, who owned the New York Yankees. “Steinbrenner for years set the stage,” he said. Now, without such towering personalities to lead them, team owners also tend to follow one another, he said. However, he did not think they behave like cartels. “It’s not collusion,” he said. “It’s a collective wisdom that is accumulated and it’s spreading.” That collective mindset, egged on by hard data and analytics, has led them to question the value of top-rung players, he added.

A Correction in Sight?

The declining importance of free agents “is hurting the fan base a little bit,” said Harris. “There is merit to look to these free agents. Maybe it does hamstring your eight- or 10-year deal but you can get these marquee free agents to reenergize your franchise.”

Harris is optimistic that teams will find value in star players even as their prime years recede. “I think that the pendulum will swing back,” he said. Positions such as catching, for example, are valued as players get older.

Of course, analytics can help only to a certain degree. “In order to win a championship, it’s going to cost you more and you have to recognize it; you have to pay for it,” said Wyner. “A team who wants to win should be able to find the money and cough it up.”

Conversation Starters

What influence is analytics having on professional baseball?

How is the MLB business model changing?

What does Brendan Harris mean when he says, “The premium price is being baked into the previous production, and it’s not guaranteed that you’re going to get that performance in the future”? How is big data helping to tell the story of player valuation?

2 comments on “Baseball and Big Data: How Statistics and Analytics Are Changing the Game”

  1. As computerized modeling and mathematical analytics have advanced, more and more professional sports organizations are utilizing big data to categorize and strategize investment decisions. In the context of the sports world, these investment decisions are otherwise known as player contracts. Team owners and front office staff intricately scrutinize every contract they dole out; making a successful signing could be the difference between experiencing exponential valuation growth by electrifying the fan base and descending into years of salary cap purgatory and half-empty stadiums.
    Personally, I am a fanatical NBA fan. I love watching Stephen Curry weaving in and out of defenders, splashing threes, and Giannis Antetokounmpo eurostepping into a dunk. Even though I thoroughly enjoy watching basketball games, to me, the most intriguing portion of the NBA is free agency. Every summer, there is a frenzy of transactions: teams trading star players, signing marquee free agents, and vaulting into the luxury tax by overpaying their players. While front offices during the times of lore (early to late 20th century) largely relied on basic metrics like PPG (points per game) and RPG (rebounds per game), modern-day statisticians have developed more advanced measures to evaluate player performance. Back in the 20th century, players like Wilt Chamberlain, who scored 100 points in a game, and Bill Russell, who grabbed 51 rebounds in a game, were the ones who were awarded extravagant contracts, but nowadays, NBA athletes like Draymond Green (who averaged a meager 7.4 PPG in the 2018-2019 season) are receiving $100 million contracts. The sharp change in how the NBA values players prompts one essential question: exactly how are NBA athletes who barely pass the eye test in basic metrics convincing teams to sign them to a huge payday?
    The answer lies in the advanced metrics. Let’s take a look at two current NBA players and how their basic and advanced metrics match up. Player One is Devin Booker, a guard for the Phoenix Suns who averaged 26.6 points per game during the 2018-2019 NBA season and famously scored 70 points in a game. Player Two is Draymond Green, a power forward for the Golden State Warriors who averaged 7.4 points per game last year. Judged purely on basic statistics, the more valuable player is Booker, but the advanced metrics tell a different story. Green has career win shares (a measure of the number of wins contributed by one player) of 43.3 according to Basketball Reference, while Booker’s total is only 9.8. Advanced metrics like career win shares are more accurate indicators of how valuable a player is. After all, Booker’s team has never made the playoffs, while Green’s Warriors have advanced to the NBA finals five years in a row and won three championships during that period of time. I wholeheartedly agree with Wharton statistics professor Abraham (Adi) Wyner when he says “the analytics group has made its mark.” Where basic metrics are blind to the merit of players like Draymond Green, analytics have shown how valuable such players actually are, which in turn earns them large contracts.
    Later on in the article, Professor Wyner goes on to emphasize that “What we assume that the teams should know, but never seem to get, is that you’re paying for the future, not the past.” Through this statement, Professor Wyner brings up a crucial flaw in relying on analytics. Past performance, however eye-opening, provides little indication of future athlete performance. There are injuries, accidents, and other scenarios that analytics simply cannot predict. Sometimes, players break out or experience an unprecedented period of improvement. Analytical modeling can be a useful pillar in determining contracts to award to players, but cannot be the only factor. This is especially true for baseball athletes in the MLB. As evidenced by Manny Machado’s 10-year, $300-million deal with the San Diego Padres and Giancarlo Stanton’s $325 million deal with the Miami Marlins in 2014, MLB franchises tend to hand out exceptionally profligate contracts over exceptionally long periods of time. While a player is 27 or 28 and at peak capacity, their performance might be awe-inspiring and call for a huge contract, but the value of a player performing at peak capacity is drastically different from their value during a declining age-39 season. Players like Machado and Stanton will still be on their $300 million-plus contracts well into their mid- to late 30s, even if their performance does not match their payday. As Brendan Harris states, “the premium price is being baked into the previous production, and it’s not guaranteed that you’re going to get that performance in the future.”
    Even though some players might be analytical darlings, it is important to recognize and anticipate how their value might change over time. In that sense, player contracts are much like investing in the markets. Some companies flaunt promising futures and their analytics support growth, but they suddenly decline and lose investors a lot of capital. There are also other factors involved (just like in player contracts): leadership, market viability, and demand. Another parallel to draw from market investing is the concept of diversification. Instead of paying one star player so much money, team owners should diversify their monetary resources among several promising players. If there is one piece of advice I could give to the owners of MLB, NBA, and NFL teams, it is this: always use a holistic perspective in evaluating prospects and deciding how large a contract to allocate. Analytics accurately demonstrate past accomplishments, but they do not guarantee future success.

