By: Scott The Statissassin
The topic of statistics in baseball often spawns two camps, one lauding wins and losses supplemented with more traditional statistical measurements (batting average, errors, saves, etc.) while the other side promotes complicated, vague, and often caps-lock heavy acronyms (BABIP, SIERA, and WAR to name a few). Ironically, these two sides are often summarized as loving or hating stats when the real breakdown occurs over WHICH stats are the most meaningful or useful. SPOILER ALERT - Wins and Losses are stats too! Despite finding themselves in opposition, the ultimate goal of both groups remains the same --
Part of the difficulty surrounding “advanced analytics” and the average fan comes down to accessibility. Every fan knows what RBI stands for and how it is calculated. Historically, traditional stats were printed in newspaper box scores or available on the back of baseball cards. If you played baseball as a kid, you probably received a sheet of paper during or after a season showing your own stats for that year. Analytics? Not so much. Since the Bill James revelations of the 1980s and the later popularity of Moneyball, analytics have moved increasingly to the forefront. With advanced stats often available for free online (Fangraphs, Baseball Prospectus, and Baseball Reference for starters), newer generations of fans can find themselves sorting tables of all kinds of stats they only heard of this morning.
Pitching
People STILL put a lot of emphasis on a pitcher’s win-loss record. I (kinda) get it - the point is to win games. The problem? Wins are a team statistic and not all wins are created equal. You pitch 6 innings and give up 6 runs? Your lineup can rake so you get a “W.” The guy that threw 8 innings of 1 run ball? Sorry, you have a AAA lineup and you gotta hold that “L.” Better pitching statistics measure what is actually in the pitcher’s control. A relatively well-known example is Walks plus Hits per Inning Pitched (WHIP), which simply adds up the walks and hits a pitcher allows and divides by innings pitched. Expected Fielding Independent Pitching (xFIP) is a 2000s-era tool that estimates what a pitcher’s ERA should look like from strikeouts, walks, hit batters, and fly balls, swapping actual home runs for a league-average home run rate on fly balls. That strips out luck on balls in play and takes some of the sting out of homer-happy ballparks #CoorsField. A couple others to look up: ERA+ and SIERA.
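For anyone who wants to see the arithmetic, here is a minimal sketch of WHIP in Python. The function name and the season line are made up for illustration, and it assumes innings are entered as true decimals (6 and a third innings is 6.333, not the box score's "6.1").

```python
def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """Walks plus hits allowed, divided by innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (walks + hits) / innings_pitched


# Made-up season line: 50 walks and 150 hits over 200 innings.
print(whip(50, 150, 200.0))
```

Lower is better: fewer baserunners per inning means fewer chances for the other team to score, no matter what the pitcher's own lineup does.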
Hitting
Not gonna trash RBIs, not gonna trash RBIs, not gonna trash RBIs... It’s not that RBI is a stupid stat or has no correlation to being a good hitter - a lot of good hitters do get a lot of RBIs. The problem is some good hitters play on bad teams and hit a lot of doubles after two bums strike out in front of them. Other guys can be pretty average (by professional standards) but end up with loads of RBIs thanks to nearly always hitting with guys on base. If you aren’t ready to embrace something you can’t figure out from a box score, On Base Plus Slugging (OPS) is a pretty good measure of someone’s overall hitting prowess and straddles the line between new and old school statistics. Newer stats like Weighted Runs Created Plus (wRC+) try to separate what the hitter actually did from the situation he hit in. Weighted Runs Created quantifies a player’s entire offensive output in the form of runs, which is then adjusted for league average and park effects (this is going to be a trend). An average hitter has a wRC+ of 100, while a Mike Trout throws down a 169 for his career (meaning he has created runs at a rate 69% better than league average) and Dansby Swanson was good for a 66 last year (Go Braves!). Others to look up: ISO and OPS+.
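Since OPS really can be worked out from a stat line, here is a rough sketch using the standard definitions of on-base percentage and slugging percentage. The function names and the sample hitter are invented for illustration.

```python
def on_base_pct(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base divided by the plate appearances that count toward OBP."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_pct(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(singles, doubles, triples, home_runs, walks, hit_by_pitch, at_bats, sac_flies):
    """On-base percentage plus slugging percentage."""
    hits = singles + doubles + triples + home_runs
    return (on_base_pct(hits, walks, hit_by_pitch, at_bats, sac_flies)
            + slugging_pct(singles, doubles, triples, home_runs, at_bats))


# Made-up hitter: 100 singles, 30 doubles, 5 triples, 15 homers,
# 60 walks, 5 hit-by-pitches, 500 at-bats, 5 sacrifice flies.
print(round(ops(100, 30, 5, 15, 60, 5, 500, 5), 3))
```

Notice that nothing in there depends on who was on base or who hit in front of you, which is exactly what RBI can't claim.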
Defense
Many of the offensive and pitching stats have outstanding predictability and minimal variation. This is not the case for defensive analytics, though they still tower over errors and fielding percentage when it comes to usefulness. Both Defensive Runs Saved (DRS) and Ultimate Zone Rating (UZR) measure individual player defense by attaching a “runs” value. This approach makes sense, and the ways in which it is applied should look pretty logical to most baseball fans. UZR incorporates runs saved by outfield arms, double plays, player range, and errors. DRS is similar and expands to include robbed HRs, stolen bases prevented by pitchers/catchers, and 1B/3B bunt defense. These are the little things that managers and fans appreciate but that don’t show up in a box score, now quantified and given a number. The difficulty in defensive analytics is keeping track of the angle, trajectory, and velocity of each batted ball and determining the expected zone a player should cover. Different teams and stat providers do this slightly differently, so not everyone’s numbers are exactly identical. Like a lot of traditional stats, it’s not that errors aren’t bad for your team but that there are much better statistics to describe someone's defensive prowess (Andrelton Simmons) or lack thereof (Derek Jeter).
Overall
Both hitting and defensive measurements tend to think of production in terms of runs, with hits creating runs and defensive plays saving runs. Runs in turn can be related to wins (more runs scored means more wins, and fewer runs given up means more wins). Wins Above Replacement (WAR) and Value Over Replacement Player (VORP) are two similar ways of doing this. WAR has become increasingly popular and is now listed front and center when you pull up a player’s stats on ESPN.
WAR works off the idea that roughly 10 runs created/saved is worth 1 win, but because it folds in the noisier defensive stats, WAR is not the most precise of advanced stats. If Player A has a WAR of 4 and Player B has a WAR of 5, the conclusion is more along the lines of “these are comparably good players” rather than “Player B is superior.” Where WAR really shines is over the course of years or an entire career, and it gets brought up a lot when discussing players’ HOF credentials or lack thereof.
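To make the 10-runs-per-win idea concrete, here is a toy sketch. It is not how any site actually computes WAR; the function names are hypothetical, and the one-win "comparable" margin is my own assumption, picked only to match the 4 versus 5 example above.

```python
RUNS_PER_WIN = 10.0  # rule of thumb from the text; the real figure drifts by season


def runs_to_wins(runs_above_replacement: float, runs_per_win: float = RUNS_PER_WIN) -> float:
    """Convert runs created/saved above a replacement player into wins."""
    return runs_above_replacement / runs_per_win


def comparably_good(war_a: float, war_b: float, margin: float = 1.0) -> bool:
    """Treat two single-season WARs as a toss-up if they sit within `margin` wins.

    The default margin is an illustrative assumption, not an official error bar.
    """
    return abs(war_a - war_b) <= margin


print(runs_to_wins(45.0))         # 45 runs above replacement is about 4.5 wins
print(comparably_good(4.0, 5.0))  # within a win of each other
```

Stretch the comparison over a decade, though, and a one-win gap every year adds up to something real, which is why career WAR carries more weight than any single season.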
Team
As pitchers and catchers report, hope literally(ish) springs eternal. If you’re like me, you look at a few sources to see predicted records for every team (and maybe stop scrolling after you see your team). So, if you are tired of reading predictions from the sports journalist that definitely hates your team while maintaining unabashed love for your rivals, try out Player Empirical Comparison and Optimization Test Algorithm (PECOTA). PECOTA looks at how good each individual player is and how much each player will play. This turns into a projection of team runs scored and allowed, which then relates to projected team wins and losses. This approach is almost always better than Joe Sportswriter’s take on how each team’s season will play out. PECOTA isn’t perfect - in recent seasons it has underestimated teams like Baltimore and Kansas City, which leaned heavily on outstanding bullpens and relatively de-emphasized starting pitching (deviating from the expected playing time for starters and relievers). And even when the projections avoid the systemic issues, you always have randomness in injuries and teams/players that experience growing pains or take a big unexpected leap in a given season. Scoffing at preseason prognostications is a time-honored tradition, so feel free to continue this hallowed ritual with PECOTA. Just know, it’s probably a better prediction than the optimistic ones you and your friends have about your team this year...
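The last step, going from projected runs to projected wins, can be sketched with Bill James's Pythagorean expectation. To be clear, this is a stand-in I'm assuming for illustration, not PECOTA's actual machinery, and the exponent of 2 and the run totals are just the textbook version and made-up numbers.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float, exponent: float = 2.0) -> float:
    """Estimate winning percentage from runs scored and allowed.

    Uses the classic Pythagorean expectation; the exponent is an assumption
    (2 is the original form, and other values are sometimes used).
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def projected_wins(runs_scored: float, runs_allowed: float, games: int = 162) -> float:
    """Scale the estimated winning percentage to a full season."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed)


# Hypothetical projection: 750 runs scored, 650 allowed.
print(round(projected_wins(750, 650), 1))
```

A team that scores exactly as many runs as it allows comes out at 81 wins, and every extra run on either side of the ledger nudges the total from there.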
TL;DR
Not all stats are created equal, and advanced stats are often better than traditional methods for evaluating past performance and gaining insight into future production. Everyone says they want to win, but if you are ignoring analytics you are starting with a disadvantage. And remember, advanced stat X is a conspiracy that definitely hates your favorite team/player Y!