Is Yankee Stadium a Little League Ballpark?

Introduction

As a diehard Yankees fan, I can say that the 2022 season has been the one Yankees fans have wanted since 2008. As of this writing, they have the best record in baseball (by 5 games) at 52 wins and 22 losses, and are ahead of their division rival, the 2nd place Red Sox, by 13 games. On pace for one of the greatest seasons in MLB history, they also have the best player in the league in Aaron Judge, and the best pitching staff, led by Gerrit Cole, (my personal favorite) Nestor Cortes, and Clay Holmes.

However, haters gon’ hate, and one of those haters is the manager of the Texas Rangers (currently at 37W - 41L), Chris Woodward. After Gleyber Torres hit a walk-off home run to win a game at home against the Rangers on May 8th, Woodward told reporters that the Torres hit “is an easy out in 99% of ballparks. ... He just happened to hit it in a Little League ballpark...”. Woodward was referencing the short right field at Yankee Stadium, over which Gleyber hit his home run. He later walked back his comments, maybe because Statcast projections showed that the home run would’ve been a home run in 25 ballparks, or maybe because Yankees manager Aaron Boone pointed out in a press appearance that Woodward’s 99% figure wasn’t mathematically sound, since there are only 30 Major League teams.

The short porch at Yankee Stadium is still a topic of heavy debate. Sitting at just 314 feet down the right field line and staying relatively shallow towards center field, it offers an easy target for lefties who can pull the ball (such as Anthony Rizzo), or righties who have a swing that favors power in the opposite direction (like Judge or Giancarlo Stanton). This results in a few home runs that probably shouldn’t be, like this home run by Rizzo from just a week ago, which Statcast data shows would have been a home run only at Yankee Stadium.

In an effort to defend my team, I wanted to quantify the comparative effect of ballpark boundaries on hitting home runs. I use Statcast data from the 2021 season, and rely heavily on the math and ballpark data compiled by Dan Morse (danmorse34 on GitHub), owner of the Twitter-famous “would it dong” bot that projects the number of ballparks in which a given home run would still be a home run.

The Data and Methodology

We use the Python library PyBaseball to import scraped Statcast data from every at-bat in the 2021 season. The Statcast data gives us launch speed, launch angle, direction, and projected travel distance: everything we need to predict home run outcomes in other parks. For every ball hit over 300 feet, we check whether it is a projected home run in each of the 30 ballparks. We then tally up the total number of home runs per park and rank the parks against one another.
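The tallying step can be sketched roughly as follows. This is a minimal illustration, not the code behind the results: the `distance_ft` key, the `parks` mapping, and the `is_home_run` callable are all hypothetical stand-ins for the real Statcast columns, fence data, and projection logic.

```python
from collections import Counter


def tally_projected_home_runs(batted_balls, parks, is_home_run, min_distance_ft=300):
    """Count projected home runs per park and return parks ranked by total.

    batted_balls: iterable of dicts with a 'distance_ft' key (hypothetical name)
    parks:        dict mapping park name -> whatever fence data is_home_run needs
    is_home_run:  callable(ball, park_data) -> bool (hypothetical projection)
    """
    totals = Counter({name: 0 for name in parks})
    for ball in batted_balls:
        # Only balls hit over the distance cutoff are worth projecting.
        if ball["distance_ft"] <= min_distance_ft:
            continue
        for name, park_data in parks.items():
            if is_home_run(ball, park_data):
                totals[name] += 1
    # most_common() sorts parks from most to fewest projected home runs.
    return totals.most_common()
```

With a toy rule such as "the ball is a home run if its distance exceeds a single fence distance", two parks with fences at 320 and 400 feet and balls hit 290, 350, and 410 feet would give the first park 2 home runs and the second park 1.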

An Aside on the Physics

I rely on the formulas that danmorse34 uses in the dinger machine calculations, but after running into some imprecision in my results, I decided to dive deeper into his methodology. In basic projectile physics, you learn that the vertical and horizontal components of velocity act separately. A formula with vertical velocity and vertical acceleration (gravity) gives you the flight time, and a formula with horizontal velocity, horizontal acceleration, and flight time gives you the flight distance.

Statcast doesn’t report horizontal acceleration, but since it does give us an estimated distance, we can back out the acceleration and assume it holds constant. Using all of this, we calculate the time it would take the ball to reach the nearest fence in the ballpark we are projecting in, calculate the height of the ball at that time, and check whether the ball is higher than the wall to determine if it is a home run.
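A minimal sketch of that calculation is below. It is my own reading of the steps described above rather than danmorse34's actual code, and it assumes the ball is launched from and lands at ground level, that gravity is the only vertical force, and that the function and parameter names are purely illustrative.

```python
import math

G_FT_S2 = 32.174             # gravitational acceleration in ft/s^2
MPH_TO_FT_S = 5280 / 3600    # one mile per hour expressed in ft/s


def clears_fence(launch_speed_mph, launch_angle_deg, statcast_distance_ft,
                 fence_distance_ft, wall_height_ft):
    """Return True if the ball is projected to be above the wall at the fence."""
    v = launch_speed_mph * MPH_TO_FT_S
    theta = math.radians(launch_angle_deg)
    vx, vy = v * math.cos(theta), v * math.sin(theta)
    if vy <= 0:
        return False  # a ball hit flat or downward never clears a wall

    # Vertical motion alone gives the flight time (assumed launch height of 0).
    flight_time = 2 * vy / G_FT_S2

    # Solve D = vx*T + 0.5*ax*T^2 for the constant horizontal acceleration ax.
    ax = 2 * (statcast_distance_ft - vx * flight_time) / flight_time ** 2

    # Solve 0.5*ax*t^2 + vx*t - fence_distance = 0 for the first positive t.
    if abs(ax) < 1e-9:
        t_fence = fence_distance_ft / vx
    else:
        disc = vx ** 2 + 2 * ax * fence_distance_ft
        if disc < 0:
            return False  # the ball stalls horizontally before the fence
        t_fence = (-vx + math.sqrt(disc)) / ax

    # Height of the ball at the moment it reaches the fence.
    height_ft = vy * t_fence - 0.5 * G_FT_S2 * t_fence ** 2
    return height_ft > wall_height_ft
```

For example, a ball hit at 105 mph and 28 degrees with a Statcast distance of 410 feet clears an 8-foot wall at 314 feet under these assumptions, but not one at 420 feet.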

Results

Below are our results from running the simulations on the Statcast data. HRs per game is the average number of home runs per game if every team played all 162 of its games at the given stadium.
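As a rough sketch of the arithmetic behind these columns: the per-game figure divides a park's projected total by the number of games in a season, and the standardized value compares each park to the league mean. The code assumes a nominal full schedule of 2,430 games and a population standard deviation; both are my assumptions, as are the function names.

```python
from statistics import mean, pstdev

# Nominal full schedule: 30 teams x 162 games, two teams per game (assumption).
GAMES_PER_SEASON = 30 * 162 // 2


def hrs_per_game(total_projected_hrs):
    """Average projected home runs per game for one park."""
    return total_projected_hrs / GAMES_PER_SEASON


def standardize(value_by_park):
    """Convert each park's value to standard deviations from the mean."""
    mu = mean(value_by_park.values())
    sigma = pstdev(value_by_park.values())  # population std dev (assumption)
    return {park: (value - mu) / sigma for park, value in value_by_park.items()}
```

A park with 4,860 projected home runs would come out at exactly 2.0 HRs per game, and a park whose standardized value is above 1 sits more than one standard deviation above the league average.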

Some notable standouts (on the low end) are the Kansas City Royals, the Arizona Diamondbacks, and the Colorado Rockies. The Royals have one of the deepest outfields in baseball, with 330 ft. down the lines quickly becoming 380 ft., giving them the 2nd largest outfield in baseball (according to ballparkpal.com). The Rockies play at a mile above sea level (something our model doesn’t account for), so their park has deeper walls to compensate for the lower air resistance. In fact, stats show that Coors Field has the highest HR park factor of any park. The Diamondbacks similarly have unique weather factors that our model doesn’t account for: their dry, warm, relatively high altitude climate means balls fly farther, so the stadium has deeper walls to compensate.

On the other end of the spectrum, the Cincinnati Reds, the Los Angeles Dodgers, and the Seattle Mariners all sit over one standard deviation above the mean for HRs per game. The Reds’ Great American Ball Park has the 2nd highest HR park factor in MLB (according to ESPN), with below average distances to most walls in the park making it a hitter friendly stadium. Dodger Stadium has one of the shortest center fields in baseball at just 395 feet, which, combined with the 4-foot walls at the corners, makes it hitter friendly as well. The Mariners’ park is the 5th smallest in baseball, which compensates for its MLB-worst flight times due to cold temperatures, high humidity, and low elevation in Seattle.

As for Yankee Stadium, we see that it is middle of the pack in terms of HRs per game, sitting just a little above average. Its short porch in right is offset by its incredibly deep left field, and the park is also measured as having the second lowest carry distance (behind only the Mariners’ park). It is also 17th in HR park factor, further demonstrating that Yankee Stadium is not, in fact, a Little League ballpark.

Assumptions and Caveats

We have to make a lot of assumptions to reach our final results, both in the quality of our data and in the physics we use in our projections.

  • The biggest assumption our model makes is that balls fly the same in each ballpark and from at-bat to at-bat (which we know they don’t). Factors like elevation, humidity, temperature, and wind change from location to location, and even at a single ballpark throughout the season.
  • We validated our method for projecting home runs by checking whether our model agreed with the actual outcome of the at-bat, and found error rates ranging from 1% to 22%, depending on the ballpark. This can be due to a variety of factors, from bad Statcast data to bad measurements in our fence data, but we correct for the error rates in our final totals.
  • As mentioned above, we assume that the Statcast data is accurate (particularly distance), but it often is not (as pointed out by some Reddit users). Statcast uses a more advanced model than we do to project distance, but any imprecision in the distances for close home runs / long fly balls can create imprecision in our projections.
  • We assume an even distribution of home runs across stadiums, which wouldn’t impact the ranking of the parks (or the standardized values), but would impact the relative magnitude of the avg. home run value.
  • We also assume an even distribution of spray angle across the season / parks. Intuition says that more balls are hit hard to the left side than the right, which would hide / exaggerate the effect of asymmetrical ballparks.
  • The last big assumption we make is that hitter strategy / lineups are independent of ballparks. This is obviously not true: we know that hitters will adjust to the ballpark, and managers will play batters whose ball dispersion plays particularly well in a given stadium.
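The validation check in the second bullet can be sketched as a per-park disagreement rate between the model and the real outcome. The record layout here is hypothetical, and since the post does not spell out how the totals are corrected, the sketch stops at measuring the error.

```python
from collections import defaultdict


def error_rate_by_park(records):
    """Share of batted balls per park where the projection missed.

    records: iterable of (park, predicted_hr, actual_hr) tuples, where the
    park is the one the ball was actually hit in (hypothetical layout).
    """
    counts = defaultdict(lambda: [0, 0])  # park -> [mismatches, total]
    for park, predicted_hr, actual_hr in records:
        counts[park][0] += predicted_hr != actual_hr
        counts[park][1] += 1
    return {park: misses / total for park, (misses, total) in counts.items()}
```

A park where the model gets one of two balls wrong would show an error rate of 0.5, while a park where every projection matches shows 0.0.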

The most accurate way to interpret these results is as the relative impact of ballpark dimensions, irrespective of location. All else equal (wind, humidity, elevation, etc.), this is how each park would stack up. A statistic that accounts (implicitly) for climate is park factor, which looks at the relative occurrence of stats at home and away. However, the accuracy of this statistic can suffer from random variations in team performance, inconsistencies in lineups / pitching rotations throughout the season, and the comfort level of specific teams at home and on the road. Park factor is undoubtedly more accurate in real world applications, but our calculations show that Yankee Stadium is nowhere near the worst when it comes to dimensions.
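For reference, one common formulation of a home run park factor divides the home run rate in a team's home games by the rate in its road games, counting home runs hit by both teams. The sketch below follows that formulation as an assumption; published park factors may differ in scaling or in how they smooth across seasons.

```python
def hr_park_factor(home_hr, home_games, road_hr, road_games):
    """Ratio of HRs per game at home to HRs per game on the road.

    home_hr and road_hr count home runs by both the team and its opponents.
    A value above 1.0 suggests the home park favors home runs.
    """
    return (home_hr / home_games) / (road_hr / road_games)
```

With made-up numbers, 220 home runs in 81 home games against 200 in 81 road games gives a factor of about 1.1.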
