Peeking Under the Hood of an NFL Expected Points Model

visualizations
NFL
sports analytics
Published

July 28, 2023

Photo by Jason Buscema on Unsplash

I was looking for an excuse to make some data visualizations to test how the interactive features of Plotly figures work on Quarto. And, given that football season fast approaches, why not illustrate what goes into (and comes out of) an NFL expected points model?

I’ll use an established model that predicts the expected points given the specific circumstances of a play before it starts. Those circumstances were linked to historical data on what the next scoring event ended up being (e.g. a TD by the possessing team, a FG by the opposing team, or the half ending). The expected points are essentially the sum of the points from each outcome multiplied by the estimated probability of it happening. The model was made available by the prolific and tireless producers of the nflverse R package. As support, they’ve provided ample documentation, and even a handy function to quickly output the expected points given all the terms. I also think that NFELO provides a nice overview of how expected points are used, so you can check them out.
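To make the "sum of points times probability" idea concrete, here is a minimal Python sketch. The outcome names, the point values (7 for a TD, 3 for a FG, 2 for a safety, negated for the opposing team), and the probabilities are my own assumptions for illustration; they are not pulled from the model's output.

```python
from typing import Mapping

# Assumed point value of each "next score" outcome, from the possessing
# team's perspective. Illustrative values, not taken from the model.
POINTS = {
    "td": 7.0,
    "fg": 3.0,
    "safety": 2.0,
    "opp_td": -7.0,
    "opp_fg": -3.0,
    "opp_safety": -2.0,
    "no_score": 0.0,
}


def expected_points(probs: Mapping[str, float]) -> float:
    """Return the probability-weighted sum of the next-score point values."""
    total = sum(probs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"probabilities should sum to 1, got {total}")
    return sum(POINTS[outcome] * p for outcome, p in probs.items())


# Made-up probabilities for a single hypothetical play.
example_probs = {
    "td": 0.35,
    "fg": 0.20,
    "safety": 0.005,
    "opp_td": 0.15,
    "opp_fg": 0.10,
    "opp_safety": 0.005,
    "no_score": 0.19,
}

print(round(expected_points(example_probs), 2))
```

With these made-up numbers, the positive outcomes outweigh the negative ones, so the possessing team comes out ahead in expectation.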

Below, I walk through a number of predictors in the model, holding others steady, to get an impression of how each impacts the model predictions.
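The "vary a few predictors, hold the rest steady" approach can be sketched roughly as below. Everything here is hypothetical: `toy_ep` is a placeholder scoring function and not the nflverse model, and the field names and baseline values are only assumed for the example.

```python
from itertools import product


def toy_ep(situation):
    """Placeholder predictor, NOT the real model: value rises near the goal line."""
    return 6.0 - 0.07 * situation["yardline_100"]


# Assumed baseline situation; every predictor not being varied stays at this value.
BASELINE = {
    "down": 1,
    "ydstogo": 10,
    "yardline_100": 75,              # yards from the opponent's end zone
    "half_seconds_remaining": 1800,
}


def sweep(predict, baseline, varied):
    """Score every combination of the varied predictors, holding the rest fixed."""
    names = list(varied)
    rows = []
    for combo in product(*(varied[name] for name in names)):
        situation = {**baseline, **dict(zip(names, combo))}
        rows.append({**situation, "ep": predict(situation)})
    return rows


# Vary field position and time remaining; down and distance stay at baseline.
grid = sweep(
    toy_ep,
    BASELINE,
    {"yardline_100": [25, 50, 75], "half_seconds_remaining": [60, 1800]},
)
for row in grid:
    print(row)
```

Each row of `grid` is one cell of a heatmap-style slice: swapping `toy_ep` for a real prediction function would produce the kind of values plotted in the figures.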

PLAY-LEVEL FACTORS

Game Time and Field Position

Figure 1. Heatmap

Figure 2. Surface Plot

Figure 3. Line plot

Cursory Observations
  • Generally, expected points don’t drop off precipitously until very late in the half
  • Expected points for the road team remain fairly stable over the majority of the half; for the home team, they start higher and decline slightly over time

Down

Figure 4. Dumbbell plot

Cursory Observation
  • Down-to-down differences in expected points are fairly stable no matter where on the field you compare

Timeouts

Figure 5. Heatmap

Cursory Observations
  • Early timeout usage by the defense increases expected points for the offense early in the game, but tends to reduce them in late-game situations
  • An offense with few timeouts late in the game is predicted to be more likely to score
  • Timeouts could be a stand-in for game script: a team using its timeouts may be a sign that it’s actively trying to score, especially at the end of the game if it’s still in contention

GAME-LEVEL FACTORS

Eras

Figure 6. Line plot

Cursory Observations
  • Older seasons are predicted to have slightly lower EP
  • Season differences in EP are more pronounced for away team possessions

Stadium Roof

Figure 7. Lollipop plot

Cursory Observation
  • Outdoor games reduce EP for the possessing team, while dome games increase it

That’s a tour of what I saw when taking slices of the model in different directions. If you have some feedback or questions, you can certainly send them my way. My contact info should be in the upper right corner. Hope you enjoyed!