NBA Fantasy analytics with Python on ESPN

How to prepare your fantasy team on ESPN NBA Fantasy.

Three years ago I got into the realm of NBA fantasy with the help of a friend. I love basketball, but I used to follow mostly European competitions, e.g. the Euroleague. Although I was very familiar with the NBA stars, I lacked the depth of knowledge required to compete in an NBA fantasy league. I teamed up with a friend who is an NBA expert, thinking I could learn a great deal about the second unit of NBA players, those that win head-to-head fantasy match-ups.

We entered a league with other friends on the ESPN platform. Our league is a head-to-head league based on nine basic statistical categories: FG%, FT%, number of three-pointers, points, rebounds, assists, blocks, steals and turnovers. The rules are simple. Each fantasy team consists of a selection of NBA players, usually eleven. In each round (which usually lasts a week), a fantasy team faces another fantasy team. Each team accumulates a total in each statistical category based on its NBA players’ performance. For example, the team’s points are the sum of the points scored by its NBA players in their actual games. The remaining statistical categories work in the same way. Every day, only eight players (or nine, depending on the league options) are allowed to contribute; the rest sit on the bench. At the end of the round, the fantasy team with the highest total in a statistical category (or the lowest, in the case of turnovers) wins the category.

After a couple of rounds, it became clear that the NBA schedule plays an important role in a fantasy match-up. The more games a fantasy team’s players play during the week, the more likely they are to score more points, grab more rebounds, etc.

Also, at the end of each round, there were always the same questions. Did we lose because of bad luck? Or did we win because of an advantageous schedule? Should we replace a player even if we won, or should we keep the team unchanged even if we lost?

I started answering these questions manually, looking into statistics, browsing the NBA schedule, comparing stats between different teams. But, in the age of automation, I decided to automate this type of repetitive task.

I started with the very cryptic ESPN fantasy API. There is no official documentation, but this blog is a very good starting point. After familiarising myself with the endpoints and the structure, I started developing the method required for answering my questions.

The project has been developed in Python and can be accessed on GitHub. It consists of two notebooks, one for analysing the upcoming head-to-head matchup and the other for analysing a completed round.

Head to Head Analysis

Prior to a head-to-head matchup, a fantasy player needs to know the following details:

  • How many games each fantasy team is playing. We need to get as many players as possible on the court during the round.

The number of games is an important piece of information because many categories are aggregate totals of raw statistics, such as rebounds, assists and points; hence, the more games, the merrier. In addition, players who do not fit in the daily line-up and are forced to stay on the bench on a given day are a waste of resources. The user should consider making changes to optimise the schedule of his/her players during the matchup.

The number of games per day during the round is presented in the following form. Adding up the valid players over all days gives the total number of games: in this example, Team A has 38 games during the round against 41 for Team B.

Number of players playing each day for Team A and Team B. We count the valid players and also the unused substitutions for each team.
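The daily count can be sketched as follows. This is a minimal illustration rather than the notebook's code: the `daily_usage` helper, the input layout (a mapping from each day to the rostered players who have a game that day) and the cap of eight active slots are assumptions, and "unused substitutions" is interpreted here as players who have a game but do not fit into the active slots.

```python
from typing import Dict, List, Tuple

ACTIVE_SLOTS = 8  # assumed daily limit; some leagues allow nine


def daily_usage(games_by_day: Dict[str, List[str]],
                active_slots: int = ACTIVE_SLOTS) -> Dict[str, Tuple[int, int]]:
    """Return {day: (valid players, unused substitutions)} for one fantasy team."""
    usage = {}
    for day, players in games_by_day.items():
        playing = len(players)
        valid = min(playing, active_slots)       # players who can actually count
        unused = max(playing - active_slots, 0)  # players with a game left on the bench
        usage[day] = (valid, unused)
    return usage


# Hypothetical three-day schedule for one fantasy team.
schedule = {
    "Mon": ["P1", "P2", "P3", "P4", "P5"],
    "Tue": ["P%d" % i for i in range(1, 11)],  # ten players have a game
    "Wed": [],
}

usage = daily_usage(schedule)
total_valid = sum(valid for valid, _ in usage.values())
total_unused = sum(unused for _, unused in usage.values())
print(usage)
print("valid games:", total_valid, "unused substitutions:", total_unused)
```

Summing the first element over the days gives the team's total number of valid games for the round, which is the figure compared between the two teams.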

Furthermore, we project the matchup’s outcome using two methods.

Naïve Projection
The naïve projection is based on each fantasy team’s cumulative statistics throughout the season. This method has two drawbacks. First, the current players in a fantasy team might have changed considerably compared to the players that achieved these statistics, due to trades or weekly roster changes. Second, the NBA schedule plays a crucial role in the outcome of a head-to-head matchup: more games imply more chances to get rebounds, assists, points, etc. The naïve method considers neither the fantasy teams’ current rosters nor their schedules. The output looks as follows.

Output of the naïve projection method
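One way to picture the naïve projection is a category-by-category comparison of the two teams' season-long numbers. The sketch below is an assumption about how such a comparison could look, not the notebook's implementation: the `naive_projection` function, the category labels and the example figures are all hypothetical, and it assumes both teams' cumulative statistics cover the same number of rounds.

```python
LOWER_IS_BETTER = {"TO"}  # turnovers: the lower value wins


def naive_projection(home: dict, away: dict) -> dict:
    """Project the winner of each category from cumulative season statistics."""
    outcome = {}
    for cat, h in home.items():
        a = away[cat]
        if cat in LOWER_IS_BETTER:
            h, a = -h, -a  # flip the sign so that "greater" always means "better"
        if h > a:
            outcome[cat] = "home"
        elif a > h:
            outcome[cat] = "away"
        else:
            outcome[cat] = "tie"
    return outcome


# Hypothetical cumulative statistics for two fantasy teams.
home_season = {"PTS": 5200, "REB": 2100, "AST": 1200, "TO": 640, "FG%": 0.471}
away_season = {"PTS": 5050, "REB": 2230, "AST": 1200, "TO": 610, "FG%": 0.465}

projection = naive_projection(home_season, away_season)
print(projection)
```

Because the inputs are season aggregates, the result is identical whatever the upcoming schedule or current roster looks like, which is exactly the limitation described above.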

Simulation
We run a simulation which replicates 100,000 realisations (repetitions) of the matchup by sampling from the players’ mean statistics and following their schedule throughout the week.

The raw statistical categories, such as rebounds, assists and points, are sampled from a Poisson distribution whose mean is the player’s seasonal mean for that category. For FG% and FT%, we sample from a Gaussian distribution whose mean is the player’s mean FG% or FT% respectively, with a standard deviation of 0.2 times that mean.

We sample for each player in the roster and each game in the schedule during the period of the matchup (usually a week). Injured players are ignored throughout the matchup, and day-to-day players are treated as available. We simulate both teams. Next, we aggregate the results at the end of the period (taking the sum for the raw statistical categories and the mean for the percentage ones). We repeat the sampling 100,000 times so that the estimated probabilities are stable from run to run: with 100,000 repetitions, the standard error of an estimated probability is at most about 0.0016.
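The sampling and aggregation steps can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the notebook's code: the `simulate_team` function, the roster layout (one dict per healthy player with per-game seasonal means and the number of games in the round) and the category labels are hypothetical. The example uses fewer repetitions than the 100,000 mentioned above to keep it quick.

```python
import numpy as np

COUNTING = ["PTS", "REB", "AST", "STL", "BLK", "3PM", "TO"]  # assumed labels
PERCENT = ["FG%", "FT%"]


def simulate_team(roster, n_sims=100_000, pct_sd_factor=0.2, seed=None):
    """Simulate one fantasy team's round.

    roster: list of dicts with "means" (category -> seasonal per-game mean)
            and "games" (games played in the round); injured players are
            simply left out of the list.
    Returns {category: array with one aggregated value per repetition}.
    """
    rng = np.random.default_rng(seed)
    totals = {cat: np.zeros(n_sims) for cat in COUNTING}
    pct_draws = {cat: [] for cat in PERCENT}

    for player in roster:
        games = player["games"]
        if games == 0:
            continue
        for cat in COUNTING:
            # One Poisson draw per repetition and per game, summed over games.
            draws = rng.poisson(player["means"][cat], size=(n_sims, games))
            totals[cat] += draws.sum(axis=1)
        for cat in PERCENT:
            mean = player["means"][cat]
            # Gaussian draw with standard deviation 0.2 * mean, as in the text.
            pct_draws[cat].append(
                rng.normal(mean, pct_sd_factor * mean, size=(n_sims, games)))

    for cat in PERCENT:
        if pct_draws[cat]:
            # Percentage categories: mean over all sampled player-games.
            totals[cat] = np.concatenate(pct_draws[cat], axis=1).mean(axis=1)
        else:
            totals[cat] = np.full(n_sims, np.nan)  # nobody played this round
    return totals


# Hypothetical two-player roster.
example_roster = [
    {"games": 4, "means": {"PTS": 25, "REB": 7, "AST": 6, "STL": 1.2,
                           "BLK": 0.5, "3PM": 2.5, "TO": 3.0,
                           "FG%": 0.50, "FT%": 0.85}},
    {"games": 3, "means": {"PTS": 10, "REB": 9, "AST": 2, "STL": 0.8,
                           "BLK": 1.5, "3PM": 0.5, "TO": 1.5,
                           "FG%": 0.45, "FT%": 0.70}},
]

sim = simulate_team(example_roster, n_sims=10_000, seed=42)
print("mean simulated points:", sim["PTS"].mean())
```

Running the same function for the opponent yields a second set of arrays, and the two sets can then be compared repetition by repetition.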

Finally, we estimate each team’s chance of winning a statistical category as the ratio of the winning repetitions to the total number of repetitions. For example, if the home team wins the rebounds category 60,000 times out of 100,000, its probability of winning this category is 0.6 (a 60% chance), and hence the away team’s probability is 0.4 (a 40% chance). The output of this method is shown below.

Winning probability for each category for the home team.
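The ratio described above is simple to compute once both teams have one simulated value per repetition. The helper below is an illustrative sketch: `win_probability` and its `lower_is_better` flag are hypothetical names, and the five-repetition input is made up purely to keep the arithmetic visible.

```python
def win_probability(home_totals, away_totals, lower_is_better=False):
    """Share of repetitions in which the home team wins the category."""
    if len(home_totals) != len(away_totals):
        raise ValueError("both teams need the same number of repetitions")
    if lower_is_better:  # e.g. turnovers
        wins = sum(h < a for h, a in zip(home_totals, away_totals))
    else:
        wins = sum(h > a for h, a in zip(home_totals, away_totals))
    return wins / len(home_totals)


# Five hypothetical repetitions of the rebounds category.
home_reb = [50, 42, 47, 55, 40]
away_reb = [45, 44, 47, 50, 41]

p_home = win_probability(home_reb, away_reb)
p_away = win_probability(away_reb, home_reb)
print(p_home, p_away)
```

In this sketch a tied repetition counts as a win for neither side, so when ties occur the two probabilities add up to slightly less than 1. Treating the away probability as one minus the home probability, as in the example above, amounts to ignoring ties.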

Post Round Analysis

Very often, after a fantasy round, players would like to know how they did compared to the entire league. A fantasy team might have lost a matchup, but this could have happened for a number of reasons:

  • Good team, but the opponent was marginally better

Also, how would a fantasy team perform compared to all other teams that week?

To answer the above questions, we perform the following analyses in the second notebook.

  1. Aggregate statistical categories for the week for each fantasy team, including total minutes and games played. The minutes and games played are important to gauge whether a team’s performance in a category, say rebounds, is due to too many or too few games and minutes played by the players in the roster in that round (on average, the more games played during the round, the more chances to score more points, grab more rebounds, give more assists, etc.).
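The aggregation step can be sketched in plain Python. This is an assumption-laden illustration, not the notebook's code: `aggregate_round` and the box-score layout (one dict per player-game, tagged with the fantasy team) are hypothetical, and only counting categories are summed here, since percentage categories would need made and attempted shots.

```python
from collections import defaultdict


def aggregate_round(box_scores):
    """Sum per-game lines into weekly totals per fantasy team.

    box_scores: iterable of dicts with a "team" key plus numeric fields
                (minutes and counting categories), one dict per player-game.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for line in box_scores:
        team_totals = totals[line["team"]]
        team_totals["GP"] += 1  # every line is one game played
        for key, value in line.items():
            if key != "team":
                team_totals[key] += value
    return {team: dict(values) for team, values in totals.items()}


# Hypothetical player-game lines for a past round.
box_scores = [
    {"team": "Team A", "MIN": 34, "PTS": 25, "REB": 7},
    {"team": "Team A", "MIN": 28, "PTS": 12, "REB": 10},
    {"team": "Team B", "MIN": 36, "PTS": 30, "REB": 4},
]

weekly = aggregate_round(box_scores)
print(weekly)
```

Keeping `GP` and `MIN` next to the category totals makes it easy to see whether a high rebounds total, say, simply reflects more games and minutes.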

The output of each task is shown below.

Statistical categories, including total minutes and games played, for all teams for a past round.
Index of the statistical categories above. A team that ranked first in, say, points has index 1 for points.
All possible head-to-head matchups for a past round. The home team is in the rows, the away team is in the columns. If the home team won 6–3, then the score is 3; if it lost 2–7, the score is -5. Tied scores like 4–4–1 or 3–3–2 are shown as 0.
Number of matchups won by each team in a given round. This is an 18-team league. The second column shows the percentage of matchups won, and the third column shows the average score difference over all matchups.
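The last two tables can be derived from the weekly totals alone. The sketch below shows one way to do it; `matchup_score`, `all_matchups` and `summarise` are hypothetical names, and the three-team, three-category data set is invented to keep the example short.

```python
LOWER_IS_BETTER = {"TO"}  # turnovers: the lower value wins


def matchup_score(home, away):
    """Categories won minus categories lost by the home team."""
    score = 0
    for cat, h in home.items():
        a = away[cat]
        if cat in LOWER_IS_BETTER:
            h, a = -h, -a
        score += (h > a) - (h < a)  # +1 for a win, -1 for a loss, 0 for a tie
    return score


def all_matchups(stats):
    """Score of every team (rows) against every other team (columns)."""
    return {home: {away: matchup_score(stats[home], stats[away])
                   for away in stats if away != home}
            for home in stats}


def summarise(matrix):
    """Per team: matchups won, percentage won, average score difference."""
    table = {}
    for team, scores in matrix.items():
        wins = sum(s > 0 for s in scores.values())
        table[team] = (wins,
                       100 * wins / len(scores),
                       sum(scores.values()) / len(scores))
    return table


# Hypothetical weekly totals for a three-team league.
weekly_stats = {
    "A": {"PTS": 500, "REB": 200, "TO": 60},
    "B": {"PTS": 480, "REB": 220, "TO": 70},
    "C": {"PTS": 510, "REB": 180, "TO": 60},
}

matrix = all_matchups(weekly_stats)
table = summarise(matrix)
print(matrix)
print(table)
```

With nine categories, a 6–3 win gives a score of 3 and a 2–7 loss gives -5, matching the convention in the caption above.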

Conclusion

With the above tools, we can explore the potential of our team, predict the outcome of an upcoming matchup and make the necessary changes in the line-up to optimise our team’s performance.

These tools are for exploration. Domain knowledge is equally important. Knowing that the starter of an NBA team is injured, and hence that a bench player will get more minutes on the floor, is extremely valuable. Combining the two can result in a successful fantasy team.

There are many more things to be done. A straightforward question is what happens to the projected statistics if we replace one player with another. Or, on the last day of a round, should we invest in a player who dominates in statistic A or in statistic B to win the matchup? These questions are still work in progress.

Data Scientist, DPhil (PhD) in Theoretical Physics, BSAC diver, evolution, raspberry pi, drone & LEGO enthusiast, fan of basketball and analytics