Success index of NBA teams using PLS Path Modelling

There are a few ways to assess how good a team is. In this post I use Partial Least Squares Path Modelling (PLS-PM) to construct a team success index for NBA teams during the 2015-16 Playoffs.

PLS-PM can be described as a tool for analysing the relationships between blocks of variables, taking into account some prior knowledge about the phenomenon observed. In most competitive team sports, like basketball, success depends on a large number of variables. However, most of these variables can be grouped under two major blocks: offense and defense.


Visualising the 2015 NBA Draft in R

This visualisation in R displays the origins and destinations of players participating in the 2015 NBA Draft using Sankey diagrams.

A Sankey diagram is a visualisation used to depict a flow from one set of values to another. The things being connected are called nodes and the connections are called links. Sankeys are best used when you want to show a many-to-many mapping between two domains (e.g., organisation type and organisation name) or multiple paths through a set of stages (e.g., colleges and NBA teams).
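The node and link structure described above can be sketched in a few lines of Python. This is only an illustration: the `picks` list and the `build_sankey` helper are hypothetical and do not come from the original post, and the college and team names are placeholders rather than real draft data.

```python
from collections import Counter

# Hypothetical (origin, destination) pairs; placeholders, not real draft data.
picks = [
    ("College A", "Team X"),
    ("College A", "Team Y"),
    ("College B", "Team X"),
    ("College A", "Team X"),
]

def build_sankey(pairs):
    """Return (nodes, links); each link is (source, target, weight)."""
    # The weight of a link is the number of times the pair occurs.
    counts = Counter(pairs)
    # Nodes are every distinct name appearing on either side of a pair.
    nodes = sorted({name for pair in pairs for name in pair})
    links = [(src, dst, n) for (src, dst), n in sorted(counts.items())]
    return nodes, links

nodes, links = build_sankey(picks)
for src, dst, weight in links:
    print(f"{src} -> {dst}: {weight}")
```

Each link's weight would drive the thickness of the corresponding band in the diagram.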


What AFL stadiums pull the biggest crowds?

I found a fantastic dataset on Australian rules football, as played in the Australian Football League (AFL). The AFL Machine Learning Competition, promoted by Sportsbet, provided statistics on every match played since the year 2000. One useful field in this dataset is the registered attendance of each match.

So I decided to plot the attendance using ggplot2 boxplots.


Visualising NBA shot charts in Tableau

In my last post I produced some NBA shot charts in R using data scraped from stats.nba.com and ggplot2. This time I extracted all shot location data available for 490 players and linked it to a Tableau dashboard.

The first dashboard shows each shot attempted during the 2014-15 NBA Regular Season. On the right, it is possible to select team, player, shot type, shot zone and shot range. The table above the chart is also updated in line with the filter selection (click on the image to open the dashboard in a new window).

How to create NBA shot charts in R

A while ago I found this fantastic post about NBA shot charts built in Python. Since my Python skills are quite basic, I decided to reproduce such charts in R using ggplot2 and data scraped from the internet.

Getting the Data

First we need the shot data from stats.nba.com. This blog post from Greg Reda does a great job explaining how to find the underlying API and extract data from a web app (in this case, stats.nba.com).

To get shot data for Stephen Curry we will use this URL. The URL returns the shots taken by Curry during the 2014-15 regular season in a JSON structure. Note also that Season, SeasonType and PlayerID are parameters in the URL. Stephen Curry’s PlayerID is 201939.
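As a rough sketch of how those three parameters end up in a query string, the Python snippet below builds a URL from a dictionary. The base address `https://example.com/shots` is a placeholder and not the real endpoint, and the exact value formats for Season and SeasonType are assumptions. Only the parameter names and the PlayerID come from the text above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder base address; NOT the real stats.nba.com endpoint.
BASE_URL = "https://example.com/shots"

# Parameter names are from the post; the Season and SeasonType
# value formats are assumptions made for illustration.
params = {
    "Season": "2014-15",
    "SeasonType": "Regular Season",
    "PlayerID": 201939,  # Stephen Curry
}

# urlencode escapes the values (the space becomes "+").
url = f"{BASE_URL}?{urlencode(params)}"
print(url)

# Parsing the query string recovers the original values as strings.
recovered = parse_qs(urlsplit(url).query)
print(recovered["PlayerID"][0])
```

Changing the PlayerID value is then enough to request another player's shots, assuming the endpoint accepts the same parameters for every player.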

How much does a yard cost in the NFL?

NFL teams spend a lot of money on their key players, especially running backs, whose objective is to gain as many rushing yards as possible. But how do we measure the return on investment in running backs? Are teams getting the expected return? How much does a rushing yard cost in the NFL?

I plotted the rushing yards of each running back during the 2014 NFL regular season against their average yearly salary (the total contracted amount divided by contract duration). The result shows the expected positive correlation: the more money a team spends on running back salaries, the more rushing yards it tends to gain. Running backs perceived as “good” tend to get more money, right? However, this does not hold for every team when we look at the big picture.
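The arithmetic behind the average yearly salary, and the cost per yard that follows from it, can be sketched as below. The function names and the contract figures are hypothetical and chosen only to make the calculation easy to follow. They are not taken from the post's data.

```python
def average_yearly_salary(total_contract, years):
    """Total contracted amount divided by contract duration."""
    if years <= 0:
        raise ValueError("contract duration must be positive")
    return total_contract / years

def cost_per_yard(total_contract, years, rushing_yards):
    """Average yearly salary divided by rushing yards in one season."""
    if rushing_yards <= 0:
        raise ValueError("rushing yards must be positive")
    return average_yearly_salary(total_contract, years) / rushing_yards

# Hypothetical player: a 4-year, $20,000,000 contract and 1,000 rushing yards.
salary = average_yearly_salary(20_000_000, 4)
cost = cost_per_yard(20_000_000, 4, 1_000)
print(f"Average yearly salary: ${salary:,.0f}")
print(f"Cost per rushing yard: ${cost:,.0f}")
```

Under these made-up numbers, each rushing yard costs the team $5,000.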