Finding the NBA Player For Each Role

Inspiration:

Recently a friend and I were talking about the Dallas Mavs and why we did not see this postseason run coming. What stood out is that Dallas was never considered a contender heading into this season, because it never felt like they had a complete team. Of course, they made a mid-season trade that added another ball handler, and Jalen Brunson has shown that he is a really good basketball player, but heading into next season I still do not think the Mavs will be considered real contenders.

While Dwight Powell is a solid, dependable center, my friend and I thought that someone more assertive on the glass would fit the Mavs better and could elevate them into contenders. We just threw out names of guys who might be good fits: players who can set solid screens, catch lobs, and get rebounds. That conversation later inspired me to run a quick clustering algorithm to see if I could bucket NBA players into different play styles, so that the next time we have a conversation like this we already have a list of players to discuss.

Results:

After some analysis, I ended up with 23 clusters worth keeping, and there is a neat interface for exploring them.

For a better experience, you can also open it directly here: https://public.flourish.studio/visualisation/10078372/

Analysis:

The work above is geared more towards casual analysis of basketball, since the clusters are derived from box score stats. I think the value of this short project lies in giving you a talking point to have fun with. However, the same approach could also produce data that is valuable to teams if it used tracking data, which shows in much more detail how a player impacts the game. In other words, a more practical version of this project would use data points like box-outs, screen assists, and shot location.

All of that data is on the NBA website, and while it is tedious to scrape, it can be scraped. In truth, going through that whole process would have been overkill for my purpose because of the massive amount of data that would need to be processed across different sources, so for now this is what I have come up with.

My Code:

The code below is entirely reproducible*. It is as easy as hitting run in whatever IDE you use, or just pasting the code into Google Colab.

*Except for one thing: the clusters may vary slightly from run to run since I forgot to set the random seed, but the differences should be very small. If you are not familiar with this stuff, it won’t really matter.
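To show what setting the seed would change, here is a minimal sketch. It is not the code from my gist: it is a small, plain-Python k-means written only for illustration, the `kmeans` function and its `seed` parameter are hypothetical names, and the stat rows are made up rather than real player data. The only point is that once the random starting centroids are tied to a fixed seed, two runs on the same data give the same cluster labels.

```python
import random


def kmeans(points, k, seed, n_iter=50):
    """Plain k-means on a list of equal-length tuples; returns one label per point."""
    rng = random.Random(seed)          # fixed seed -> same starting centroids every run
    centroids = rng.sample(points, k)  # pick k rows as the initial centroids
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid (squared distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:  # labels stopped changing, so we are done
            break
        labels = new_labels
    return labels


# Toy, made-up per-game rows: (points, rebounds, assists). Not real player data.
toy_stats = [
    (25.0, 4.0, 8.0), (27.0, 5.0, 9.0), (24.0, 3.5, 7.5),    # ball-handler-like rows
    (10.0, 12.0, 1.5), (9.0, 11.0, 1.0), (11.0, 13.0, 2.0),  # rebounder-like rows
]

run_1 = kmeans(toy_stats, k=2, seed=42)
run_2 = kmeans(toy_stats, k=2, seed=42)
print(run_1 == run_2)  # same seed, same data -> identical cluster labels
```

If you cluster with scikit-learn's `KMeans` instead, passing a fixed `random_state` plays the same role as the `seed` argument here.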

Unfortunately, I am on the free version of WordPress and cannot edit the HTML, so here is a link to the GitHub gist with the code.

Link: https://gist.github.com/SalimAlkharsa/afbb2bfc7628b8622b0090a32c944c91
