*Nicholas Canova*

In our first post, we introduced this year’s UNC Basketball Analytics Summit case competition and began by classifying NBA players as superstars and busts based on their first four years of NBA performance, as well as assessing net win shares (net WS) for each drafted player. In this second post, we discuss our clustering of NCAA teams by play-type, then analyze play-types further for trends at each position. We believe these to be our most interesting analyses, so this post will likely run a few paragraphs longer than our first and third posts. We will do our best to keep the longer post interesting.

Likely the most important question we had to ask and answer throughout the contest was “How should we quantitatively group NCAA teams into systems?” The case question specifically asked about certain types of systems but left it to us to define what exactly a system is. We thought long on this and came up with three strong possibilities:

- Could we cluster teams by the general offensive strategy they use? For example, does Duke primarily run a triangle offense, motion offense, Princeton offense, pick and roll offense, etc.? What about UNC, Kentucky and Gonzaga? What about every small-conference D-I school?
- Could we cluster teams by looking at teams’ coaches? NCAA coaching turnover is much lower than NBA coaching turnover, and if certain NCAA coaches are more likely to run the same system each year, this may be useful for clustering.
- Could we cluster teams by the play-types a team runs most frequently? Is there play-type data, and if we could obtain it, could we see which teams run certain plays more or less frequently than other teams?

We considered the first option too subjective. Given that we needed to classify both current and historical NCAA teams, we considered it an unreasonable and likely inaccurate approach. We also considered the second option highly subjective, as well as too incomplete. Grouping similar coaches by coaching style leaves much to an eye test and little to a more quantitative analysis of the offense’s strategy. This left the third option, a clustering of teams by the frequency with which they ran each type of play. Using play-by-play data from Synergy Sports from 2006 to 2015, we were able to pull the percentage of plays of each of the 11 offensive play-types (see below for the different play-types) for each NCAA team for each season. We then wrote a k-nearest neighbors clustering algorithm that treated each team-season’s breakdown of play-types run as an 11-dimensional vector and separated teams into 8 clusters based on the Euclidean distance between these play-type vectors. All this means is that teams that ran similar plays at a similar frequency are grouped into the same cluster, which is much simpler than my previous sentence.
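To make the idea of grouping play-type vectors by Euclidean distance concrete, here is a minimal sketch. Our original code is not shown in this post, so treat everything here as an assumption: the sketch uses plain k-means as a stand-in technique (not the algorithm named above), the function names `euclidean` and `kmeans` are hypothetical, and the data is a made-up toy set with 3 play-types and 2 clusters rather than the 11 play-types and 8 clusters we actually used.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, max_iter=100, seed=0):
    """Plain k-means (a stand-in technique): one cluster label per vector."""
    rng = random.Random(seed)
    # Start from k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        # Assign every vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up toy data: each row is one team-season's share of plays across
# three hypothetical play-types (each row sums to 1).
team_seasons = [
    [0.35, 0.10, 0.55],
    [0.34, 0.11, 0.55],
    [0.15, 0.30, 0.55],
    [0.16, 0.29, 0.55],
]
labels = kmeans(team_seasons, k=2)
print(labels)
```

In this toy set, the first two team-seasons have nearly identical play-type shares and end up with the same label, as do the last two. Which label number each pair receives depends on the random starting centroids.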

The set of 11 tables summarizes the results from our initial clustering. Each table represents one of the 11 play-types, and each of the 8 bars within each table represents the percentage of that play run by teams in that cluster. For example, looking at the 11th table, for the spot-up play-type, we see that teams in the 5th cluster ran close to 35% of their plays as spot-up plays, whereas teams in the 6th cluster ran less than 20% of their plays as spot-up plays.

With this clustering of teams, we could then ask which types of plays are run more or less frequently by systems that generate star and bust players. The table below summarizes our initial findings and shows that clusters 4, 6, and 7 generated the best ratios of stars to busts and also had the highest net WS per player, whereas clusters 5 and 8 performed poorly. The descriptions column attempts to give a play-type description of what most differentiates each cluster. Take the 7th cluster, whose teams ran a higher percentage of isolation plays and were otherwise fairly balanced. This cluster included 59 teams that sent at least 1 player to the NBA. Of the players drafted from those 59 teams, 9 became stars and 6 became busts based on our earlier criteria, and on average they outperformed the expected WS for their draft position by 1.912 per player.

In terms of net WS per player, 2 of the 3 strongest-performing clusters feature offenses that emphasize isolation plays, whereas both of the 2 weakest-performing clusters de-emphasize isolation plays. Further, the strongest cluster de-emphasizes spot-up shooting, whereas the weakest cluster emphasizes it. We leave it to you to compare this table and the play-type graphs further to reveal other patterns of over- and under-performance of certain clusters of teams by play-type.
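For readers who want to build a similar per-cluster summary, here is a rough sketch of the aggregation. It is an illustration under assumptions, not our actual code: the record fields (`cluster`, `is_star`, `is_bust`, `net_ws`), the function name `summarize_clusters`, and the three player records are all hypothetical and made up for the example.

```python
from collections import defaultdict


def summarize_clusters(players):
    """Aggregate drafted players by the cluster of their NCAA team."""
    totals = defaultdict(
        lambda: {"players": 0, "stars": 0, "busts": 0, "net_ws": 0.0}
    )
    for p in players:
        row = totals[p["cluster"]]
        row["players"] += 1
        row["stars"] += 1 if p["is_star"] else 0
        row["busts"] += 1 if p["is_bust"] else 0
        row["net_ws"] += p["net_ws"]

    summary = {}
    for cluster, row in totals.items():
        summary[cluster] = {
            "players": row["players"],
            "stars": row["stars"],
            "busts": row["busts"],
            # The ratio is undefined when a cluster produced no busts.
            "star_to_bust": row["stars"] / row["busts"] if row["busts"] else None,
            "net_ws_per_player": row["net_ws"] / row["players"],
        }
    return summary


# Made-up records, one per drafted player (not real data).
drafted = [
    {"cluster": 7, "is_star": True, "is_bust": False, "net_ws": 5.0},
    {"cluster": 7, "is_star": False, "is_bust": True, "net_ws": -1.0},
    {"cluster": 7, "is_star": False, "is_bust": False, "net_ws": 2.0},
]
summary = summarize_clusters(drafted)
print(summary[7])
```

Here net WS per player is simply the mean of each drafted player's net WS within the cluster, which matches how we describe the column above.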

Extending this sort of analysis, we next looked at the offensive tendencies of the systems that superstars and busts came from, at each position on the court. That is to say, we expect that teams with very good players at specific positions would lean their offensive strategies toward play-types featuring these players. Wouldn’t NCAA teams with elite centers run more post-up plays? Do teams with elite point guards push the ball more in transition? The graphs below answer these questions. There are 5 graphs, 1 for each position, and each graph features the 11 play-types shown earlier. For each play-type, a red bar displays whether the NCAA teams of players who became NBA stars at that position ran a higher or lower percentage of that play-type than the teams of players who were drafted but did not become NBA stars at that position. A blue bar displays the same comparison for busts: whether the NCAA teams of players who became NBA busts at that position ran a higher or lower percentage of that play-type than the teams of players who were drafted but did not become NBA busts at that position. These graphs are a bit difficult to explain and can be difficult to draw insights from, so maybe read those last two sentences again, and let’s look at the graphs to understand more.

Looking at the bottom graph, on point guards, we see that NCAA teams whose point guard was drafted and became an NBA star ran transition plays roughly 18% more frequently than did NCAA teams whose point guard was drafted but did not become an NBA star. Conversely, NCAA teams whose point guard was drafted and became an NBA bust ran transition plays 33% less frequently than did NCAA teams whose point guard was drafted but did not become an NBA bust. This makes sense intuitively, as teams with star point guards should be more willing to push the ball in transition, trusting their talented point guard to make good decisions with the ball. The first graph, on power forwards, makes intuitive sense too: teams with star power forwards run fewer spot-up shooting plays (not typically a play featuring the power forward in college) and more post-up plays. Again, we leave it to you to dig more nuggets of insight from the graphs and make connections with what plays we would expect a team to favor given stars at certain positions.
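One plausible way to compute a bar like "ran transition plays roughly 18% more frequently" is as a relative difference between group averages. The sketch below assumes that reading; the function name `relative_difference` and the play-type shares are hypothetical, and the numbers are made up rather than taken from our data.

```python
def relative_difference(group_shares, other_shares):
    """Relative gap between the mean play-type share of two groups of teams.

    A positive result means the first group ran the play-type more often.
    """
    group_mean = sum(group_shares) / len(group_shares)
    other_mean = sum(other_shares) / len(other_shares)
    return (group_mean - other_mean) / other_mean


# Made-up transition-play shares for teams at one position (not real data).
star_team_shares = [0.18, 0.12]      # teams whose drafted player became a star
non_star_team_shares = [0.10, 0.15]  # teams whose drafted player did not

diff = relative_difference(star_team_shares, non_star_team_shares)
print(f"{diff:+.0%}")
```

Under this reading, a red bar would be this value computed with the star teams as the first group, and a blue bar would be the same calculation with the bust teams as the first group and the drafted-but-not-bust teams as the second.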

With this, we wrap up the second post, which I hope was as interesting for you to read as it was for me to type out. Our third post will follow shortly, with our last analyses and concluding thoughts on the competition.