I Turned Every NBA Team Into A Faction

Introduction

I have been a fan of the Warriors for as long as I can remember so seeing how poorly they have performed as a visiting team during the '22-23 season was definitely a spectacle to behold. As a result, I started thinking that perhaps this could be due to all the traveling that this team has to do and maybe that has something to do with the way they perform during away games. Originally, I wanted to see if there was any correlation between longer road trips and the amount of wins that a team gets. However, after creating a map of all of the arenas (as shown later), I began to ponder a different question. Given 30 teams, how could we arrange them so that each arena is located within a region that occupies the same area.

Using Basketball Reference, I scraped through the stats for every team in the NBA starting from the 2009-2010 season and ending at the 2021-2022 season. In addition to this, I also created a seperate database that houses information for every teams arena and physical location by collecting the longitude and latitude coordinates of each arena. This will help us when we start analyzing distance travelled for each time. To conclude, I then joined both of these dataframes in R to create one large dataset with detailed information. The dataframes can be found at the following GitHub link

Calculating Distance

As mentioned before, I went ahead and recorded which arenas each team has played as well as the coordinates of each arena. This will be helpful for any geospatial analysis we may want to do. The following is a map of each arena that was recorded.


Without doing any analysis, I think it's very interesting to see where each arena is located. A vast majority of the arenas seem to be found near the perimeter of the United States. This is not the focus of this analysis, but I just thought it would be interesting to point out.

With the coordinates for each arena, we can start doing some interesting analysis. The first thing I decided to do was to split up the dataset to only keep track of each away game from each team. The reason for this is to avoid double counting games. For example, a game in which the Golden State Warriors host the Los Angeles Lakers would be the same as saying the Los Angeles Lakers visited the Golden State Warriors. The dataset would of course have each game recorded, and since we only care about traveling teams, that means we can remove the Home games from our dataset.

NOTE: It was at this point that I began getting curious about creating equidstant teams and the project begins to deviate.
Now that we have a curated dataframe, we can now begin calculating distances between arenas. However, it is not as simple as subtracting values from one another. The reason for these complications is due to the fact that we live on Earth, a surface that is not flat (groundbreaking analysis, I know). The curavture of the Earth makes it difficult to do calculations. There is actually a lot of interesting theory and calculus that goes into finding the shortest distance between two points on a sphere. That is beyond the scope of this article but a good place to start is to search up Geodesic distances. Fortunately for us, we don't have to do any complicated math, we can simply use an R package to calculate the distances between two coordinates, such as the Geosphere package.

We can now begin with some simple analysis. The first thing I did was to take the mean of the distance travelled for each team. This is just a simple way to see which teams have travelled the most in the past 10 years or so. We get the following results:

Team Distance (km)
POR 2761.0815
GSW 2531.3617
SAC 2432.9807
. . . . . .
CHO 949.5470
TOR 945.4340
CLE 884.7197


Already we already seeing some interesting results. First of all, the three teams that have to travel the most are all located on the West Coast. In fact, all five teams located on the west coast occupy the top 5. Los Angeles Lakers and the Los Angeles Clippers both share the same arena and are both tied at 2211.8329 kilometers. Something that I found interesting was the fact that the Toronto Raptors were ranked 29 out of 30. This means that they travel the least compared to every other team except for the Cleveland Cavaliers. The reason I find this interesting is because the Toronto Raptors are located in Toronto -- outside of the United States. Again, this makes logical sense when you take a look at the map once again. There defintely appear to be more teams on the Eastern side of the United States compared to the Western side and looking at where Toronto is located, we can see that there are a number of teams located within a close vacinity.

Equidstance

The Coastline Paradox

If we want to establish where in the United States we can place 30 NBA teams, we have to start thinking about geometry. If the United States were a simple rectangle, this wouldn't be a difficult task. However, this is not our reality. The United States has a very complicated perimeter which is going to make these calculation a little bit tricky. This poses the question, how do we calculate the perimeter of the United States? This may sound like a straightforward answer but there are actually a lot of nuances that go alongside this problem. I want to take this opportunity to talk about the Coastline Paradox.

Norwegian Sea Map


The coastline paradox is this phenomena that basically states that is impossible to measure the distance that a coastline covers. What does this mean? Above we can see a map of the Norwegian Sea and the surrounding countries. Taking a look at the coastlines, we can definitely see the complexity of the shapes of said coastlines. How would we even go about measuring them?

The simplest way would be to take a ruler and go alongside the coast and take measurements as you proceed. Seems like a pretty straightforward answer. The issue comes when trying to pick an appropriate scale and method to do these calculations. Depeneding on what unit we decide to take our measurements in, we are going to get a different problem. More specifically, the smaller our units are, the larger our perimeter gets. In fact, as our units of length approach 0, the length of the coastline will approach infinity. A visualization is shown below.

Norwegian Sea Map

There's actually a reason for why I chose the Norwegian Sea and its surrounding coastlines as an example. Due to how complicated the coastline of Norway is with all of it's dips and crevices, it actually has the second largest coastline in the world. It falls behind only to Canada.

There are actually a ton of really cool resources out there regarding this paradox that I wish I could talk about. The math behind it is astonishing and actually involves infinitesimals and fractals. I highly recommend researching this topic. Anyways, the reason I even bring this up is because we are going to need to create some sort of consistent way to meausre the perimeter of the United States. From there, we can start getting into geometric math to solve our original question.

Calculating Borders

So how exactly are we going to go about dividing the country up into equal parts, when there's so much amiguity involved?