Department of Transportation NBI Choropleth Map Visualization

NBI Iowa Visualization

Each year, the Department of Transportation collects and releases National Bridge Inspection (NBI) data, which includes all structurally deficient bridges and tunnels in the United States.

Recently, a friend from Grinnell College, Austin Frerick, asked for visualizations of Department of Transportation data for his home state, Iowa, where over 55,000 bridges or tunnels are in bad condition.

Iowa's NBI data is available at this link, and with a bit of manipulation, the text file can be visualized as a choropleth map or statistical thematic map.

This page walks through the Python code necessary to get your own choropleth map visualizations up and running.

1. NBI Data Manipulation

The initial text file from the Department of Transportation includes information about every bridge and tunnel in Iowa, but for the purposes of this walk-through, we're only interested in bridges and tunnels in poor condition. The Federal Highway Administration lists two criteria for determining "structural deficiency."

  1. A condition rating of 4 or less for one of the following:
    • Item 58 - Deck
    • Item 59 - Superstructures
    • Item 60 - Substructures
    • Item 62 - Culvert and Retaining Walls
  2. Or an appraisal rating of 2 or less for one of the following:
    • Item 67 - Structural Condition
    • Item 71 - Waterway Adequacy

We need to read the flat file with Python's data analysis toolkit Pandas and locate the structurally deficient bridges and tunnels.

import pandas as pd

df = pd.read_csv('IA16.txt')

df = df.loc[(df['DECK_COND_058'] <= 4) | (df['SUPERSTRUCTURE_COND_059'] <= 4) | (df['SUBSTRUCTURE_COND_060'] <= 4) | (df['CULVERT_COND_062'] <= 4) | (df['STRUCTURAL_EVAL_067'] <= 2) | (df['WATERWAY_EVAL_071'] <= 2)]

Additionally, we need to add a new column to the Pandas dataframe, concatenating the state and county codes for a full, five-digit county identifier. Pandas originally read the state and county codes as integers, so we will convert them to strings and use zfill to pad the county code.

df['STATE_CODE_001'] = df['STATE_CODE_001'].astype(str)
df['COUNTY_CODE_003'] = df['COUNTY_CODE_003'].astype(str)
df['COUNTY_CODE_003'] = df['COUNTY_CODE_003'].map(lambda x: x.zfill(3))

df['FULL_COUNTY_CODE'] = df['STATE_CODE_001'] + df['COUNTY_CODE_003']

For this visualization, we only need to count the number of structurally deficient bridges and tunnels in each county. Grouping the dataframe by the county identifier column will create a Pandas series of the sums, indexed by the county identifier.

size_series = df.groupby(['FULL_COUNTY_CODE']).size()

2. Visualizing the NBI Data as a Vector Image

Now, we're ready to visualize the data as a vector image, using this FlowingData tutorial as a guide. This map of the United States is a good starting point, because its five-digit county identifiers are accessible through the Python package BeautifulSoup.

Let's open the SVG with BeautifulSoup and set up a list of colors to highlight different counties. Counties with the least structurally deficient bridges and tunnels will be white, and counties with the most structurally deficient bridges and tunnels will be dark pink.

from bs4 import BeautifulSoup

svg = open('counties.svg', 'r').read()
soup = BeautifulSoup(svg, selfClosingTags=['defs','sodipodi:namedview'])
paths = soup.findAll('path')

colors = ['#F1EEF6', '#D4B9DA', '#C994C7', '#DF65B0', '#DD1C77']

As we iterate through each path, we'll check to see if the path's id is in Iowa. If it is not in Iowa, we'll remove the path from the vector image. If it is in Iowa, we'll check to see if the five-digit county code is in our Pandas series, and we'll edit the fill color.

for path in paths:
	# If the path id is in Iowa
	if path['id'][0:2] == '19':
		# Find the index for the fill color
		s = path['style'].find('fill:') + 5
		e = s + 7
		# If the path id is in the series index
		if path['id'] in list(size_series.index):
			# Convert value to pct rank
			size_rank = size_series.rank(pct=True)[path['id']]
			# Choose color based on pct rank
			if size_rank >= 0.8:
				color = colors[4]
			elif 0.6 <= size_rank < 0.8:
				color = colors[3]
			elif 0.4 <= size_rank < 0.6:
				color = colors[2]
			elif 0.2 <= size_rank < 0.4:
				color = colors[1]
				color = colors[0]
		# Otherwise, fill with the base
			color = colors[0]
		# Edit the fill color of the bs4 path
		path['style'] = path['style'][:s] + color + path['style'][e:]
	# If the path is not in Iowa, delete it

After saving the vector image, we'll have a rescalable visualization of the Iowa bridge and tunnel data.

NBI Iowa Visualization

3. Visualizing the NBI Data with Folium

While this vector image is useful for physical publications, it's not interactive. We can use the Python library Folium to visualize the data on an OpenStreetMap.

First, we will convert the Pandas series of NBI data into a Pandas dataframe with columns for the five-digit county identifier and the counts of structurally deficient bridges and tunnels. Then we will provide a path to a JSON file of Iowa counties for Folium. Each five-digit county identifier must exist in the new Pandas dataframe, so we'll append any missing identifiers.

df = pd.DataFrame({'GEO_ID':size_series.index, 'NBI':size_series.values})

import json

counties_geo = r'iowa_counties.json'
with open(counties_geo) as data_file:
	counties_data = json.load(data_file)

# Add each id to the df
for record in counties_data['features']:
	if record['id'] not in df['GEO_ID'].tolist():
		df = df.append({'GEO_ID':record['id'], 'NBI':0},

Finally, we'll import Folium and initialize a map object, centered on Iowa's longitude and lattitude. Folium will overlay the contents of our dataframe columns on a Leaflet.js map.

import folium

map1 = folium.Map(location=[41.87, -93.09], zoom_start=7)
map1.choropleth(geo_path=counties_geo, data_out='data3.json', data=df,
               columns=['GEO_ID', 'NBI'],
               fill_color='PuRd', line_opacity=0.3,
               legend_name='Structurally Deficient Bridges and Tunnels in Iowa')

The saved file contains the HTML and JavaScript you'll need to copy the map into your own website within an inline frame.

Next Steps

Combining Pandas, BeautifulSoup, and Folium with government data–like the Department of Transportation's NBI–is a great way to play with Python's data visualization packages. If you're interested in other libraries for choropleth maps in Python, check out Plotly, which is free but requires an API key, and Vincent, which builds Vega specifictions in Python.

Happy plotting!