BW #86: FEMA

Hurricane Helene (https://en.wikipedia.org/wiki/Hurricane_Helene) caused massive damage when it passed through the Southeastern United States in the last week, including severe flooding in parts of North Carolina. Hurricanes are large and deadly, and it's thus no surprise that President Joe Biden was in touch with the governors of several affected states to ensure that they got aid – including from FEMA, the Federal Emergency Management Agency (https://fema.gov).

The White House said yesterday that they had provided a great deal of food and water to people affected by Helene, as well as search-and-rescue teams and satellite-based Internet routers. How many disasters does FEMA deal with in a given year? Have those numbers changed over time? And where do they take place?

This week, we'll look at FEMA's history of disaster aid in the United States, analyzing data describing their work over the last few decades. It would seem that they've done a remarkable job over the years of helping people to get through a wide variety of problems that would overwhelm just about anyone.

Data and six questions

FEMA, like all US government agencies, publishes a great deal of data about its work. We can get a list of disasters on which it has worked, classified by type, date, and location (but not budget), from

https://www.fema.gov/openfema-data-page/disaster-declarations-summaries-v2

FEMA provides an API to query and download this data, but it also makes the raw data available in a variety of formats. Much to my pleasant surprise, this includes Parquet, a binary, column-oriented format that the Apache Arrow project (https://arrow.apache.org/) can read and write. You can find the Parquet file (along with other formats) in the "Full data" section of the data page, or download it directly from:

https://www.fema.gov/api/open/v2/DisasterDeclarationsSummaries.parquet
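
For anyone who hasn't loaded Parquet into Pandas before, here is a minimal sketch of one way to do it. It is not the official solution, just a starting point under some assumptions: the helper names (to_datetime_columns, load_disasters) are hypothetical, the idea that every date column has "Date" in its name is a guess that you should check against the actual columns, and read_parquet needs pyarrow or fastparquet installed.

```python
import pandas as pd

# URL copied from the text above.
FEMA_PARQUET_URL = (
    "https://www.fema.gov/api/open/v2/DisasterDeclarationsSummaries.parquet"
)


def to_datetime_columns(df, marker="Date"):
    """Return a copy of df in which every column whose name contains
    `marker` has a datetime dtype.

    Matching on the column name is an assumption; inspect df.columns
    and adjust `marker` (or list the columns explicitly) as needed.
    """
    out = df.copy()
    for col in out.columns:
        if marker in col:
            # errors="coerce" turns unparseable values into NaT;
            # utc=True gives one consistent, timezone-aware dtype.
            out[col] = pd.to_datetime(out[col], errors="coerce", utc=True)
    return out


def load_disasters(source=FEMA_PARQUET_URL):
    """Read the Parquet file (local path or URL) and normalize date columns."""
    return to_datetime_columns(pd.read_parquet(source))
```

Calling load_disasters() and then looking at the dtypes attribute of the result should show which columns ended up as datetimes. If some date columns were missed, or some non-date columns were caught, the name-matching assumption needs adjusting.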

We'll also use some data classified by FEMA regions. Information about those regions is at:

https://www.fema.gov/about/organization/regions

This week, I have six tasks and questions for you. Among the learning goals are working with datetime data, pivot tables, plotting, working with string data, and performing joins. As usual, I'll be back tomorrow with my complete solutions, including a downloadable version of the Jupyter notebook I used to solve these problems.

Here are my questions for this week:

  • Create a data frame from the Parquet file. Make sure that all of the columns describing dates have a dtype of datetime.
  • Create a bar graph showing the number of disasters that FEMA has handled in each year, starting in 2000. (Use the incidentBeginDate as the source of the year.) Each bar should be subdivided into colored sections indicating the types of disasters. Only include the 5 most common disaster types. Does any year stand out? How different do things look if we count every entry in the data set as a separate disaster, vs. counting each unique disasterNumber once?