BW #66: Pittsburgh
[Are you at PyCon US? Come to my talk on Friday, “Dates and times in Pandas,” and also to my booth, where I’ll be giving away T-shirts and stickers, and raffling off free copies of Pandas Workout. If you’re at the conference, please come and say “hi”!]
I'm writing this from Pittsburgh, where PyCon US 2024 is going on. I thus thought that it would be appropriate to explore some data about Pittsburgh this week. But what sort of data?
I found a data set listing all of the requests made to the city-information center via the 311 service. As you might know, 911 is the emergency phone number in the United States -- but for non-emergency issues, many cities have set up a similar service, at the phone number 311. The operators direct your call to the appropriate department, or pass your message along to it. We'll look at a data set of calls to Pittsburgh's 311 line through the year 2022.
Data and six questions
This week's questions are based on the 311 data. The home page for this data set is at:
https://data.wprdc.org/dataset/311-data
This page includes links to the data and a data dictionary. It also says that the feed for this data was last working in December of 2022, and that they're working to restore it. So our data only runs through the end of 2022.
You can download the data from here:
https://tools.wprdc.org/downstream/76fda9d0-69be-4dd5-8108-0de7907fc5a4
This week's learning goals include working with time data, grouping (including the Grouper object), and plotting. As always, I'll be back tomorrow with my detailed solutions, as well as the Jupyter notebook I used to solve these problems.
- Read the 311 data into a data frame. Ensure that the "CREATED_ON" column is a datetime value.
- Create a stacked bar plot in which each bar represents the number of 311 requests in a given year. Each bar should show, via colored segments, the number of requests from each "REQUEST_ORIGIN". How do people submit most requests? Does this seem to be changing over time?
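If you want to check your approach before downloading the full file, here is a minimal sketch of both tasks on a tiny stand-in data set. Only the column names "CREATED_ON" and "REQUEST_ORIGIN" come from the questions above; the sample rows, the origin values, and the helper function names are made up for illustration, and this is just one possible approach.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real CSV file. Only the
# column names come from the exercise; the values are invented.
SAMPLE_CSV = """CREATED_ON,REQUEST_ORIGIN
2021-03-01T09:15:00,Call Center
2021-07-19T14:02:00,Website
2021-11-30T08:45:00,Call Center
2022-01-05T10:30:00,Website
2022-02-11T16:20:00,Website
2022-06-23T12:00:00,Call Center
"""


def load_requests(source):
    """Read the CSV, asking pandas to parse CREATED_ON as a datetime."""
    return pd.read_csv(source, parse_dates=["CREATED_ON"])


def requests_per_year_by_origin(df):
    """Return a table with one row per year and one column per origin."""
    return (
        df.groupby([df["CREATED_ON"].dt.year, "REQUEST_ORIGIN"])
        .size()                  # number of requests per (year, origin)
        .unstack(fill_value=0)   # origins become columns; missing pairs are 0
    )


df = load_requests(io.StringIO(SAMPLE_CSV))
counts = requests_per_year_by_origin(df)
print(counts)

# With matplotlib installed, the stacked bar plot is one more call:
# counts.plot.bar(stacked=True)
```

To run this against the real data, pass the path of the downloaded file to `load_requests` instead of the `StringIO` object. Because each column of `counts` is one request origin, `plot.bar(stacked=True)` draws one bar per year with one colored segment per origin.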