BW #77: Paris Olympics
[Reminder: If you haven't done so already, please fill out my course survey, at https://www.surveymonkey.com/r/2024-learn-survey ! I'll be announcing new Python and Pandas courses next week, and this is your chance to influence the topics that I'll teach. ]
The 2024 Olympic Games, hosted by France, started on July 26th, and will go through August 11th. Events are taking place all over France, and even in Tahiti (halfway between Australia and Peru), where the water is a bit more appropriate for surfing and related competitions. Even many non-sports followers watch or otherwise follow the Olympics – partly because they're a chance for a (largely) apolitical meeting of nations, partly because the athletes are just so good, and partly because of the heart-warming stories that we hear about the athletes.
This week, we'll look at some data coming from the Olympic games in France. Unlike most weeks, this week's data is changing as I write these words, and is guaranteed to have changed by the time you read them. If you get different answers than mine, and the games are still going on, then it's likely my data is already out of date.
Data and six questions
Where can you get data about the Olympic games? There are services that will sell it to you, but I always try to find freely available data sets. I found two that we'll use this week:
- The main data will come from Codante (https://codante.io/), a Brazilian company that has provided a free API to Olympic information. Because the API is free, it's limited to 100 requests/minute (which should be more than enough). If they feel that you're making too many requests, then they might block your IP address, so please be nice to them. The API is documented at https://docs.apis.codante.io/olympic-games-english .
- Geographic data about the Olympic venues come from Clockwork Micro (https://www.clockworkmicro.com/), which made shapefiles available in a GitHub repository at https://github.com/clockworkmicro/parisolympics2024 .
We'll also use the pycountry package on PyPI (https://pypi.org/project/pycountry/).
I have six tasks and questions for you this week. The learning goals include retrieving data from APIs, working with data from different sources, grouping, applying functions to a data frame, and also some work with GeoPandas. I'll be back tomorrow with my detailed solutions, including the Jupyter notebook I used in my solutions.
- Using the API from apis.codante.io, download all of the per-country medal information. As of this writing, the country API has a total of five pages to download; you'll want to combine them into a single data frame. Set the index to be the 3-letter country ID.
- What countries don't seem to have any continent? What's the deal with them?
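If you're not sure how to get started with the first task, here is a minimal sketch of one way to page through an API and collect the results into a single data frame. It is not necessarily how my own solution works. Several things here are assumptions that you should check against the API documentation: the endpoint URL in `BASE_URL`, the `page` query parameter, the idea that each response keeps its records under a `"data"` key, the idea that a page past the end returns an empty list, and the column name `id` for the 3-letter country ID. The function names `fetch_page` and `load_countries` are my own, made up for this illustration.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

import pandas as pd

# ASSUMPTION: this endpoint path is a guess; confirm it in the API docs.
BASE_URL = "https://apis.codante.io/olympic-games/countries"


def fetch_page(page):
    """Download one page of results and return the parsed JSON."""
    url = f"{BASE_URL}?{urlencode({'page': page})}"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def load_countries(fetch=fetch_page, pause=1.0, max_pages=50):
    """Collect every page into one data frame, indexed by country ID.

    ASSUMPTION: each page is a dict whose "data" key holds a list of
    records, and a page past the last one comes back with an empty list.
    """
    rows = []
    for page in range(1, max_pages + 1):
        page_rows = fetch(page).get("data", [])
        if not page_rows:
            break                      # no more pages
        rows.extend(page_rows)
        time.sleep(pause)              # be nice to the free API

    if not rows:
        raise ValueError("No rows were returned; check the URL and keys")

    # ASSUMPTION: the 3-letter country ID is in a column named "id"
    return pd.DataFrame(rows).set_index("id")


# Usage (this makes real network requests):
# medals = load_countries()
# medals.head()
```

Passing the fetching function in as an argument makes it easy to try the paging logic on fake data before hitting the real API. If the resulting data frame has a `continent` column (again, an assumption), then something like `medals[medals["continent"].isna()]` would be one starting point for the second question, although missing values might be encoded differently.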