BW #43: Financial protection

Americans who are frustrated with a financial product or service can complain to the CFPB. How many complaints do they get a year, and from where? This week, we analyze their complaints database.

Over the last few months, we've seen lots of high-profile cases of financial and business fraud, most notably FTX founder Sam Bankman-Fried (convicted), Binance founder Changpeng “CZ” Zhao (pled guilty), and former president Donald Trump (in a civil suit that's not looking too great for him).

The world of banking and investing is complex, and it's easy to convince people that not-very-safe investments are actually quite safe. Or that they should buy a product which is profitable for the bank or broker, but not for the consumer. Or that the latest and coolest cryptocurrency will earn lots of money, when it probably won’t.

If you lose money on a foolish investment, or just let your money sit around, then that's on you, and is the nature of the free market. But if the company that sold you a financial product lied to you, or misrepresented what they were doing, or charged you fees that you shouldn't have had to pay ... then who can you talk to? Who is out there, making sure that the financial world is playing by the rules?

From ChatGPT: “Create a picture of a panda complaining to a bank manager about being swindled.”

In the US, there's an alphabet soup of government agencies setting rules and then enforcing them, including the FTC, SEC, and FDIC. But if you're a consumer and feel that you've gotten the short end of the stick on a financial product, to whom can you turn?

The Consumer Financial Protection Bureau (https://www.consumerfinance.gov/, https://en.wikipedia.org/wiki/Consumer_Financial_Protection_Bureau) tries to fill that role. It was proposed by Senator Elizabeth Warren (https://en.wikipedia.org/wiki/Elizabeth_Warren) back when she was a law professor at Harvard, and was established in 2011. People can tell the CFPB about problems they've had with financial products and services.

Matt Levine recently wrote in his (amazing, must-read) "Money Stuff" newsletter (https://www.bloomberg.com/account/newsletters/money-stuff) about a CFPB finding against Bank of America, which apparently falsified information on mortgage applications, and will have to pay $12 million as a result. (You can read more at https://www.consumerfinance.gov/about-us/newsroom/cfpb-orders-bank-of-america-to-pay-12-million-for-reporting-false-mortgage-data/ .) That made me wonder if the CFPB publishes data on how much it has collected in penalties from financial companies, and whether we could analyze it here in Bamboo Weekly.

The good news is that yes, the data is available. The bad news is that the downloadable data consists of a dollar amount for each year the CFPB has been around, which doesn't lend itself (no pun intended) to interesting analysis.

However, the CFPB does have a fully downloadable database of the consumer complaints it receives, along with the nature of the complaint, the company named in the complaint, and the like. That data is pretty juicy, and we'll be looking at it this week.

Data and 8 questions

This week, we'll be looking at the database of consumer complaints, available at the CFPB's site:

https://www.consumerfinance.gov/data-research/consumer-complaints/

You should download the entire database in CSV format from

https://files.consumerfinance.gov/ccdb/complaints.csv.zip

The zipfile is about 690 MB, and the unzipped file is about 3 GB, so make sure your computer has enough RAM to handle it! This week's learning goals include techniques for working with large files, plotting, grouping, pivot tables, and time series.

I'll be back tomorrow with full solutions to these questions, including the Jupyter notebook in which I worked when composing them.

Here are this week's eight tasks and questions:

  • Read the CFPB complaint data. Read all columns, and don't specify any dtypes. How long did the read take? How much memory does the resulting data frame take?
  • Now specify categorical and date dtypes. How long did the read take? How much memory does this take?
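If you want a starting point for these first two tasks, here is a minimal sketch of one way to time a read and measure the memory of the result. It is not the official solution. The helper name `timed_read` is my own invention, the tiny in-memory CSV stands in for the real 3 GB file, and the column names (`Date received`, `Product`, `State`, `Company`) are assumptions about what the CFPB file contains, so check them against the actual header before relying on them.

```python
import io
import time

import pandas as pd

# Tiny stand-in for complaints.csv. The column names are assumptions;
# check them against the header of the real file.
SAMPLE_CSV = """Date received,Product,State,Company,Complaint ID
2023-01-05,Mortgage,CA,Bank A,1
2023-01-06,Credit card,NY,Bank B,2
2023-02-10,Mortgage,CA,Bank A,3
2023-02-11,Credit card,TX,Bank B,4
"""


def timed_read(source, **read_csv_kwargs):
    """Hypothetical helper: read a CSV and report seconds and bytes used."""
    start = time.perf_counter()
    df = pd.read_csv(source, **read_csv_kwargs)
    elapsed = time.perf_counter() - start
    # deep=True counts the actual size of Python strings, not just pointers
    memory_bytes = int(df.memory_usage(deep=True).sum())
    return df, elapsed, memory_bytes


# Task 1: all columns, no dtypes specified
plain_df, plain_secs, plain_bytes = timed_read(io.StringIO(SAMPLE_CSV))

# Task 2: categories for repetitive text columns, datetimes for dates
typed_df, typed_secs, typed_bytes = timed_read(
    io.StringIO(SAMPLE_CSV),
    dtype={"Product": "category", "State": "category", "Company": "category"},
    parse_dates=["Date received"],
)

print(f"default dtypes: {plain_secs:.4f} s, {plain_bytes:,} bytes")
print(f"with dtypes:    {typed_secs:.4f} s, {typed_bytes:,} bytes")
```

To try this on the real data, replace `io.StringIO(SAMPLE_CSV)` with the path to the unzipped `complaints.csv`. Don't read too much into the numbers from the four-row sample: categories carry some fixed overhead, so any savings should only show up on columns with many repeated values, as in the full file.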