Lies, Darn Lies, and... STATISTICS - Jackson County Library District

One tendency I’ve noticed in response to both the COVID pandemic and the election outcome is people wanting to look at raw data. There seems to be a feeling that unfiltered data represents the truth better than anything else because it is, at the most granular level, a numerical quantification of what actually happened. When I’ve seen this work against truth-telling, it was when the data was presented without expert commentary or any explanatory wording, and occasionally without even the source of the raw data. You can probably already see where I’m going with this… but to be clear: data is good. Data can tell a story… but data isn’t magic. Badly sourced or incorrect data will almost always lead you to wrong conclusions. We saw this phenomenon with coverage related to the 2020 presidential election. Correct data without context can also result in incorrect conclusions and suboptimal decision-making, as we’ve seen with misinformation related to the pandemic. I’m going to break both of those situations down.

NOTE: We are building on some facts we set up in our last blog post. You can read the last post here. TL;DR? We are talking about the 2020 presidential election in the context of a historic event that happened, one that had a specific and well-documented outcome. We are talking about the current pandemic in the same context.

In the weeks after the election in November of last year, lots of viral social media posts popped up that contained incorrect data related to voter registration and ballot counts. The first one I remember seeing included a link to voter registration numbers that were from the 2018 midterm election cycle and not the 2020 presidential election. So, the source of the data was good, but it was still, ya know, not relevant to a discussion of the validity of the 2020 presidential election. You can find more about that story here. Subsequently, there were lots of stories questioning the ratio of votes to voters based on wrong or inaccurately sourced information. Both Reuters and the Associated Press have debunked the argument that more votes were received than there were registered voters. You can find those links here and here.

The conclusion to be drawn from this case is: make sure you understand where the numbers you are looking at are coming from. Generally speaking, when I see a number being shared on social media that “seems legit,” I try to track down the data source before sharing. I wish everyone would do the same. Just because a friend posted it and you trust this friend because “they are a pretty smart person” does not mean that your friend has checked their source. This is how posts with incorrect information end up going viral. Because people tend to like data that reinforces their beliefs and dislike data that contradicts them (ah, confirmation bias, my old nemesis), it can be hard to convince folks that the data they had accepted as true is actually incorrect. As a result, there are people who continue to believe that more people voted than were registered. We want to stop the spread of incorrect information here at the library. If you want more info about why we care and why it’s our job to care, you can find that blog post here.

During the early weeks of the pandemic, I saw raw data used by people who seemed to want to minimize the seriousness of the pandemic, either by characterizing the mortality rates as lower than projected or by minimizing its potential impact in other ways. The point seemed to be to justify a public policy position that would have involved doing nothing and letting COVID-19 spread unchecked through our communities. We only get to run the experiment once in real-life scenarios like this one, so we very much know what happens when we respond as we did last year, because we live in that world. We don’t know what would have happened if we’d done absolutely nothing, and we don’t know what would have happened if we’d locked things down more tightly. But now, here we sit, with 550,000+ fewer people alive in our country due to the pandemic.

There is something very democratic about questioning experts, and it feels like this tendency is a cultural norm. What I struggle with is how we resolve disagreement about these points if quoting an expert doesn’t carry more weight than the opinion of my random internet friend from the last paragraph. How do we end up doing anything other than talking over each other until one of us gets tired, gives up, and goes home? I’m not sure, and also tired. This dynamic appears to have gotten us into a ridiculous confirmation bias loop, where experts draw conclusions that are unpopular with some, so that group finds another person drawing different conclusions from the same data set, ones in alignment with the conclusions they prefer, and then expects those conclusions to be given equal weight to the conclusions of experts. Regardless of personal beliefs regarding public policy decisions, one thing has become clear: a small percentage of an enormous number is still a very large number. There are certainly cognitive biases at play.

The lesson I take from the early pandemic data conversation is that people with zero expertise in epidemiology will not draw conclusions from that raw data as reliably as people with expertise in epidemiology will. Both groups will get things wrong at times; science adjusts as it gets better information and draws better conclusions… but discounting the experts in favor of some random YouTube video with a different take is not an effective way to source data.

If you aren’t sure where to start on tracking data back to its source or finding the correct number of, say, registered voters in the state of Wisconsin during the 2020 election, ask your local librarian. That’s our job… that’s what we are good at. We will get it for you and tell you where we found the number so that you can include it in your social media post. We hope it will go viral.

So: we’ve covered “Where did the data come from?”

I’ve got two more planned posts just on data:

Assessing infographics and graphs
What is being left out of what you are being shown, and does it matter?

Like I said at the outset, this could take a while. Thanks for continuing on this journey with me.

Yours in continued desire to crush mis/dis-information at its source: Kristin.