Biased. The deceptive nature of data.

Published on May 20, 2021
Liviu Moroianu
Data & Research Manager

Fallacies of analyzing data, and how to avoid them

Data biases, or statistical fallacies, are interpretation and analysis mistakes that lead to inaccurate or faulty conclusions. Let's explore 10 of my favorite data fallacies, look at real-life examples, and find out how you can avoid them.

False Causality - "Correlation is not causation." You have surely heard this one over and over again. Events may occur together, but that does not mean one caused the other. If you are not careful, you may come to assume that playing basketball makes you tall.
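A quick simulation makes this concrete. In the sketch below (illustrative numbers only, plain Python), a hidden confounder drives two unrelated series, which then appear strongly correlated even though neither causes the other:

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient, computed by hand."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

random.seed(42)
# Temperature is the hidden confounder driving both series.
temperature = [random.uniform(10, 35) for _ in range(1000)]
ice_cream_sales = [3.0 * t + random.gauss(0, 5) for t in temperature]
drownings = [0.4 * t + random.gauss(0, 2) for t in temperature]

r = pearson(ice_cream_sales, drownings)
# r is strongly positive, yet ice cream does not cause drownings:
# both simply rise with temperature.
```

The correlation looks impressive, but the causal arrow points from the confounder to both variables, not between them.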

Confirmation bias is seeking evidence that confirms what we want to believe and dismissing all other evidence. Closely linked to this is cherry-picking: selecting only the results that are in line with your claims and objectives and excluding those that are not. This leads to dishonest results.

Sampling bias is drawing conclusions from data that is not representative of your target population. This may be attributed both to faulty data collection practices and to human nature. Faulty collection comes from disregarding proper sampling methods. Human nature also plays a role, as some of us are more available or inclined to participate in research (joiners, strong opinion holders, incentive seekers, helpers, or the curious).

Survivorship Bias is drawing conclusions from an incomplete set of data: the data that "survived", the data you are left with. The most famous example of this dates back to World War II, when engineers tasked with reinforcing the armor of fighter planes gathered data from planes that had returned from battle to see where reinforcements were most needed. This data did not account for the planes that did not return, which were the ones that most obviously needed reinforcement.
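A toy simulation shows how the surviving data misleads. The hit areas and survival rates below are invented for illustration: hits are spread evenly across the aircraft, but a plane hit in the engine rarely makes it home, so the inspected planes show few engine holes:

```python
import random

random.seed(7)
areas = ["engine", "cockpit", "fuselage", "wings"]
# Hypothetical chance a plane returns given where it was hit.
survival = {"engine": 0.3, "cockpit": 0.5, "fuselage": 0.95, "wings": 0.9}

observed_hits = {a: 0 for a in areas}
for _ in range(20_000):
    hit = random.choice(areas)          # hits land uniformly across areas
    if random.random() < survival[hit]:  # only returning planes get inspected
        observed_hits[hit] += 1

# Among survivors, the fuselage shows the most holes, yet the engine is
# the most vulnerable area: planes hit there rarely made it back.
```

Counting holes on survivors would suggest armoring the fuselage; accounting for the missing planes says the opposite.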

The Cobra Effect is also known as a "perverse incentive" and occurs when an action or incentive backfires and produces unintended consequences, often the opposite of the result it was meant to produce in the first place. The term originated in the 1800s from an occurrence in India under British rule. The British administration, concerned about the rising number of venomous cobras, issued a reward for every cobra killed. Though successful in the beginning, this practice backfired when people started breeding cobras to collect the reward. When the government realized this, it stopped funding the program, which led to thousands of now worthless cobras being released into the wild. Similar practices still happen today: COVID-19 plasma donation incentives were reported to be leading people to knowingly expose themselves to the disease in order to collect an incentive for donating plasma.

Gambler's Fallacy, or the Monte Carlo fallacy, occurs when someone believes that if something has happened more or less frequently than usual, the tide is bound to turn and the opposite will happen next. It's as if an invisible force tends to even things out. The most famous example comes from a roulette table in Monte Carlo in 1913: the ball fell on black 26 times in a row, and gamblers lost huge amounts of money betting against black. In reality, the chance of black or red is the same with each spin.
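Independence is easy to verify in code. This sketch simulates a simplified wheel (ignoring the green zero for simplicity) and checks the chance of black immediately after a streak of five blacks; it stays right around 50%, streak or no streak:

```python
import random

random.seed(1)
# Simplified roulette: red or black only, no green zero.
spins = [random.choice(["red", "black"]) for _ in range(200_000)]

# Collect the outcome that followed every run of five blacks in a row.
after_streak = [
    spins[i]
    for i in range(5, len(spins))
    if all(s == "black" for s in spins[i - 5:i])
]

p_black_after = after_streak.count("black") / len(after_streak)
# p_black_after hovers around 0.5: the wheel has no memory of the streak.
```

The belief that red is "due" after a run of blacks has no support in the data, because each spin is independent of all the spins before it.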

The hot hand phenomenon is a cognitive bias strongly linked to data, occurring when past successes are believed to raise the chances of future success. The fallacy originated in basketball, where players are perceived to be more likely to land a shot if their previous shot was successful. Recently, studies using modern statistical analysis claim to have found some evidence for the existence of "hot hands" in some sports.

The Hawthorne Effect focuses on the changes in behavior individuals display as a result of knowing they are being monitored or observed. It is named after a famous psychosociology experiment that aimed to prove that changes to the working environment (working hours, lighting, etc.) would increase the productivity of factory workers. Productivity did increase, but because the workers were more motivated knowing they were under observation.

The McNamara Fallacy is about losing sight of the bigger picture by relying only on quantitative measures while ignoring all other variables. Robert McNamara, US Secretary of Defense (1961-1968), is viewed as the coordinator of the Vietnam War. He implemented statistical process control methods as quality control measures for war operations. His efforts focused on enemy body count as the sole measure of success in the Vietnam War, without taking into account other insights such as the opinion and sentiment of the U.S. public. To this day his efforts remain controversial, since he relied on a biased measure of success. Be careful what you measure and how it fits into the bigger picture.

Social desirability bias relates to the way respondents tend to answer questions and share opinions in a way that fosters acceptance and likeability. We are wired to want to feel liked and accepted, so people tend to present their personal views in a more positive, socially acceptable light. To avoid this bias, we may encourage respondents to project their beliefs and behaviors onto a third party. Linked to this is the Sponsor bias, which occurs when respondents know or can intuitively guess who the backer of the research is, and their feelings and opinions about that organization impact their answers.

Human nature biases

Everyone is uniquely biased. Recognize your own fallacies and biases. These are especially dangerous, as you may come to dismiss or invalidate accurate data results based on your own biased convictions or faulty intuition. All biases come from mental shortcuts that steer us toward quick resolutions, either to gain pleasure or to avoid pain.

Conviction bias occurs when you are heavily invested in defending your idea and feel that the effort spent defending it is a substitute for objective research.

Appearance bias, also known as the Halo Effect, is an error of judgment based on prejudice and stereotypes, by which a person, product, or brand is perceived to be good or bad based on appearance. A well-groomed, attractive person is therefore seen as having good intentions or being more competent.

Group bias or Groupthink is a phenomenon originating from the desire to avoid conflict and preserve the group dynamics and balance. Therefore, individuals may reach an often wrong consensus on sensitive matters with no rational basis, critical thinking, or evaluation of alternatives.

Blame bias or the "attribution error" is all about how we tend to attribute blame or responsibility for errors. When something goes wrong for us we tend to blame that on circumstances and external factors, yet, when something goes wrong for others we tend to attribute that to internal factors such as their character or personality.

Superiority bias or self-enhancement bias is all about how we tend to perceive ourselves as likable, ethical, and rational as compared to others. We tend to overrate positives and underrate negatives in relation to others and we always position ourselves above the average.

This is Liviu. In short, the insights guy.

Research & Data Visualization professional with background in management, marketing and advertising. Market Research & Advertising MA with trainer certification. Coordinated insightful research, reports, presentations and white papers with smart data analysis and editorial quality insights. Experienced data projects and data teams manager overseeing all stages from proposal to delivery.

Sign up for a free email mini-course on the essentials of data literacy, data storytelling, and information design! Get in the right mindset and build critical thinking skills for working with data. The information is so condensed and easy to go through that you can literally learn it while sipping your morning coffee. ☕