
Hi! I'm sqlbelle!

Exploring the data reality gap - Learn data with sqlbelle 2024.05.04 edition

Published 15 days ago • 2 min read

Hello Reader,

I hope you had a great week. Here are this week's data tidbits.

Tableau Tip - Crosstab

Here is a simple but often overlooked Tableau tip: look at your data as a crosstab.

Benefits of Crosstab view:

  • Provides a clear view of raw data numbers, which can help in your validation processes.
  • Particularly useful for verifying complex calculations such as table calculations or LOD calculations.

In Tableau, you can right-click on your view name (the tab at the bottom of the screen) and select Duplicate as crosstab.

Alternatively, you can go to the Worksheet menu at the top and find this option there too.

In addition to crosstabs, don’t forget you can see your underlying data in two other ways:

  1. View data from the sidebar shows you the underlying data from the data source.
  2. View data from a specific mark shows you the underlying data that makes up that mark.
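
If it helps to see what a crosstab check looks like outside Tableau, here is a minimal Python sketch of the same idea. The sample rows, region names, and category names are made up for illustration, and the region total is only meant as a rough analogy to an LOD expression such as {FIXED [Region] : SUM([Sales])}, not a reproduction of how Tableau computes it.

```python
from collections import defaultdict

# Hypothetical sample rows: (region, category, sales). Not from any real data source.
rows = [
    ("East", "Furniture", 100),
    ("East", "Technology", 300),
    ("West", "Furniture", 200),
    ("West", "Technology", 400),
    ("East", "Furniture", 50),
]

# Build the crosstab: one row per region, one column per category,
# with the summed sales in each cell.
crosstab = defaultdict(lambda: defaultdict(int))
for region, category, sales in rows:
    crosstab[region][category] += sales

# Region totals, roughly what a region-level fixed calculation would return.
region_total = {region: sum(cells.values()) for region, cells in crosstab.items()}

# Print the grid so every number can be checked by eye.
categories = sorted({category for _, category, _ in rows})
print("Region".ljust(8) + "".join(c.rjust(12) for c in categories) + "Total".rjust(8))
for region in sorted(crosstab):
    cells = "".join(str(crosstab[region][c]).rjust(12) for c in categories)
    print(region.ljust(8) + cells + str(region_total[region]).rjust(8))
```

With the raw numbers laid out in a grid like this, a total that does not equal the sum of its row is easy to spot, which is exactly what the crosstab view gives you in Tableau.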

Data Reality Gap Pitfall

Have you ever heard of the “Data Reality Gap”? I first encountered this term when I read Ben Jones’s book “Avoiding Data Pitfalls,” and it has stuck with me ever since.

The “data reality gap” refers to the discrepancy between the data as recorded, analyzed, or interpreted vs the actual, real-world conditions or behaviors the data is supposed to represent.

It means that whatever data you’re working with is just a slice of reality. There will always be things your data does not capture, which means the data is incomplete.

Some reasons for this gap can include:

  • unavailability of data
  • misinterpretation of data
  • bias in data collection, i.e., someone decided which information was worth collecting and which was not
  • outdated data

No data set can ever be complete, so we need to acknowledge the limitations of our data in our analyses. Doing so keeps those analyses grounded.

Here are some examples where you could see data reality gaps:

  1. Marketing Campaigns. Recommendations can be problematic if the analyses rely on outdated consumer interest data.
  2. Customer Service. Service improvements can be misaligned if only digital feedback is analyzed and verbal customer feedback is ignored.
  3. Retail. Retailers risk overstocking products that do not sell if analyses are based solely on past sales data and ignore emerging trends.
  4. Energy Sector. Plans can miss the mark if they are based on historical consumption patterns without considering renewable energy adoption rates.
  5. Education Sector. Curricula can become outdated and stop matching current job market demands.

A popular story often cited on the topic of the “Data Reality Gap” is the WWII aircraft analysis.

Here is a short rundown:

  • Initial Analysis: During World War II, the U.S. military analyzed bullet holes in planes returning from missions to decide where to reinforce armor.
  • Initial Approach: They focused on adding armor to areas with the most bullet holes, like wings and fuselages, assuming these were critical hit points.
  • Critical Oversight: This method, however, ignored planes that didn’t return from missions, leading to a flawed strategy based on survivorship bias.
  • Survivorship Bias Explained: Survivorship bias means the analysis only included data from surviving planes, missing critical insights from planes that were shot down.
  • Abraham Wald’s Insight: Abraham Wald proposed reinforcing areas without bullet holes (engines and cockpit) on returning planes, as hits there likely meant a plane wouldn’t return.
  • Outcome: Reinforcing the less-damaged areas is credited with improving mission survival rates, and the story demonstrates the importance of accounting for unseen data.
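
To see how the bias creeps in, here is a small Python simulation of the rundown above. It is only a sketch: the sections, the per-hit loss probabilities, and the number of hits per plane are made-up values chosen for illustration, not historical figures.

```python
import random

# Hypothetical chance that a single hit in each section brings the plane down.
# These values are assumptions for illustration only.
LOSS_PROB = {"wings": 0.05, "fuselage": 0.05, "engine": 0.60, "cockpit": 0.50}

def simulate(n_planes=10_000, hits_per_plane=3, seed=42):
    """Count hits per section for all planes and for returning planes only."""
    rng = random.Random(seed)
    sections = list(LOSS_PROB)
    all_hits = dict.fromkeys(sections, 0)
    observed_hits = dict.fromkeys(sections, 0)
    for _ in range(n_planes):
        # Hits land uniformly at random across sections.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # A plane returns only if it survives every one of its hits.
        survived = all(rng.random() > LOSS_PROB[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                observed_hits[s] += 1
    return all_hits, observed_hits

def share(counts):
    """Convert raw hit counts into each section's share of all hits."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

all_hits, observed_hits = simulate()
all_share = share(all_hits)
obs_share = share(observed_hits)
for s in LOSS_PROB:
    print(f"{s:9s} all planes: {all_share[s]:.0%}   returning planes only: {obs_share[s]:.0%}")
```

Every section is hit about equally often in this setup, yet the returning planes show far fewer engine and cockpit hits, simply because planes hit there rarely make it back. An analyst who only sees the returning planes would conclude the wings and fuselage take most of the fire.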

It's a wrap.

That's it for now.

Remember, the journey through data is paved with questions, not just answers. The right question can change the way we see the world. Keep asking, keep learning.

Until next time,

Donabel

