Taylor Rodgers

The Forgotten KPI That Tells the Best Story with Data

12/11/2019

There’s a mindset in our profession that we have to make things easy for our audience. We don't want them to think too hard while reading our dashboards and analyses.

But do we take that attitude too far? When trying to make our work more accessible, do we dumb things down too much?

I've built reporting solutions for multiple organizations now and the one complaint I hear the most is that the data is useless. Dashboards never give much insight and the KPIs they display are usually found in the source tool.

While it's nice to have all that data in one location, dashboards rarely do what stakeholders want most – tell a story.

Telling a story is a bit of a cliche with data, but most people want to open a dashboard and see what’s unusual about the data they’re analyzing.

There's a great metric that gives the audience that ability and allows them to tell that story themselves. Only...it's not always intuitive for them...

That metric is variability.

The Forgotten KPI: A History of Variability in Business Analytics

A famous statistician and business thinker named W. Edwards Deming built a management philosophy around the concept of variability. He believed that decreasing variability meant improving the quality of both the company and the products it produced. That would then increase both sales and profitability.

Deming’s philosophy became popular with Japanese manufacturers after WWII. By the 1970s, they were outperforming their U.S. competitors.

When car manufacturers looked at Japanese-made auto parts, they found the only real difference was variability. The individual parts, from the smallest screws to the engine block, had far less variation than American-made goods.

During the 1980s, U.S. companies finally adopted Deming's philosophy to compete with Japan on quality. Much of that was building systems to understand and decrease variation where possible. Their products did improve significantly and now there's a far smaller gap in quality ratings between U.S. and Japanese cars.

This focus on variability has gone out of favor in management circles. Not because America no longer cares about quality, but because we did such a good job implementing Deming's philosophy. Quality is less of an issue now with consumers.

Deming didn’t believe that variation in itself was bad. But he did think it was important to understand what variation was normal in a given environment. That would allow management to fix what was fixable and work around what couldn't be fixed.

What Is the Best Measure of Variability?

There are a few ways to measure variability, but the one I think of most is standard deviation.

Standard deviation summarizes how far individual data points typically sit from the average.

You can often see standard deviation visualized on a normal distribution curve like below:
Picture
In a normal distribution, about 68.3% of all data points fall within one standard deviation of the mean, and about 95.4% fall within two standard deviations.
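
To make the definition concrete, here's a minimal Python sketch that uses only the standard library. The sample values are made up for illustration, and the coverage figures assume a perfectly normal distribution, which real data rarely matches exactly.

```python
import math
import statistics

# Hypothetical sample (made-up values, for illustration only).
sample = [4.0, 7.0, 9.0, 10.0, 12.0, 15.0, 20.0]

mean = statistics.mean(sample)
# Sample standard deviation (divides by n - 1).
std_dev = statistics.stdev(sample)


def share_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


print(f"mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
print(f"within 1 SD: {share_within(1):.1%}")
print(f"within 2 SD: {share_within(2):.1%}")
```

I used the sample standard deviation here. If your data covers the entire population rather than a sample, `statistics.pstdev` would be the closer fit.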

Why Would We Care About Variability?

Adding variability to an analysis or dashboard is like adding shadows to a flat drawing. It gives it far more depth and tells you a much more interesting story.

For example, imagine you’re a hospital manager. You find out that your doctors and nurses are not washing their hands consistently when entering a patient’s room.

As a matter of fact, they only sanitize their hands for 51% of patient visits.

That’s a big health concern since disease can carry from one room to another via the staff’s hands.

To fix this, you do a big hospital-wide campaign that reminds the medical staff to always sanitize their hands when entering a patient's room.

And the numbers seem to improve. Just one month later, the medical staff washes their hands for 63.5% of patient visits.

But should you celebrate yet?

Your data analyst informs you that the standard deviation of handwashing has increased from 1.17 to 2.92.

What does that mean? Even though the average handwashing rate has increased, variability in handwashing has also increased. So there's a wider gap between staff members who wash their hands consistently and those who don't. (I hope hospital administrators don’t have this problem in real life.)

To be sure, there was still an improvement, but calculating variability reveals there's still work to be done.
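
Here's a rough Python sketch of that kind of check. The per-staff rates are invented for illustration (they are not the numbers behind the figures above), and `summarize` is just a hypothetical helper name.

```python
import statistics

# Hypothetical handwashing rates (% of patient visits), one per staff member.
# These values are invented for illustration; they are not the article's data.
before = [50.0, 51.0, 52.0, 50.0, 52.0]
after = [60.0, 67.0, 63.0, 66.0, 61.0]


def summarize(rates):
    """Return the (mean, sample standard deviation) of a list of rates."""
    return statistics.mean(rates), statistics.stdev(rates)


mean_before, sd_before = summarize(before)
mean_after, sd_after = summarize(after)

print(f"before: mean = {mean_before:.1f}%, SD = {sd_before:.2f}")
print(f"after:  mean = {mean_after:.1f}%, SD = {sd_after:.2f}")

# A higher mean alongside a higher SD signals uneven improvement.
improved_unevenly = mean_after > mean_before and sd_after > sd_before
print("average up but less consistent:", improved_unevenly)
```

Reporting the two numbers side by side is the point: the mean alone would have hidden the fact that some staff members improved far more than others.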

How Can We Use Variability in Reporting?

The first step is to acknowledge that our audience is smarter than we give them credit for. We often patronize them by using simple KPIs, like average conversions or total profit. Let’s go further and use standard deviation (or other variability measures) in more dashboards and analyses.

The key, though, is to make it easy to understand what variability means for the data they're analyzing.

That's where data visualizations come into play – and you have to get creative on how to use them.

Let me provide some examples.

Example One: How Much Should an Airbnb Host Charge?

I built a dashboard for a hypothetical Airbnb host in the Denver area which perfectly illustrates how to use standard deviation in a dashboard.
Picture
One of the things you notice is that I put the normal distribution curve right on the dashboard. The hypothetical Airbnb host could select their rental type, neighborhood, and price. They could then see how their unit’s price compares based on those criteria.

For example, $240 is too high for this neighborhood:
Picture
But it's fairly normal here:
Picture
This comparison would've been lost had we simply subtracted the Airbnb host's price from the average. If a host noticed there's wide variation in price range, they might feel more confident in asking for a higher rate than the average. But if there wasn't much variation, it would become far riskier to ask more than the average.
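
One way to put a number on that comparison is a z-score, which expresses a price as a count of standard deviations from the average. The sketch below is not how the dashboard was built. It's a minimal Python illustration with invented prices, and `z_score` is a hypothetical helper.

```python
import statistics

# Hypothetical nightly prices (USD) for comparable listings in two
# neighborhoods. Invented for illustration, not taken from the dashboard.
tight_market = [120, 150, 180, 165, 135, 150, 150]
wide_market = [60, 150, 240, 90, 310, 120, 200]


def z_score(price, comparable_prices):
    """Number of standard deviations a price sits above or below the mean."""
    mean = statistics.mean(comparable_prices)
    std_dev = statistics.stdev(comparable_prices)
    return (price - mean) / std_dev


host_price = 240
for name, prices in [("tight", tight_market), ("wide", wide_market)]:
    print(f"{name} market: z = {z_score(host_price, prices):.2f}")
```

A z-score near zero means the price is typical for that market. A common rule of thumb treats anything beyond about two standard deviations as unusual, though that cutoff is a convention and assumes prices are roughly normally distributed.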

The best thing about this dashboard is that it’s intuitive. Even if the audience doesn’t know what standard deviation means, the normal distribution curve and the coloring of key data points tell a story of just how varied pricing can be.

Example Two: Is Staff Handwashing Improving?

The other way to utilize variability in a way that makes sense in reporting is to show its improvement (or lack thereof) over time.

There are two ways to do this. The first one I'll show you uses standard deviation.

Let’s go back to our example of medical staff washing their hands. You can see below that the average handwash rate is increasing over time.
Picture
Now let’s add standard deviation:
Picture
It looks like standard deviation is increasing, along with the average. So even though the average handwash rate is improving, the staff is not doing it consistently.
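
If you want to build this kind of trend yourself, here's a minimal Python sketch. It assumes you have one handwash rate per staff member per month. The months and values are invented for illustration.

```python
import statistics

# Hypothetical per-staff handwashing rates (%) by month, invented for
# illustration. Each list holds one value per staff member.
monthly_rates = {
    "Jan": [50, 51, 52, 51],
    "Feb": [55, 58, 53, 60],
    "Mar": [58, 66, 55, 70],
}

# Build (month, mean, sample standard deviation) rows for plotting.
trend = []
for month, rates in monthly_rates.items():
    trend.append((month, statistics.mean(rates), statistics.stdev(rates)))

for month, mean, std_dev in trend:
    print(f"{month}: mean = {mean:.1f}%, SD = {std_dev:.2f}")
```

Plotting both columns over time gives you the two lines: one for the average and one for consistency.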

Now let's say that the hospital manager sees this and decides to tweak the message they're sending the staff. They reiterate the importance of handwashing and provide more evidence of its benefits.

The hospital manager continues to monitor the data over the next few months and sees the trends below:
Picture
It looks like the increased focus on handwashing has paid off over time. The average increased, and there’s more consistency among the staff.

But What If My Stakeholder Just Doesn’t Know What Standard Deviation Means? Here’s An Alternative

The increased focus on data analytics in the business world means business leaders will need to understand standard deviation. Having said that, you’ll still find that some stakeholders don’t know what that means.

There's an alternative visualization you can use that illustrates variability – the box-and-whiskers plot.

Down below is the medical staff handwashing example used earlier. Only this time, I use a box-and-whiskers plot, where each dot represents a medical staff member.
Picture
This shows variability growing around the middle of the year, before decreasing. If the renewed focus on handwashing was effective, then the manager would expect to see both the median increase and the spread of the distribution narrow. That begins to occur in September.
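
The box in a box-and-whiskers plot is drawn from the quartiles. Here's a minimal Python sketch of those numbers for a single month, using invented rates. I chose the "inclusive" quartile method as an assumption, and plotting tools differ in how they compute quartiles and draw whiskers.

```python
import statistics

# Hypothetical handwashing rates (%) for one month, one value per staff
# member. Invented for illustration.
rates = [48, 55, 59, 61, 63, 64, 66, 70, 72, 81]

# quantiles(..., n=4) returns the three quartile cut points.
q1, median, q3 = statistics.quantiles(rates, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: the height of the box

print(f"Q1 = {q1}, median = {median}, Q3 = {q3}, IQR = {iqr}")
```

A shrinking interquartile range from one month to the next tells the same story as a falling standard deviation: the staff is becoming more consistent.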

Last Thoughts

Variability gives context to the data you’re analyzing.

You could replace Airbnb prices and hospital handwash rates with a whole host of business problems (employee billable hours, c-store profitability by location, job performance metrics) and you’ll get a much more interesting story.

Take more risks with your stakeholders. Pitch reporting projects that utilize variability. You may find that they’re more open to the idea than you realize.
2 Comments
Lianne
12/11/2019 02:57:45 pm

Thanks for your post Taylor! I’d like to try running your Airbnb example on my own Airbnb. How did you scrape the data from Airbnb.com?

Taylor Rodgers
12/11/2019 03:35:15 pm

Thank you for the comment!

This website has data for various cities across the world:
http://insideairbnb.com/get-the-data.html

I have an Airbnb too and I'm still waiting on them to release it for my own city. Hasn't happened yet. :(

I attempted to do a regression analysis to see if price was a good predictor of how many bookings a host would receive. Sadly, these data sets don't have the number of bookings. They do have the number of reviews, which I attempted to use as a substitute for bookings. I didn't see a strong relationship though.
