There’s a popular phrase tossed around nowadays called data-storytelling.
I have a love-hate relationship with this topic. For one, I think it’s a bit of a corny term. It’s often said that a good analyst “tells a story” with the data, even though much of our work doesn’t have a narrative to it. I think the more accurate phrasing would be that good analysts “answer a question” with the data. Then again, “data-answering” doesn’t have the same ring to it.
Communication Isn’t Just About Storytelling
I think the big problem with the term data storytelling is the implied lack of focus on other communication issues common in data science. While many data storytelling advocates do discuss broader communication principles, the term itself implies the use of narratives to convince your audience to believe your analysis.
The only problem is… not every data worker in analytics delivers analyses. Some build machine learning algorithms, which operate in the background of other applications. Others build automated reports while even more build the data warehouses to support that reporting. Only a small section of the analytics community actually produces analyses.
That means for the majority of data workers, it’s not about storytelling. It’s about reducing ambiguity surrounding goals, roadblocks, and strategy.
This is an important skill to learn. Whenever you have more and more people working together, it's easy for project expectations to become less clear. One person hears something differently than another person. As the project goals get re-translated, the original intent is diluted and the final product doesn't at all resemble the original ask.
Hence, the greatest communication need for data workers isn’t convincing their audience about the conclusions of an analysis; it's just better talking and listening in general!
When the data scientists, data engineers, report developers, and data analysts focus on improving their everyday communication, such as in emails and meetings, the quality of their work increases. There are fewer project re-works, fewer surprises, and less wasted time on common misunderstandings that arise in a highly complex profession.
And when it comes to more traditional data analysts – those that deliver insights – narrative building isn’t that important. At least in my experience. It’s still a handy skill to have because it does build excitement, but I wouldn’t say it’s necessary.
If you effectively communicate with your stakeholder early in the planning process, they won’t need to be convinced at the end with a narrative. You will have already gained their trust by that point. So something that would normally be a "surprise" insight won't create the same skepticism about the results that it would coming from a poor communicator.
That being said, narratives are handy for an audience you’ve never spoken with before. If you’re delivering a webinar or writing a book, narratives are a helpful piece of flair that keeps your audience engaged. I myself use narratives in those situations, but I almost never use them in a work setting. Believe me, I’ve tried, and most stakeholders find it distracting.
Good Communication is About Finding Balance
Many communication tips are dependent on the medium, the venue, and the audience. That means what’s true for a webinar is not true for writing a blog post. And what’s true for a meeting with one stakeholder is not true for a meeting with ten.
That being said, there’s one universal truth to good communication that applies to almost every situation…
Good communication is about finding the sweet spot between too much detail and too little detail. It’s about emphasizing the message, but also providing just enough context so that the audience understands the why behind the message.
In my experience, most people either provide so many irrelevant details that the message gets lost, or they provide so little context that the message isn't understood or believed.
And I absolutely promise you – if you learn to achieve that balance and apply it to every interaction, whether it’s in email, in-person communication, meetings, etc., you will be thought of as a good communicator.
If I had to make an educated guess, I would say most people provide too much detail. And I would say that’s true for most data workers as well. In my opinion, this is actually a good sign about those people. It shows they have empathy and are trying their best to give their audience the most information. But that natural inclination to give every detail doesn't serve that goal.
What Does Too Much Detail Look Like?
There are a lot of examples I could provide for what too much detail looks like, but the best and funniest I’ve found is George Carlin’s skit I Like People (start at 63 seconds).
George Carlin impersonates a person who really does have an interesting story to tell about finding a dinosaur fossil. But they get hung up telling the audience when the story occurred, which is a fairly arbitrary detail.
This happens all the time in data science communication, not to mention business in general.
For example, a data engineer may need to provide a data dump to an executive, Anna, who is delivering a presentation to the CEO on Friday. On Wednesday, the data engineer realizes that he can’t provide that information because of some unexpected issues with the database.
Many data engineers would send the following email:
The database credentials say I have insufficient access. I’m not sure why that is, considering it shows I’m a reader in that warehouse. I looked on stack overflow and it says it’s a role issue, so we’ll see if that fixes it.
It also seems like the inventory table and the product info tables don’t have keys to join them. So many of the products you need reported would have to be hardcoded. That wouldn’t be something I could fix until we get the credentials. Although some data could probably still be reported. It does look like there’s mapping documentation in Teams. Do you know if it’s still accurate?
I could probably get some data to you by Friday though. What level of detail do you need?
That’s pretty confusing, isn’t it? The message that Anna needs to hear is that there’s a "roadblock preventing the data engineer from delivering the data." But the data engineer provides the explanation first, which makes the key takeaway less clear.
You might think that’s an extreme example, but it happens all the time in analytics.
What Does Too Little Detail Look Like?
Too little detail is when someone only delivers the message, but provides no context. The audience technically hears what they need to hear, but the lack of context doesn't help them understand the message.
Here’s a common example I’ve run into in almost every job I’ve had. I may send an IT support person the following message and get the following response:
Taylor Rodgers: Hey, it looks like the Linux server is down. Have you guys looked into it? Do you know what happened? Thanks!
IT Support: There was an update this morning.
Now keep in mind, they did technically answer my question. The answer just doesn’t provide any context. For that reason, it sounds almost dismissive and doesn’t really address my concerns.
I see this happen a lot with technical professionals – both in data science and computer engineering. Some actually make an active attempt to write and speak in such a way. They think it makes their communication more efficient, but really, it just makes it more confusing and forces the audience to ask more follow-up questions about things that could’ve been told upfront.
A few extra sentences convey the exact same message, but better explain the “why” behind the message and demonstrate more empathy towards me:
Taylor Rodgers: Hey, it looks like the Linux server is down. Have you guys looked into it? Do you know what happened? Thanks!
IT Support: There was an update this morning. It was supposed to be completed at 6:00am, but we discovered a bug. We’re working on it right now. It’ll probably be another hour or so before we get it back online. Sorry for the hold-up!
Much better, isn't it?!
Let's return to our example of the data engineer who is helping an executive with a data dump. Let's say this time, though, he overemphasizes the message and provides little context.
Unfortunately, we probably can’t deliver your data request by Friday.
Now keep in mind, this is the message she needs to hear. But the lack of context will make her feel dismissed and she won’t understand the why behind the issue. What’s more, she would have to ask follow-up questions to find out what the problem actually is.
You may think that’s an extreme example, but I’ve had many interactions like this. One time, I had to send a support person ten additional emails asking follow-up questions because he kept sending one-sentence emails that all ended with “regards.”
Everything he told me was factual, but he put the entire burden on me to ask follow-up questions because the lack of context made it unclear what I was supposed to do. Despite his attempts to make communication more efficient by making it concise, it ironically led to the opposite problem.
How to Provide Balance
The easiest way to provide balance is to mimic the inverted pyramid of journalism.
If you’ve ever read a newspaper, you’ll have noticed that most news articles follow the same structure. The headline tells the exact story. The first couple of paragraphs answer the who, what, where, why, and how. The body of the article provides additional context, some background information, or interesting details. And the last paragraphs discuss broader trends or other details that are loosely related to the story.
Here’s a fake news article I wrote to illustrate this:
The Kwik-E-Mart on 9th and Kentucky St was robbed around 9:30am yesterday morning by an armed suspect. Police are asking for help in identifying the suspect.
The cashier described the suspect as a white male with red hair, approximately 5’10”, and a snake tattoo around his right arm.
The suspect initially arrived while there were ten other customers in the store. Security cameras show he pretended to browse the slurpee section until they left, which is when he proceeded to threaten the cashier with a pistol.
This is the third robbery at the Kwik-E-Mart this week. The Springfield Gazette attempted to reach the owner to ask why they haven’t provided better security, but no response was received. The police are still investigating the other robberies.
You can easily apply this same method to your role in data science. Whether you’re informing a stakeholder of a setback, writing a Jira comment, or presenting an analysis – this structure works.
Here’s how you could report the results of an analysis with the inverted pyramid of journalism:
Our analysis revealed that the time of day explained most of the variation in clicks across our digital ad campaigns. This relationship held even though we tested or controlled for the various headline, photograph, and price options.
We recommend running ads during the afternoon between 2:00pm and 5:00pm, when our target audience is most likely to read them. In our test, we showed that this increased clicks by 20%.
In addition, we found that the headline “What will you wear this weekend?” increased clicks by 10% over “Wear something nice this weekend.”
The photograph and promo did not have a significant impact. Since the promo would detract from the revenue generated, we don’t suggest including it, as its impact is negligible.
As you can see in this example, we put the most important message at the top. And then we provide context, as well as other, less important details the further we move through the text.
What’s ironic is that this analysis may have taken weeks to produce, but the key takeaways and results were reported in a few paragraphs. And that’s okay! It finds the right balance between message and context. That makes it way easier for the audience to act upon it.
Here’s the Email the Data Engineer Should Have Sent
If we go back to the earlier example of our data engineer who needed to inform an executive of the delay of the data, they could’ve used the inverted pyramid to respond with:
Unfortunately, we may not be able to deliver the complete data set you need by Friday. Without getting into too much detail, we’re having issues with database access and there’s inconsistent labeling on the products, which means we’d have to hard code them one by one to report sales figures.
I understand you needed to report this to the CEO on Friday. While we can’t deliver the entire data set, we can deliver some of it. What would be the most important areas to focus on?
I do want to say that we are working with the necessary people as fast as possible to resolve the issues I described. We may get lucky and have a resolution by Friday, but just in case, I’d like to make sure I get you the most critical items.
Thank you and apologies for the inconvenience,
Way better, isn’t it? This email works because it gives the details in order of importance. The key message (which is “there’s a delay”) is at the top. It provides a quick explanation of why, without getting into too much detail.
The second most important message (we can still get you some data) is in the middle.
The third most important message (we’re trying our best to find a solution) is last.
Anna may not like that she can’t get the data she needs, but you conveyed the message, explained the context, and offered a compromise. You also expressed empathy.
And that’s the ultimate thing about communication. Audiences appreciate empathy. They may not say that explicitly, but learning to find that balance shows that you’re thinking about them. You’re working to help them understand the message.
Things to Remember
Remember: start with the message and then add context. As you move through your statement, presentation, or email, you'll start with the most important information at the top and move downwards towards less important takeaways or details.