Welcome to our blog. Here you will find announcements about upcoming events and recaps of where we have been, workshops we have taught, and people we have met.
Years ago when I first worked at Mass General Hospital (MGH) in Boston, I had a really great boss, Dr. Dan Rosenthal. Like many smart people, he was also very funny. I especially loved his ability to use adjectives in a way that made you think he was going to tell you one thing, only to hit you with something entirely different.
Once, for example, I helped present a project that my team had been working on. In reviewing the results, Dan said, “Kathy, that was a spectacular…failure.” Ha! Very funny and very true: it had been exactly that. Another time, I asked how he’d done in the half-marathon he’d run over the weekend. “Kathy, people are truly fantastic…liars,” he said. “At the 10-mile marker, everyone told me how great I looked – and that I shouldn’t worry: the finish line was close.” In my book, that is LOL good.
I was reminded of this last story while pulling together a recent presentation on data visualization – specifically about how data and information can be less than truthful, even downright misleading, when graphical displays are used incorrectly. A few examples follow.
A Quaker Oats ad claims that the cereal is a “Cholesterol Hunter,” and that by eating oatmeal every day for 30 days you can significantly reduce your cholesterol level. Here’s the trickery in the bar graph shown to support this claim: the scale begins at 196, and is so small that the change from week one to week four looks much larger than it actually is. The Center for Science in the Public Interest called Quaker Oats manufacturer General Foods out for its misleading claim and illustration, making it clear that it may take as many as four oat-product servings a day to get any benefit, and adding that a cholesterol level of 196 is still considered high.
As I’ve written elsewhere, bar graphs must start at zero to prevent this sort of distortion. You can read more about why, and see alternative displays here.
There is also this example, from a Masters of Media blog post by Joram Binsbergen of the University of Amsterdam. He discusses an infographic used by the New South Wales Ministry of Health to illustrate an apparently significant increase in the number of nurses recruited by that state’s healthcare system from 2008 to March 2013. On closer examination, however, the scale is completely off.
Four “people” icons on the far left of the graphic represent 43,147 nurses; on the far right, forty such figures stand for 47,500 – making an increase of roughly 10% look like a 900% one.
As Binsbergen points out in his post about this display, it is useful to consult Edward Tufte, who introduced the “Lie Factor” in his book The Visual Display of Quantitative Information: “The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented.”
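Tufte's Lie Factor is the size of the effect shown in the graphic divided by the size of the effect in the data, with a value near 1 indicating a faithful display. Here is a minimal Python sketch of that arithmetic for the nurse infographic. It uses only the four numbers quoted above and assumes the icon counts are a fair stand-in for the size measured on the graphic. The function and variable names are illustrative, not taken from Binsbergen's post.

```python
def relative_change(first: float, second: float) -> float:
    """Size of effect: change from first to second, as a fraction of first."""
    return (second - first) / first


def lie_factor(graphic_first: float, graphic_second: float,
               data_first: float, data_second: float) -> float:
    """Tufte's Lie Factor: effect shown in the graphic / effect in the data."""
    return (relative_change(graphic_first, graphic_second)
            / relative_change(data_first, data_second))


# Figures quoted in the text. Assumption: icon counts approximate
# the physical size of each group on the graphic.
icons_2008, icons_2013 = 4, 40
nurses_2008, nurses_2013 = 43_147, 47_500

data_effect = relative_change(nurses_2008, nurses_2013)
graphic_effect = relative_change(icons_2008, icons_2013)
factor = lie_factor(icons_2008, icons_2013, nurses_2008, nurses_2013)

print(f"Change in the data:    {data_effect:.1%}")
print(f"Change in the graphic: {graphic_effect:.0%}")
print(f"Lie Factor:            {factor:.1f}")
```

With these inputs the data grow by about 10% while the icon count grows by 900%, giving a Lie Factor of roughly 89.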
My last example, from Andrew Gelman’s blog, “Statistical Modeling, Causal Inference, and Social Science” (sounds like a snooze, but it’s not – honest!), uses a misleading graphic in a story on medical fund-raising. Purporting to show how much money is raised to help fight specific diseases as compared to the mortalities attributable to those diseases, the zippy graphic is beyond distorting – it devolves into “hot mess” territory.
If for example you look at the large purple bubble on the right and match it to the key above it, heart disease appears to be the leading cause of death (596,577 deaths). I say “appears to be” because the key isn’t labeled to represent mortalities; rather, it says, “Heart Disease|Jump Rope for Heart (2013).”
Now look at the much smaller purple bubble on the left, labeled $54.3 million. If you’re puzzled, that’s a good sign: confusion signals awareness of a large, awkward problem. That left-hand bubble – it would seem – represents just one fundraiser for heart disease (“Jump Rope for Heart”), as compared to all deaths from heart disease.
A quick internet search of “heart disease” led me to this summary document, which made me dismiss the illustration in its entirety.
You may never create a graph, chart, or table, but you will, like most of us, assuredly consume, even depend on, graphic displays. As with all consideration of important and complicated information, I encourage you to slow down, take a breath, and ask yourself, “Does this make sense? Can I believe what is being presented?” And what is more important, “Can I learn something true here, or make a confident decision based on what I think I’m seeing?”
Remember, even information displayed in a pretty graph can be incredibly misleading. Or, as Dan Rosenthal would probably say, “fabulously terrible.”
Greetings from Bustins Island, Maine, where every summer I mix a batch of watermelon mojitos, climb into the hammock, and reflect on the year to date, the work ahead, and the Sox’ standing in the ALE (as of 8/25, first place – 5 games ahead of the Yankees).
On the professional front, I am heartened by the fact that data is increasingly being analyzed and used to improve our health and systems of care, and by growing awareness of the best practices of data visualization and visual intelligence. All the same, there is a lot of work to be done!
The good news I’m hearing from clients is they’ve improved their performance on many quality measures and are consistently meeting their goals. What remains to master is how to monitor their results using something other than a bar chart.
Though useful and easy to interpret most of the time, bar charts can make it difficult to see very small differences between multiple measures on a dashboard or report; and they can take up a lot of space better used to display additional important metrics. Clients often ask, pleadingly, if “just this once” they can ignore the best practice of starting a bar graph at zero. My inner Dana Carvey kicks in with “…wouldn’t be prudent …not gonna do it” (yes, it sounds funnier when he says it). Won’t be auditioning for the host slot on SNL anytime soon, though, even with help from those mojitos; so back to visualizations.
As you can see in this first display, the points showing actual performance for all four measures versus their respective targets appear quite closely grouped, so it’s hard to tell them apart. This hindrance in turn tempts us to break the best-practice rule of starting a bar graph at zero, and instead start it at a higher value. To remind ourselves why this is a bad idea, let’s review that rule.
Bars display and compare the size of different values; if they begin at a value greater than zero, the true size of what they are designed to measure is distorted. Starting the scale at a value greater than zero and shrinking the increments along the X or the Y axis to magnify the values can make the differences displayed seem much larger than they actually are. (Take a look at my earlier post on this topic here. You may decide to eat less oatmeal after you do.)
This illustration shows the effect of interval distortion.
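To put a number on that distortion, the sketch below compares how much taller one bar looks than another when the axis starts at zero and when it starts at a higher value. The two performance figures and the baseline of 90 are hypothetical, chosen only for illustration.

```python
def drawn_length(value: float, baseline: float = 0.0) -> float:
    """Length of a bar drawn from the axis baseline up to the value."""
    if value <= baseline:
        raise ValueError("value must lie above the axis baseline")
    return value - baseline


def apparent_ratio(larger: float, smaller: float, baseline: float = 0.0) -> float:
    """How many times longer the larger bar looks for a given baseline."""
    return drawn_length(larger, baseline) / drawn_length(smaller, baseline)


# Hypothetical performance rates (percent) for two measures.
measure_a, measure_b = 95.0, 92.0

true_ratio = apparent_ratio(measure_a, measure_b)             # axis starts at 0
truncated_ratio = apparent_ratio(measure_a, measure_b, 90.0)  # axis starts at 90

print(f"Axis from 0:  bar A looks {true_ratio:.2f}x as long as bar B")
print(f"Axis from 90: bar A looks {truncated_ratio:.2f}x as long as bar B")
```

With these made-up numbers, a three-point gap that is really a ratio of about 1.03 is drawn as a bar 2.5 times as long.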
Consider these options for resolving this conflict between visualization guidelines and the need for a quick fix.
Points and Lines
Using points and lines to display values like these can be a good alternative to a bar chart because that frees us from having to start the scale at zero. The points and lines format also allows us to display viewer-required details in far less space on a dashboard or report than does a bar graph.
One of my favorite ways to display performance versus target data on a dashboard or report is with a deviation graph, which illustrates the relative difference between two values. Though each performance metric may have a different target figure, this graph shows only the distance from target for each, so we can line them all up one after the other (see below). A deviation graph is neat, clear, and space-saving. (Two more articles on this useful visualization appear here and here.)
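The calculation behind a deviation graph is simple: express each measure's distance from its own target on a common relative scale. Here is a minimal sketch, assuming deviation is taken as a percentage of the target. The measure names and values are hypothetical, and whether a positive deviation is good or bad depends on the measure.

```python
from dataclasses import dataclass


@dataclass
class Measure:
    name: str
    actual: float
    target: float

    @property
    def deviation_pct(self) -> float:
        """Distance from target, as a percentage of the target."""
        return (self.actual - self.target) / self.target * 100


# Hypothetical measures, each with its own target and unit.
measures = [
    Measure("Measure A", actual=92.0, target=90.0),
    Measure("Measure B", actual=45.0, target=50.0),
    Measure("Measure C", actual=120.0, target=100.0),
    Measure("Measure D", actual=7.6, target=8.0),
]

# Every measure now sits on the same axis: percent above or below target.
for m in measures:
    print(f"{m.name:<10} {m.deviation_pct:+6.1f}%")
```

Because each value is relative to its own target, measures with very different units can share a single axis centered on zero.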
Small Multiple Line Graphs
Depending on viewer requirements, you might also consider displaying the kind of data we’ve been discussing in small multiple line graphs. Unlike bar graphs, line graphs do not have to begin at zero, because the line isn’t comparing the sizes of two values. Instead, lines display values changing or trending over time.
If, for example, you need to show how your organization has performed over time compared to each measure’s unique target, you can create separate graphs for each, taking care to make the scales begin and end with the same values, then arrange them in sequence. This technique makes it easy to see how the measures have changed, and to compare measures both to targets and to each other. (Two previous small multiples posts are here and here.)
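One way to keep the scales identical is to compute a single pair of axis limits from all the series and targets before drawing any of the graphs. The sketch below does only that calculation. The data, the 5% padding, and the function name are assumptions for illustration, and the resulting limits would then be passed to whatever charting tool you use.

```python
from typing import Dict, List, Tuple


def shared_limits(series: Dict[str, List[float]],
                  targets: Dict[str, float],
                  padding: float = 0.05) -> Tuple[float, float]:
    """Return one (min, max) axis range covering every series and target."""
    values = [v for points in series.values() for v in points]
    values.extend(targets.values())
    low, high = min(values), max(values)
    margin = (high - low) * padding  # small margin so lines don't touch the frame
    return low - margin, high + margin


# Hypothetical monthly results and targets for two measures.
monthly = {
    "Measure A": [88.0, 90.0, 91.0, 93.0],
    "Measure B": [80.0, 84.0, 83.0, 86.0],
}
targets = {"Measure A": 95.0, "Measure B": 100.0}

axis_min, axis_max = shared_limits(monthly, targets)
print(f"Use the same y-axis range for every panel: {axis_min:.1f} to {axis_max:.1f}")
```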
In the past decade or so, the health and healthcare industries (among others) have increased their awareness of the need to discover and embrace best practices of caring for patients and delivering services. This has also happened with data visualization.
As we become more familiar and comfortable with the science behind and research on visual intelligence and cognition, we too must embrace these best practices to ensure that our visualizations are correctly displaying the data, that our message is clear, and that people are informed, inspired, and above all moved to action to improve the healthcare we help deliver.
Speaking of caring for people, the hammock is beckoning. I need a nap before tonight’s game. Life is good.
P.S. For those who ask, “What about wide variation data?” – a different, albeit related subject – I call your attention to a not-so-recent post I wrote, here.
I sincerely wish I could avoid typing the following: it is July 2017, which means it is time again for my annual book recommendations, as well as a swing in the hammock with a watermelon mojito (no regrets there).
Seriously? I’m going to hire a search and rescue team to find my life and bring it back to me (the edited version, of course), because I have no idea where it has zoomed off to! I know I was just celebrating my 21st birthday a few moments ago… denial is more than a river in Egypt (or so I’m told).
Here is some of what I’ve been reading. I think you’ll enjoy these titles, too.
I know: you’re probably wondering why this book is on the list. Although Ron Chernow’s Alexander Hamilton is all the rage these days (and for good reason – I read the book and loved it; helped to pass the time waiting for the next millennium to land tickets to the Broadway show), there is a wealth of other great books about the history of our country, including a special-interest subset recounting the lives and work of champions of universal healthcare for U.S. citizens. Until I read David McCullough’s book, I thought Lyndon Johnson had been the original promoter and architect of the Medicare program: I had not fully understood the huge role that Truman played.
Additionally, Truman’s famous defeat of Dewey in the 1948 election was extraordinarily fascinating in its demonstration of how pollsters can get it so wrong (yes, history repeats itself). The significant political repercussions highlighted the pitfalls of survey samples and results, and the need for rigor in this type of work.
The author, Causalytics founder Herbert I. Weisberg, Ph.D., weaves engaging stories about important thinkers, and how the problems they worked to solve using statistical methods helped propel scientific research. But this book is more than just a historical view of these efforts: it’s also a cautionary tale about the mountains of simplified studies and statistics that result in frequent reversals of scientific findings and recommendations.
As the title suggests, the fallacy of regarding probability as the full measure of our uncertainty is contributing to an oncoming crisis. Weisberg says, about clinical research and care, that our current methodological orthodoxy plays a major role in deepening the division between scientific researchers and clinical practitioners. Prior to the Industrial Age, research and practice were better integrated. Investigation was generally more directly driven by practical problems, and conducted by those directly involved in solving them. But then as scientific research became more specialized and professionalized, the perspectives of researchers and clinicians began to diverge. In particular, their respective relationships to data and knowledge have resembled each other less and less.
If you work with statistical methods, especially probabilities, or you have to understand them well enough to explain them (and their limitations), then this is a really top-notch book that you should seriously consider taking the time to read.
I state my disclaimer right up front: my HealthDataViz team and I contributed to this book. That is of course what makes this title a particularly great addition to your health|health care reference shelf (no brag; just fact)!
Each chapter follows a standard layout that I like a lot. Each begins with a summary of the big picture that the dashboard addresses, followed by the specific metrics displayed and related scenarios illustrated. “How People Use This Dashboard” is next, supported by different visuals on the dashboard as examples. A “Why This Works” section rounds out the chapter.
The authors also include a data visualization best practices summary by displaying what NOT to do – very useful! No red, yellow, green pies, donuts, bubbles, or word clouds, please!
If you have never read this book, this is the summer to do so. I absolutely love it, and any time the movie based on it is on television, I drop everything and watch (just ask my husband, Bret – there is no dissuading me). A Civil Action is the true story of the quest by a somewhat idealistic young lawyer to collect damages from two corporate giants, Beatrice Foods and W.R. Grace, for allegedly polluting the water in Woburn, Massachusetts, a Boston suburb, with carcinogens. The case considers a cluster of leukemia victims in Woburn (the disease claimed the lives of at least six children), and the tremendous challenge of reconciling a preponderance of experiential or circumstantial evidence with scientific results. How do you prove causation in the courtroom? Is it possible or even correct to try and do so?
I love, love, love this book and am certain it will find its way into my summer bag yet again this year – a perfect read for a few hours in the hammock.
Coming Soon!! Tableau for Health and Healthcare, v. 3
Many of you have been asking, and the answer is a resounding YES! The HealthDataViz team has been hard at work updating our Tableau manual (a very special thanks to Janet, Ann, Marnie, Dan, Jim, Peter, and our very own GrammarLady, Anne). Data sources and examples have been updated to reflect what we’ve learned while training folks to successfully use Tableau, and there are some new tips and tricks. It’s due out in the coming weeks, so stay tuned for our announcement.
Finally, let me take a moment to thank all my faithful subscribers and our clients. You are the best and biggest reason for what we do, and we are deeply grateful for your support. We look forward to presenting engaging new ideas and fresh approaches, and collaborating on innovative projects with you.
I’ve been teaching a lot of data visualization workshops lately. Inevitably, when I reach the part of the day when I ask participants how they gather requirements to build a monitoring dashboard, I get the same rote, data-analyst-centric response: “I ask my customers what questions they want answered.”
My job (or cross to bear; you decide) is to then firmly nudge them toward a new approach, one that requires them to ask instead, “What is your role and scope of responsibility? As you work in that role, what decisions must you make to achieve your goals and objectives?”
Dashboards exist to help people visually monitor – at a glance – the data and information they need to achieve one or more goals and objectives quickly and easily. This is considerably different from analyzing data to answer a specific question or to uncover potentially interesting relationships in that data.
With this construct about the purpose of a dashboard in mind, let’s consider examples of two different prototype Emergency Department (ED) dashboards designed using the same source data. We’ll ask end users to describe their role (position) in the ED, the scope of their responsibility there, and what summary information they need in deciding how to meet their goals and objectives. We’ll call this the RSD [Role, Scope, Decisions] approach.
Example A: Emergency Department Operations Manager
Here, the ED Operations Manager’s role and scope of responsibility are to ensure that patients arriving at the ED receive timely and appropriate care, and that the ED doesn’t become overloaded, thereby causing unduly long patient wait times or diversion to another facility.
Given these parameters, the chronological frame for the dashboard below is present|real time, and is focused on where and for how long actual ED patients are in the queue to receive care.
In the upper left-hand section of this dashboard is a summary ED Overload Score (70), overlaid on a scale of No to Extreme Overload. Under this summary are elements of the score: ED Triage (10 points), Seen by MD|Waiting for Specialty (10 points), Specialty Patients Waiting (20 points) and Waiting for In-Patient Bed (30 points). This summary provides a mechanism for the manager to monitor both the risk of overload and the key factors driving the score higher.
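The four component values above add up to the summary figure (10 + 10 + 20 + 30 = 70). Here is a minimal sketch of that roll-up, assuming the Overload Score is a plain sum of its components. The numbers are consistent with that reading, but the dashboard description does not state it outright. The thresholds for the No to Extreme Overload scale are not given, so they are left out.

```python
# Component points as listed on the dashboard.
components = {
    "ED Triage": 10,
    "Seen by MD|Waiting for Specialty": 10,
    "Specialty Patients Waiting": 20,
    "Waiting for In-Patient Bed": 30,
}

# Assumption: the summary score is the simple sum of the components.
overload_score = sum(components.values())

# Rank the components so the biggest drivers of the score come first.
drivers = sorted(components.items(), key=lambda item: item[1], reverse=True)

print(f"ED Overload Score: {overload_score}")
for name, points in drivers:
    share = points / overload_score
    print(f"  {name:<34} {points:>3} points ({share:.0%} of score)")
```

Ranking the components this way surfaces what the manager most needs to see at a glance: here, patients waiting for an in-patient bed contribute the largest share of the score.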
Additional information on the dashboard helps the Manager analyze the patient census across census categories, and see how many cubicles are currently in use vs. available for examination and treatment. Bar graphs in the middle section display average wait times, in minutes and by patient triage level, in eight (8) categories such as Arrival to MD Evaluation (compared to a hospital goal) and ED Length of Stay (LOS). The lower left-hand display projects when additional cubicles will be available (blue signals available cubicles; orange, a shortage); the lower right-hand one shows information on patient wait times by sub-specialty.
All of this dashboard’s metrics are designed to help the ED Operations Manager identify active and potential bottlenecks, and to act to meet the objectives: delivering timely and appropriate care, and avoiding ED overload.
Example B: Emergency Department Executive Director
Here, the Executive Director’s role and scope of responsibility are to ensure that not only is the ED team providing timely and appropriate care, but that reimbursement is not forfeited because pay-for-performance (annual|contractual|third-party|value-based-purchasing) goals are missed.
In response to this role’s needs, display time frames include both Month to Date and Current and Previous YTD performance, allowing the Director to stack current performance against agreed-upon targets for metrics tied to third-party reimbursement, as well as potential opportunities for improvement.
At the top of the dashboard is a table of summary performance metrics (number of patients seen and treated; time in diversion due to ED Overload) for the current month vs. current and previous years, and change over time. The middle bar charts provide the Director with the current month and YTD performance compared to targets for the metrics often tied to third-party, value-based (pay-for-performance) reimbursement. The deviation graphs at the bottom of the dashboard provide context for monthly performance compared to targets trended over time.
In this dashboard, summary metrics help the ED Executive Director monitor overall performance, identify areas for improvement in delivering timely care and avoiding ED Overload, and ensure that reimbursement is not lost.
Shifting from asking your customers what questions they need to answer to asking them to describe scope, role, and decisions may seem like a distinction without a difference. It isn’t. Framing inquiries this way stimulates everyone to step back and examine what is required to support a universal, shared goal: acquiring the right information – at a glance – to work toward goals and objectives, and hit those targets, quickly, confidently, and well.
Guests in our home are often very generous with their compliments on my cooking skills. While I sincerely wish those compliments were deserved, the sad (and, okay, shocking) truth is that they are not.
I’m not a great cook: rather, I am an excellent assembler of food that other people have created. I know where to shop, and the way to put together terrific dishes, and I know how to pour a generous glass of wine (or three). These skills appear to convince people that I know how to cook.
Here’s another thing I’m great at assembling: fun, smart, wildly talented, highly collaborative, and productive professional teams. What’s my secret? I know that unicorns aren’t real.
Unfortunately, many health and healthcare organizations, rather than working to assemble these types of teams, persist in hunting unicorns. They assume that one person can possess every skill required to create compelling and clear analysis and reporting.
These organizations need to stop the fairy-tale hunt, and start building data-analytics and communications teams. The idea that any one analyst or staff person will ever have every single bit of knowledge and skill in health and healthcare, technical applications, and data visualization and design required to deliver beautiful and compelling dashboards, reports, and infographics is just – well, sheer lunacy.
3 Tips for Building Data-Analytics and Reporting Teams
Tip 1: Search For Characteristics & Core Competencies
To build a great team, you need to understand what characteristics and core competencies are required to complete the work. Here’s where to begin:
- Curiosity. When teams are curious, they question, probe, and inquire. Curiosity is a crucial impetus for uncovering interesting and important stories in our health and healthcare data. Above all else, you need a team of curious people! (Read my previous post about this here.)
- Health & Healthcare Subject-Matter Expertise. Team members with front-line, boots-on-the-ground, clinical, operational, policy, financial, and research experience and expertise are essential for identifying the questions of interest and the decisions or needs of the stakeholders for and to whom data is being analyzed and communicated.
- Data Analysis and Reporting. Without exception, at least one member of your team must have math, statistics, and data-analysis skills. Experience with data modeling is a plus if you can find it; at a minimum, some familiarity with the concept of modeling is very helpful. The ability to use data-analysis, reporting, and display tools and applications is also highly desirable, but another more technically trained IT team member may be able to bring this ability to the table if necessary.
- Technical: IT & Database Expertise. Often, groups will confuse this skill area with data-analysis and reporting competence. Data and database architecture and administration require an entirely different set of skills from those needed for data analysis, so it’s important not to conflate the two. You’ll need team members who know how to extract, transform, and load (ETL) and architect data for analysts to use. And while you may sometimes find candidates who have both skill-sets, don’t assume that the presence of one means a lock on the other.
- Data Visualization & Visual Intelligence. Knowledge of best practices and awareness of current research is required to create clear, useful, and compelling dashboards, reports and infographics. But remember, these skills are not intuitive; they must be learned and honed over time. And although it is not necessary for every team member to become an expert in this field, each should have some awareness of it to avoid working at cross-purposes with team members employing those best practices. (That is, everyone should know better than to ask for 3D red, yellow, and green pie charts.)
- Project Management. A project manager with deep analytic, dashboard, and report-creation experience is ideal – and like the mythical unicorn nearly impossible to find. But don’t let that discourage you. Often a team member can take on a management role in addition to other responsibilities, or someone can be hired who, even without deep analytics experience, can keep your projects on track and moving forward.
Tip 2: Be Prepared to Invest in Training and/or External Resources
- Why? Because they don’t teach this stuff in school.
At present, formal education at institutions of higher learning about the best practices of data visualization, and state-of-the-art visualization and reporting software applications is scarce, and competition to hire qualified data analysts is fierce. As a result, you must be prepared to invest in training the most appropriate team members in many of these new skills, and/or working with qualified external resources.
Tip 3: Have A Compass. Set a Course. Communicate It Often.
- The primary challenge for your team is not to simply and boldly wade into the data and find something interesting. Rather, team efforts should be aligned with the organization’s goals. This means that you must establish and communicate clear direction and objectives for everyone to deliver on from Day One. Having a compass and setting a well-defined course also help keep your teams from getting caught up in working on secondary or tertiary problems that are interesting, but unlikely to have significant impact on the main goal.
I do wish that data-analysis and reporting unicorns were real! Life would be so much simpler. But they aren’t and never will be, so I let go of that fantasy long ago. You should, too.