Data & Design: Understanding The Relationship To Measure Design Quality

Product Direction
UXDX USA 2022

Measuring design experience and design quality is not easy, but as design's role becomes more blended with technology and business results, the tactical ability to build design's database of measurability is crucial.
Here, I would like to present practical, data-based frameworks that differentiate between measuring subjective and objective aspects of user experience at scale. Design leaders must prepare for the next frontier of maturity in scaling design teams, and data is part of that adventure.

Today, I am excited to talk about one of my favourite topics, which is showing design's impact, but first a little bit about myself before we dive in. My name is Jessa. I'm currently the senior director and head of Design for All Research, Strategy, and Systems for Capital One's Auto Finance. But today I am presenting a topic on behalf of our field, not on behalf of any company, because I hope to contribute to a subject that impacts both research and design.

What are we going to learn? Well, we're going to learn a few things. Design and data have an undeniable future together. Automation is going to be part of it, and it includes designing data. We're going to learn that we can measure design quality in objective, quantifiable ways. But it is hard. And I want design leaders to know that we have a responsibility in influencing the data being built in our product lifecycle, and it's going to be a bigger role than we think.

This feels important for several reasons, but I'm going to touch on three here. One, measuring anything is incredibly hard and it's going to demand new types of skills joining our teams: design data engineers, insight developers, and business analysts. These are all potentially new types of roles that need to be filled. Two, technology is only going to get more complex and more integrated, so even measuring the usage and productivity of a design system comes into play with this ecosystem. And three, design's value is in the creativity it unlocks, but it is digitizing faster than we can measure. Our creativity is what makes our roles vital. Measuring anything, though, takes time. It takes definition, data integration, and forethought. But these are things that often come in as an afterthought. This leads to designers and design teams scrambling to justify things or holding up indirect measures as a proxy for what their teams did. Design is digitizing very, very quickly. And if we ignore the need to automate or measure it at scale, it's going to become robotic.

Here are the key takeaways I hope you're going to get from this talk. There's a difference between subjective and objective measures of quality. We often over-index on the subjective, but both can be measured. We can apply a formulated framework to help identify the low-hanging fruit that is already within our products and then influence the development process with the right type of experience. Design data is going to get easier when design leaders learn new types of skills, but it's going to be hard.

Before we go any further, you're probably thinking, whoa, but we do track data. Our product owner tracks all kinds of things: market data, downtime, compliance. And this is great. You're probably already tracking product analytics and technical failures, sales, and marketing, but that's not what I'm focusing on today. That's not what we're talking about, because we probably don't track design experience with the same rigor. What I'm talking about is design data coming from users interacting with your system, specifically crafting a view that will tell us about the design and experience quality as a by-product of the users interacting with your system as a whole. That's the difference. Because if the data we needed were there, and if the data we had today were enough, well, this would not be a very interesting talk to give at a conference. But it's not. If the data we had today were enough, our design teams would be much bigger. Design leaders would not lose headcount every year. Engineering and analysts would be part of a standard design team, and design leadership would be at the highest levels of an organization as a standard of that organization's makeup.

So what's the problem? Let's frame it up. Organizations are pressed to measure the user experience, but it often stops at a superficial level. They're often chasing things that are irrelevant to the actual user experience, or they have a catchall proxy. But we know that experience is so much more. NPS is a great example. It's easy. It relates to design. It can be quantified, but we know that there's a depth to that which is not being captured. So even if there are a few measures, there's really no systematic place to track them. We miss out on getting in on the ground floor of the development process, down to the very UI and tagging that needs to happen in the code. The cost is that the device and telemetry instrumentation lack design data standards as part of a required package in the development releases. This is a problem.

The consequences are really bigger than we think. I'm going to hit on a few here. One, it is expensive. Gathering design impact is often reactionary. It happens after development launches and it's highly manual in nature. This leads to reactive decision-making. And even if a design leader is able to prove the value of measuring design's impact with a single team, the minute there is a change in leadership, you have to re-establish that trust and re-establish that relationship. And I'm not just talking about the cost of development. In a reactionary state, you're never going to have enough researchers. Your budget is never going to be big enough. You're never going to have enough designers, and you're going to have to save your budgets for studies that are deemed, quote-unquote, truly important insights.

And design leaders, our product partners and our analysts are probably our biggest allies, but we don't often treat them like it when it comes to measuring things. Often, usability testing or design quality comes off as academic or some sort of intellectual branch of work, something that research does in the dark.
But research is not done in the dark. It's not part of only an elite practice. It should be universal. Co-creation of data-based measures is a way of sharing the burden of measuring across disciplines that really complement each other.

And when measuring is not well integrated into the tools and practices and systems, it creates burnout. Your researchers and designers get tired. It means tool support is limited for them to do their job. It creates frustration. Researchers and designers will often find themselves re-proving the same point, or the effort of measuring usability might get wasted and gather dust in some sort of drive or a slide deck presentation. And sooner or later, your team decides that they want to go solve a different problem because they can never be done solving this one. This leads to attrition. This leads to apathy. And it leads to burnout.

But there are a few challenges just within the field itself, even outside of design, and I want to touch on a few. Why is this such a problem? Well, one of the reasons is that there are few guidelines or consolidated models with standardized definitions that developers can use systematically and consistently. Most frameworks are missing design-centric data requirements anyway, and so they're not well integrated into development practices. So even how usability factors relate to each other is often left to subjective means of understanding or ad hoc guesses. This is a problem. Let's just acknowledge it. And I don't know about you, but I initially chose the creative field because I failed college algebra twice. So this is a new aspect of design. It is a frontier that's relatively unexplored, relatively unstandardized in the design practice, especially when it comes to the development lifecycle.

And while we may not have gotten into this field because of math and calculations, it is in our future. Once upon a time, there was a graphic designer and technical architect. Those were the roles in a design team. Today, as a designer, though, you have a role to play. To be an effective designer, you have a responsibility to acknowledge that you need to understand the data that's used for your system, the data that is in the devices and services, and especially the humanity in how it's used. This is hard. Measuring quality as a context of design use is probably not on top of what design leaders do, but it is in our future because design is an evolving practice and it always has been.

Humans want to measure things because we want to express and verbalize the importance of our experience. Because when you can measure what you're speaking about and express it in numbers, you know something about it. But when you cannot measure it, when you cannot express it in numbers, it may be the beginning of knowledge, but you have scarcely advanced to the stage of science. And what I hope this inspires, and what I hope you get out of this, is a systematic way of seeing how you might approach breaking down the complexity of numbers, forming allies with your product partners, and challenging your company to express with numbers the impact of design.

So let's dive in. When it comes to measuring anything, including quality, quality is not one thing. It's nuanced, but there is a difference between subjective and objective measures of quality. Both can be measured with the same category lenses. We're going to break those down. For the purposes of the talk today, I want to focus on usability, looking at objective ways that we can push standardization, instrumentation, and measuring at the product development level, thus freeing ourselves and our teams up to really take on the complexities in that subjective space. To do this, we decompose usability into three things: factors, criteria, and metrics. First, you determine the design factors that make up quality. We're all familiar with the big three: efficiency, effectiveness, and satisfaction. But over the years, other factors have become dominant in the field, things like trustworthiness and safety. And especially since the eighties, additional studies have started to include these factors. For example, safety is one that has had a recent spotlight. Safety and data. Safety and identity. Physical safety. But how do we measure these? How we quantify design in these is really open in the field. It is fairly new in determining quality. This is the final frontier.

Next, each factor is actually made up of criteria. This is a subset of information that defines it. For example, what is efficiency made of? Well, when things are efficient, they're understandable. They make really good use of your time. Their operations are well performed. Resources are optimized. That's what being efficient actually means. You can break this down. Learnability, let's use this one. When something is learnable, when an experience is learnable, it is familiar. It is consistent. It provides user guidance. That's what learnable means. So you decompose and measure the encompassing criteria for each of those design factors. Things like hint text, microcopy, or minimal actions become not just design screens but actual components that you can measure to understand if something is learnable for your user. This is the second layer. These criteria describe concrete, measurable indicators that you can use.

Definition of these is key, not just for your design team, but for your product team as well. It is critical that you define what you mean and what you don't mean. Let's use understandability as an example. Here I am saying that understandability means how capably a product or piece of software conveys its purpose and gives the user clear assistance. What I am not talking about is self-reported understanding. This means that we can use our design system. We can ask things like, what help text do we have? Do our components have the right microcopy? Can we measure how many times someone has clicked on a help link? These immediately draw a distinction from a qualitative state of something being understandable, where the user is saying, "Yes, I understand what you need me to do." It actually becomes a back-end indicator that your researchers can use before there's actually a problem. It's the red light signal that tells you, look here, there might be an issue.
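To make the help-link question concrete, here is a minimal Python sketch of one possible back-end indicator. Everything specific in it is an assumption for illustration, not something from the talk: the event name `help_link_click`, the shape of the event records, and the 25% threshold for raising the "red light" are all hypothetical.

```python
from typing import Iterable, Tuple

HELP_EVENT = "help_link_click"  # hypothetical event name, not from the talk


def help_click_rate(events: Iterable[Tuple[str, str]]) -> float:
    """Share of sessions that contain at least one help-link click.

    `events` is an iterable of (session_id, event_name) pairs.
    """
    sessions = set()
    sessions_with_help = set()
    for session_id, event_name in events:
        sessions.add(session_id)
        if event_name == HELP_EVENT:
            sessions_with_help.add(session_id)
    if not sessions:
        return 0.0  # no traffic, nothing to flag
    return len(sessions_with_help) / len(sessions)


def needs_attention(rate: float, threshold: float = 0.25) -> bool:
    """Raise the 'look here' signal when the rate exceeds an assumed threshold."""
    return rate > threshold


# Made-up sample data: four sessions, two of which reached for help.
SAMPLE_EVENTS = [
    ("s1", "page_view"),
    ("s1", "help_link_click"),
    ("s2", "page_view"),
    ("s3", "page_view"),
    ("s3", "help_link_click"),
    ("s3", "help_link_click"),
    ("s4", "page_view"),
]

if __name__ == "__main__":
    rate = help_click_rate(SAMPLE_EVENTS)
    print(f"help click rate: {rate:.0%}, needs attention: {needs_attention(rate)}")
```

Counting sessions rather than raw clicks is a design choice here, so that one confused user clicking repeatedly does not dominate the indicator. The right threshold would have to come from your own baseline.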

So as you define the design factors and the subset of criteria, you start to create a matrix. You can see at a glance the product characteristics that are important to how you evaluate the interface. You can identify overlapping areas. Now, these are just some of the outlines. Breaking down all ten categories creates about 100-plus combinations, but you can see where there is overlap. What could we measure once that actually encompasses more than one design factor? The idea is the same. You systematically identify the criteria within.
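One way to picture that matrix is as a mapping from factors to criteria that you can invert to find the overlaps. The Python sketch below is only an illustration: the factor and criterion names loosely follow the ones mentioned in this talk, but the exact assignments (and the criterion `completeness`) are assumptions made up for the example.

```python
from collections import defaultdict

# Illustrative factor -> criteria matrix. The assignments are assumptions
# for this example, not a definitive model.
FACTOR_CRITERIA = {
    "efficiency": {"understandability", "time", "resource use", "minimal action"},
    "learnability": {"familiarity", "consistency", "user guidance", "minimal action"},
    "effectiveness": {"understandability", "completeness"},
}


def criteria_to_factors(matrix):
    """Invert the matrix: for each criterion, which factors does it inform?"""
    inverted = defaultdict(set)
    for factor, criteria in matrix.items():
        for criterion in criteria:
            inverted[criterion].add(factor)
    return dict(inverted)


def shared_criteria(matrix):
    """Criteria listed under more than one factor: measure once, inform several."""
    return {
        criterion: sorted(factors)
        for criterion, factors in criteria_to_factors(matrix).items()
        if len(factors) > 1
    }


if __name__ == "__main__":
    for criterion, factors in sorted(shared_criteria(FACTOR_CRITERIA).items()):
        print(f"{criterion}: {', '.join(factors)}")
```

With this made-up matrix, the shared criteria are `minimal action` and `understandability`, which is the kind of overlap that answers "what could we measure once?"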

So here's where it gets fun, if you think data is fun. Criteria like time and productivity or learning are made up of something. They don't just exist in a qualitative state. What begins as an abstract concept can then become a logical sequence of operations that is used to quantify a quality attribute. You have to define the numeric value. That's the next step. It might be something like countable data or a metric. Here's an example. Minimal action. How do you define minimal action, truly? Well, with things like task time, task concordance, or even completion rate. They tell you if something has taken more or less time. Here's another one: time. Time is something you can actually measure. It's probably one of the easiest ones. It's things like time on task or completion rates or throughput. How long did it take someone to go through a flow? Those metrics come with a numeric value. And if you're measuring time and you know the numeric value or the countable data, then it's just a matter of doing math, like actual literal math. Using that same example of time, those metrics have potential formulas that are related to each of them. So you break down your definition into a formula.

Remember when I said that we alienate potential allies? Well, this is where your quantitative researchers, your data analysts, your statisticians are your best friends. They swim and live and breathe in this data every day. So you partner with them because your next question is probably, where do we get this sweet, sweet data? Here's the good news: it exists. It has to be somewhere. The bad news is that it probably doesn't all exist in the same place, and your product partners may not be aware that it exists. What you need as a baseline might be somewhere already, but it might be in a business format or lost within a monthly operational review. Beneath product analytics, though, at its most base layer, it exists.
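Going back to the time example, here is a minimal Python sketch of what "breaking a definition into a formula" could look like for time on task, completion rate, and throughput. The formulas are common working definitions chosen as assumptions for this sketch, your analysts may define them differently, and the `TaskAttempt` record and sample numbers are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class TaskAttempt:
    """One user's attempt at a task (hypothetical record shape)."""
    started_at: float  # seconds since some reference point
    ended_at: float
    completed: bool


def _require_attempts(attempts: List[TaskAttempt]) -> None:
    if not attempts:
        raise ValueError("need at least one attempt")


def completion_rate(attempts: List[TaskAttempt]) -> float:
    """Completed attempts divided by all attempts."""
    _require_attempts(attempts)
    return sum(a.completed for a in attempts) / len(attempts)


def mean_time_on_task(attempts: List[TaskAttempt]) -> float:
    """Average duration in seconds, counting completed attempts only (an assumption)."""
    durations = [a.ended_at - a.started_at for a in attempts if a.completed]
    if not durations:
        raise ValueError("no completed attempts")
    return mean(durations)


def throughput_per_minute(attempts: List[TaskAttempt]) -> float:
    """Completed tasks per minute of total time users spent on the task."""
    _require_attempts(attempts)
    total_seconds = sum(a.ended_at - a.started_at for a in attempts)
    if total_seconds <= 0:
        raise ValueError("total time on task must be positive")
    return sum(a.completed for a in attempts) / (total_seconds / 60)


# Made-up sample data: four attempts, one abandoned.
SAMPLE = [
    TaskAttempt(0.0, 30.0, True),
    TaskAttempt(0.0, 60.0, True),
    TaskAttempt(0.0, 90.0, False),
    TaskAttempt(0.0, 60.0, True),
]

if __name__ == "__main__":
    print("completion rate:", completion_rate(SAMPLE))
    print("mean time on task (s):", mean_time_on_task(SAMPLE))
    print("throughput (tasks/min):", throughput_per_minute(SAMPLE))
```

The point is less the specific arithmetic than the fact that each metric needs only a few countable signals (a start, an end, a completion flag), which is exactly what you would ask your product and analytics partners to find or instrument.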

There's a realm of possibilities and variables. So why does this matter? Because if you, as the designer, know the insights that tell you the experience quality, and you can articulate the specific dimensions of information that you need, you can say to your product person, "Hey, go find me this, and I will create a formula." More than likely, there's some unsung hero on the product or analytics side, a gatekeeper of data, who wants to partner with you. Because I promise you this: your business cares about this as well.

Taking those formulas, the next step is identifying or defining the product interaction that has the highest probability of yielding some sort of measure. Here's an example of the exercise that we went through. The trick is auditing your landscape, or several landscapes if you have several products. The goal is to understand what data is being captured, because that will tell you what's missing and what it's going to take for a developer to code for the missing metrics. So what started out as a mad game of find a needle in a burning haystack that is somehow also in space turns into the more logical approach of a gap analysis. It might not be as simple as what you're seeing on the screen. In fact, I can guarantee you it's not as simple, but it gives design leaders, especially those at a crucial growth point with their design teams, a compass. It speaks the language of business and the business of design. This allows you to see things like, what can we measure immediately? What can we measure with a little bit of work? And what's going to take a significant amount of time and investment? This is your low-hanging fruit. This is how you figure it out.
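The gap analysis described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exercise shown on the slide: the signal names, the metric-to-signal mapping, and the rule of thumb that buckets effort by the number of missing signals are all hypothetical.

```python
# Signals each metric would need (hypothetical names and mapping).
REQUIRED_SIGNALS = {
    "time on task": {"task_start", "task_end"},
    "completion rate": {"task_start", "task_complete"},
    "help link usage": {"session_start", "help_link_click"},
    "error recovery": {"error_shown", "error_resolved"},
}


def gap_analysis(required, captured):
    """Compare what each metric needs against what is captured today.

    Returns {metric: (effort_bucket, missing_signals)}. The bucketing rule
    (0 missing, 1 missing, 2+ missing) is an assumed rule of thumb.
    """
    report = {}
    for metric, signals in required.items():
        missing = sorted(signals - captured)
        if not missing:
            effort = "measure now"
        elif len(missing) == 1:
            effort = "a little work"
        else:
            effort = "significant investment"
        report[metric] = (effort, missing)
    return report


# What the product already logs, per your audit (made-up example).
CAPTURED_TODAY = {"session_start", "task_start", "task_end"}

if __name__ == "__main__":
    for metric, (effort, missing) in gap_analysis(REQUIRED_SIGNALS, CAPTURED_TODAY).items():
        gaps = ", ".join(missing) if missing else "nothing"
        print(f"{metric}: {effort} (missing: {gaps})")
```

In practice the effort estimate would come from your developers rather than a count of missing signals, but even a rough table like this gives you the three buckets: measure immediately, measure with a little work, and measure only with real investment.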

And while this may seem daunting, keep in mind that it's really no different from how design leaders and design teams have broken down the compilation of design systems into digestible qualities for our product teams. We do it all the time, but we use words like atoms and molecules instead of metrics. But it's the same. In a design system, which we're familiar with, you have your atoms. Then you create your molecules, which create your components, which you then build into patterns. We use patterns to tell us what we don't have to recreate every single time. Use this pattern for this landing page so that we can focus on other work. Well, in design quality measures, it's the same, but you just flip it around. You take your design factors, define the design criteria within those factors, pull out the measurement for those criteria, and then find the data. It's just flipped around. So what are the benefits? Why do this?

Well, let's talk about money. It can reduce the cost of usability testing if we provide a consistent basis for understanding and comparing different usability metrics at scale. One of the teams insisted that the product team develop and track three things: time on task, I think, completion rate, and throughput. Doing this reduced the lead researcher's time to provide a benchmark from two weeks to half a day. We were able to get answers a lot faster.

It complements expert evaluation with objectivity. Moment-in-time quantitative research is really expensive, and researchers really struggle to find sample sizes that are large enough to evaluate their findings. But what if we could get massive amounts of information from the device itself? On the same team, the lead researcher was able to prove that even though the NPS was high, I think it was in the forties or something, the usability quality was really low. It was so low that it really didn't make logical sense to make a product decision based on NPS. The researcher was able to prove that there was a whole population of people who were not taking that survey and were having a very bad time with the product itself.

And it lays the groundwork for clearer communication on usability measures between your designers, your design experts, your developers, and your product team. Once this development team had implemented the code for just one of these usability qualities, time on task, there was a light bulb moment. When we implemented the design changes, the time on task went down from, I think, 3 minutes to 30 seconds, and the development team could see it.

They didn't need research to come in and do another study and prove it. Without having to be in the room, they were able to look at and understand the usability of a system. One of the developers said, "Our usability measures have become the basis for how we understand if we've developed what design has given us." I'm paraphrasing, but that was the gist of it. And then finally, one of the biggest benefits of democratizing usability measurement is creating a foundation of consistent measures that are part of telemetry and instrumentation. Data engineering, design, research, and product can all gather and rally around a common understanding. You can demystify what we need to talk about.

Usability and quality have clear connections, but this all needs to be taken with a grain of salt. Context is important. The connection between these things is the context of use. How humans are using this really matters, because usability without the context of use is meaningless, and design without the context of humanity is useless. So here's where we're going to end. We're going to end with a reminder of the context in which data, and doing all of this, is most valuable, which is understanding humanity.

As humans, we want to know things. We are drawn toward numbers. We want to quantify our experience. We are also lazy as all get-out, myself included. And we tend to fall back on just what exists already instead of striking out to find what is new. This is hard. Things that are hard to measure are usually what are most worth measuring, but they usually don't exist yet. All of this is going to be difficult because it's going to mean creating something new, and we can't do it alone. New things scare us. But that doesn't mean that it's not valuable. However, I want to give a caution that we are prone to key fallacies as humans when it comes to measuring what is hard to quantify. This is not just in UX design. We're actually going to take a lesson from history here. There are four main fallacies that we tend to fall back on when it comes to measuring what's difficult.

First, fallacy number one: we measure whatever is easiest, and we measure that first. That's fine. That's okay, whatever. But this is where we tend to stop. Like, NPS is easy, it's cheap, it's okay. It works. But how many times have you stopped at NPS because that's all you need?

So fallacy number two, we then disregard what isn't easy to measure or we give it a random quantitative value. This is where it gets kind of dark. When companies try to measure things that are hard or give them a quantitative random value, it's misleading and it's fake. How many times has a company tried to measure productivity? You see videos of people looking at their computers or people using fans to move their mouse because their computer is being tracked. That's a misleading level of indication of how productive you are because if there's one thing we know about humans, it's that we're incredibly creative when it comes to finding ways to outsmart the system.

Fallacy number three. Next, we assume that anything that cannot easily be measured is not important. This is blindness. Employee well-being is not easy to measure, but it's important. Safety in the workplace is important, but that's hard to measure. Inclusive and equitable design is definitely hard to measure, but it is important.

And fallacy number four. Finally, we say that if something cannot easily be measured, it doesn't exist. But when we say that things do not exist because they're not easy to measure, we're saying that our reality is the only reality. This is suicide. The climate crisis is very hard to measure, but it's real and it's here. Human trafficking is hard to measure, but we should never say that it doesn't exist. The human experience is not easy to measure, but it does exist. And what I hope I've conveyed is a path forward, a first step in breaking down that which is difficult but necessary. And these fallacies are not new. They're actually from history. They're called the McNamara fallacies.

And I'm going to give you a trigger warning for the next slide. We'll be talking about war. Robert McNamara was the US Secretary of Defense during the Vietnam War. And being from Ford, he was a numbers guy. Body count was determined to be the metric used to understand if the US was successful or winning. It was the easiest thing to measure. And it was a horrible mistake. Body count doesn't take into consideration things like chaos, destruction, terror, public mood, or even strategic progress. It was just the simplest thing to measure. Measures should reveal what we need to know, not what is easiest.

And measures should reveal what we need to know about design. We don't just need to know NPS. We need to know NPS and accessibility. We don't just need to know sales. We need to know sales and productivity. This ultimately is a reflection of the internal philosophy and design strategy that you hold as a design leader.

We need to measure what is worth understanding. Worth is a balance of value over cost, and too often cost supersedes because it is easy to put a number to. But what are we doing as designers and as design leaders if we're not building value? And what are we doing if we're not creating value for our future? I hope this gives you something to consider. Thank you for listening to my talk, and thank you for letting me be here today.
