Lessons From A Cellist (Or, What Product People Can Learn From User Research)
This talk explores the journey of the Duolingo Stories product. With a team under pressure to produce results and an organizational culture that prizes rigorous AB testing, we'll explore how qualitative user research became part of our product development process, how it helped us build a product roadmap, and what we've learned along the way.
I want to tell you a little bit today about what I learned from a Cellist not a linguist, but a Cellist, and how lessons from that Cellist changed the trajectory and the roadmap for a product at Duolingo.
Just give me one moment, I want to get my clicker together.
So who here has used Duolingo?
All right, most of you. For those of you who are not familiar with Duolingo, it's a language learning platform, it's one of the most popular in the world, we have about 300 million active users. And our approach to language learning is that it should be something that's fun, that's fast, and that's easy. And so most things at Duolingo are bite sized. And most things at Duolingo are short and fast. However, I'm going to tell you about a product that kind of bucked that trend, and some of the changes we needed to make along the way.
So first, I'll talk about the problem I faced when I joined Duolingo as a brand new product manager, I'll then talk about the process that we took towards fixing the Duolingo stories product. And then I'll share some learnings that hopefully will apply to all of you and your work and product.
So, to start off: the problem. This is what I walked into in June of 2018. And I don't know if you can quite see it. But basically the Duolingo stories product had peaked in about February. And since then, the metrics had been diving down and down and down, and were really bottoming out by about June. And management had a lot of questions for me as someone new to the team, namely, "What are you going to do about this?"
So first, what are Duolingo stories?
Most of you, if you are familiar with Duolingo, are familiar with the quick bite sized lessons that I mentioned earlier. However, Duolingo stories grew out of the need for what is called Discourse Level Language Instruction. I won't get into the linguistic specifics of it with you, though they're fascinating. Suffice it to say that when you use a language, you need to really internalise that language, you need to use it in context, you need to see how all of the different pieces that you're learning come together and how the language exists within the world of the people who speak it. This is something that Duolingo has struggled with; our approach has always been fast, fun, short, quick. And it's really hard to give the context that someone who's new to a language needs to really understand how all of that fits together. And so Duolingo stories was born out of the idea that if we could take storytelling, make it fast, make it fun, and pair it with native speaker audio, we could get at this idea of context driven learning. This is very important for us as a company that wants to go beyond just initial instruction in a language and instead take users from 'soup to nuts', from start to finish, in their language learning journey.
However, the metrics were not looking good. And so as a new product manager on the team, aside from declining DAUs, I also had stagnant retention on D1, D7, D14; nothing was looking positive. Further, there were some additional cultural complexities at Duolingo.
So for example, we had a legacy of past experiments. When I joined the team, we were just coming off of a number of different major changes to the product that had just gone nowhere. The results were terrible, the team was demoralised, we just couldn't seem to figure out what we needed to turn this product around. Similarly, the team was really struggling to say, "Well, what should we do next?" There wasn't a clear direction for the product, there wasn't a roadmap, we really weren't sure where things were going. And that was kind of compounded by the fact that we just didn't have a lot of resources. We were sharing one product designer with a couple of other teams, we had a couple of great devs, who could only do so much in a day. And then we had a content operations team that, as you can imagine at Duolingo, where a lot of what we're serving up is content, was spread very thin.
This also bumped up against a bigger problem, and one that is particularly true at Duolingo, in that we have a very, very, very strong testing culture. We A B test everything. Every piece of the experience that you have encountered at Duolingo has been the result of a winning experiment. And on top of that, when you're using Duolingo on any given day, you're subject to dozens and dozens of experiments simultaneously. This is in our DNA. This is the core of what we do; we really believe in A B testing. And that's great, but it also means that you live and die by the A B test, and if our A B tests aren't looking great, people want to take action about it. And management was really, really concerned that we were trying all these different things and they were not working. And they were realistically quite skeptical about what we were doing.
And so we were kind of at this big inflection point where we said, we don't know what we need to do. Things are looking really bad on metrics. And management doesn't really believe in us anymore. What do we do next? And this is what I walked into in 2018.
So, how did we go about solving this?
Honestly, I'm not going to tell you anything that's rocket science. But thinking back to how important A B testing is, what I proposed, sort of bucked the trend at Duolingo. And I had to really convince stakeholders to get on board. And when I show you this next slide, you're going to sort of scratch your head and say, "Why?"
Basically, I said that we don't understand our users. It was very clear to me that we hadn't spent enough time getting to know the people who used this very specific discourse level language learning tool. Yes, Duolingo has interviewed many, many, many of its users over the years, and we kind of had a picture of how they think. But no one had really had the bandwidth or the resources to say, what about the stories users? Are they different? Are they meaningfully different from other Duolingo users? So that was my first step was to say, Okay, let's actually just have a crazy idea, let's talk to our users, let's see what we can learn from them, then, if there are specific things that they call out as being problematic, or challenging or difficult? Well, let's just try to test that with usability testing, let's confirm some of the hypotheses that our users lead us to.
Further, I really wanted to build a roadmap for the product, and I wanted it to be centered around pain points. So it's hard with a product that was as nascent as Duolingo stories to say, you know, we know exactly where this product needs to be in a year's time, because like I said, we didn't even necessarily understand our users. And so the way I sold this to management was that, you know, I'll take everything we learn, I'll convert it into pain points, and we'll prioritise and work on those pain points. And that'll be our roadmap. And that's the prioritise and execute part: we were really going to have to take what we learned and just crank away at it. That was the vision.
So how did we do this? And how did user interviews play a part in this? We're getting to the Cellist. The most important thing we did was really step back and talk to our users and say, 'We need to understand them in a way that we haven't before'. And so I wanted to interview as many users as I could find, and that included active users and churned users. Now I'll stop for a minute and say, churned users are probably the most valuable users you can interview, if you have the opportunity. Why is that? They used your product, you made a promise to them with your product. And for whatever reason, you weren't able to keep that promise. And the user said, "I've had enough, I have to go". I was really curious about their experiences. I wanted to really get inside their heads and say, "What did you think you were gonna get? What did you actually get?"
Active users, on the other hand, are also a great resource. They'll tell you, "Hey, this product, it's working for me. Maybe there's some friction, but I'm still into it. I'm still here. What can you do better?" And for Duolingo specifically, I also really wanted to understand users from all our courses.
So Duolingo has 300 million users, like I mentioned, and we teach a variety of languages; we have over 90 courses available. However, stories just had four courses. We had Spanish, French, Portuguese, and German. But Spanish was by far our largest course; it was going to be much easier to find users in the Spanish course than in our Portuguese course, for example, which had maybe a quarter of the users. And I said, "No, no, we really need to talk to users from across this product, because the motivations that each of these users have are going to be different based on the language." So it took me quite a while to find the right people to talk to, and in some ways I went about it in sort of an unsophisticated way. I placed an ad on Facebook. I had a very elaborate screener, which sort of filtered users by their experience with the product, their ability in the language they were trying to learn, and whether they were active or churned. And honestly, it was a lot of work to find the right folks to interview. Some folks would, you know, fill out the survey, and you'd never hear back from them. Other folks would think that stories were a different product altogether, and you'd start talking to them and you'd say, "I think we're done here."
But the effort was really worth it, even though it took a ton of time: nearly an entire quarter was spent interviewing these users. And the most important thing I did was to conduct the interviews in a really open ended way.
One thing that I had observed in a couple of the places I've worked, and what I saw, generally as a trend in products, is that we use user interviews if we do them at all. We use them as a way to validate ideas we already have. We say, I want to test this feature, I want to test this change. How would a user respond to this? What do you as a general user of a consumer product like Duolingo think or feel about this? Do you like it? And that's not a bad way of going about asking questions. However, if you were in the place that I was in, where you didn't even really know, what this product was about, who it was for, how it was working, I was concerned that asking really product specific questions, at least at first, would sort of send me down the wrong path, and would have me think, oh, I need to instead work on, you know, the content, or I need to work on the delivery of these stories, or I need to work on the onboarding. I don't even know what our users are looking for. I don't even know yet what the promise that we think we've made to these users is.
So instead, I really just wanted to get inside the head of our users. I would actually sit down with them. These interviews took hours and hours, and I would sit with users and I would just say, "Walk me through your day. How does your day start? What does your day look like? Why are you learning German or French or Spanish? When do you try to learn this language? What is the motivation? What's going on here? What's your routine like?" And I would really get a clearer picture not only of how our users were using Duolingo, but of why they were trying to learn a language in the first place. Because ultimately, that's what we want to do: we want to help our users actually learn a language. And then what I wanted to do is say, "Okay, after I've collected all of this information, we need to validate it. We need to use usability testing or other formats to sort of confirm or deny the hypotheses that these unique users are sharing." And this was very important for selling it. At Duolingo we really believe in data. And for a long time, I think we were a lot more skeptical of our users. The argument from management would go, "You talked to 10 users out of 300 million; that is not a representative sample. And the plural of anecdote is not data." And this was really something important for us to kind of push back against and say, 'Yes, we're not going to talk to a representative or a statistically significant sample of our users. But we need to get inside some of these users' minds.' And what's true for these users is probably true for 85% of our users; we really need to understand them. And it's just that we're not going to see it in the testing data. We're going to see it in these interviews. But once we learn something from the users, we can try to replicate or validate that hypothesis, using resources that scale.
So the key takeaway for me, the key learning I got from this is that you need to let your users teach you. As product managers, we really think about optimising and we think about processes. And it's not always the case that we step back and say, "Maybe our users actually know better in this case, maybe our users are the ones who have the answers. Maybe our users are the ones who have a vision for this product that we can build upon." And this is where Amy, a concert Cellist comes in.
So in the course of these interviews, I spoke to dozens and dozens of people. And one of the fun things about doing user interviews is that you learn that people have all kinds of fun jobs, all kinds of crazy careers. And I spoke to one person in particular, Amy, who was a concert Cellist. Her full time gig was sitting there, playing the cello at concerts in the evening, I guess. And she really wanted to learn German for a few different reasons. She really wanted to learn German, in part because her partner was German speaking. And he had family that was still in Germany, and she liked to go visit. She also was really inspired by German music. She really cared about Brahms and Bach, and she wanted to, you know, get a little bit closer to them. And she felt that German was one way to do that. Sounds good! And she'd been learning German with Duolingo for well over a year. She had gone through the entire German course from start to finish. She had an elaborate system for practicing and reviewing. And Duolingo stories were part of that system. She would sit down with the product each day. She'd work her way through a short story. She would try to memorise the words, learn the details, and this was just part of her day. And I got really curious about this. I was like, "What is it that draws you to stories, time and time again?" Because throughout the interview, she pointed out all of these weaknesses in the product. She was able to say, you know, "Duolingo stories don't provide me with enough context to understand the language, or I really struggle with these specific words. I think the difficulty is really great. You know, this is just not working for me. But I come back every single day." And this was really curious. So I said, "Why is that? What is it about this product that keeps you coming back for more and more every day? How does this work for you? How does this fit into your journey as someone trying to learn German?"
And what she said to me kind of blew us away, she gave this perfect metaphor that only a cellist could come up with, which was, 'Duolingo is like practicing scales. Stories are like sight reading.'
Now, I don't know how many of you were in band in high school, but this will probably resonate with you if you were. Basically she was saying that the process of going through and learning individual words or individual verbs, the bite sized approach that Duolingo has taken to language instruction, was her starting point. It was like learning the notes on the cello, learning the scales, learning how you move methodically through kind of the mechanics of a language. But stories for her was like sight reading. It was taking all of these different disparate pieces of information that she had learned in a concrete system, and now suddenly throwing them out into the wild and saying, "Here's a little bit of support, here are the notes for you to read. But you have to actually go about trying to make sense of all of this. You have to move through this experience using the German that you've learned in ways that you didn't expect and that you can't anticipate." And she saw tremendous value in this. This was a big moment for us. This might seem very silly or simple to some of you. But in this moment, our understanding of the product sort of crystallised and we said, "Yes, that is what this is, it is the sight reading counterpart to Duolingo scales, this is really important." And it's something that we never would have reached on our own if we hadn't talked to the Amys of the world. This metaphor, which became incredibly important to the team, would have never come together, it would not have coalesced. And after talking to Amy, we saw our product in an entirely new way. And suddenly, all of these different pieces of information that we were hearing from our users in interviews, or A B tests, started to make a little bit more sense. And we said, "Okay, with this in mind, how do we build the best sight reading tool for a Duolingo user?"
What we did, essentially (don't worry about reading the specifics here), was take all of these pain points and, through the frame of sight reading, prioritise them based on impact: if we made a change, what would the likely impact be in the sense of how it would help a user better sight read a new language? And this became our roadmap. We were able to convert all of these different approaches into a plan for several quarters, I think three or four different quarters. And we wouldn't have gotten there without Amy, we wouldn't have gotten there without this kind of unique, close experience. It was so illuminating what she shared with us.
So we were able to make a ton of changes to the product. And the specifics are quite unique to Duolingo stories. But basically, it broke down along content and product. On the content side, we realized that we were asking people to sight read music they weren't really ready for. We were asking people to try to play Bach or Beethoven, when what they really needed was like Chopsticks, or Hot Cross Buns. And so what we did was really sit down and actually tease apart our content and entirely rebuild it from scratch. We started saying, "Okay, what are the scales we've already taught our users at this point in the course? Using those scales, how could we create a story?" This totally rewired our approach to content production at Duolingo. This was really big. Before, we had only sort of written a story targeted at a specific difficulty level. This being a European audience, you might be more familiar with the CEFR, which is the Common European Framework of Reference. It's a system that allows you to sort of categorise ability in a language, starting from the most basic, A1, to the most difficult, C2. If A1 is saying, "Hi, my name is Connor", C2 is probably Barack Obama talking about constitutional law. That's the spectrum.
We had been writing our stories primarily at the B1 level, so quite intermediate. And for our users, that was just not where they were at. They had learned A1 and maybe A2 level scales, and in this case, scales would be verbs and nouns and adjectives. And so we needed to totally rethink what we were doing. And instead, we started looking at A1 stories that were much simpler, that a user after just a few lessons with Duolingo could actually understand, or A2 stories that were a little bit more challenging. And this relates to the aha moment that Amy, the Cellist, pointed out to me. For her, this rewarding experience of using a story came when, for the first time, you were able to put all the different pieces you had learned together into one kind of unified moment, where you finally, like, finished (I guess cellists play like that) your first piece of music that you were sight reading, and you said, "Wow, I could actually understand a story in a new language." So this was very important.
Similarly, other users pointed out amazing product changes that we were able to execute upon.
For example, we had one user point out, and she said, "I really like what you're doing. I really like the approach to stories. However, I hate that I have to go and look up a new word every time I don't recognise it. I have to go to a different page, look up the word, find what it means, maybe learn its gender, all this stuff. It's just a pain." And we sort of scratched our heads and said, "Well, actually, did you know that you can do that in the product, that if you just hover over a word on the screen, you get an explanation or a definition of that word?" She said, "No, I don't think you guys have that. I never heard about that." And we realised that this key feature that made the product really work, the ability to get a translation at any point in time for words you didn't know, was a piece of functionality that many users were just not aware of.
So for example, we changed our onboarding flow with a really quick change, which forced a user to reveal a translation hint. It led to, I think, a 7% improvement to our D1 retention. Simple, but something that we wouldn't have necessarily seen in an A B test; we weren't really aware that there was a problem with our onboarding. And there were many, many changes that came about that were in the same vein. Some of them actually weren't even successful.
So for example, we had users say time and time again that they didn't want to just see the words, but they wanted to actually be able just to listen to the words. They thought, "You know, in this story, yes, I've read it once. That's great. But now, I want to actually practice my listening comprehension." And we said, "Okay, sure, we'll try that." And so we created a version of the product where all the words were covered, and you could just practice your listening. The experiment failed. And I think the reason is that we were too early; we hadn't yet updated our content. And so we were taking content that was too difficult, and we were asking users to now do it with their eyes closed, basically. And that didn't really work. And so in this instance, we said, "Okay, looking at our prioritised list of things to change, it's not that this failed, it's that it failed right now." And we said we'd come back, and we would revisit it. I think at some point the team will do just that.
We also just gave the entire product a big overhaul, I mean, the design really, really changed as a result of talking to users. This product today looks nothing like what it did in 2018. It's night and day. And the results have really made us proud of the fact that we spent this time getting to know our users.
So just as a reminder, this is where we are today. And you can see September 2019, things look really, really good. Again, this is what I walked into in June of 2018. Here are things today, and here's specifically where I joined the team. So you can see that things have really skyrocketed. Over time, there has been this compounding effect of the different changes we have made to the product that have really driven remarkable growth and have saved the team from what looked like certain demise.
So what did we learn here? What are some of the takeaways? The TL;DR is that:
First, it's paramount to invest in understanding your users. This is something that I think, especially at smaller firms, or firms that are resource constrained, can feel challenging: to say we're going to spend weeks and weeks doing this qualitative, fuzzy, funky research with our users, we're going to spend days just getting to know them one by one by one. But it's so important, especially if you don't have a clear product market fit yet. It's so important to take that time and get inside your users' heads. Understand what they're trying to do.
Secondly, that aha moment: you have to know what your user's aha moment is. For Amy, it was that moment of realising that all these different notes that she was learning, all these different nouns and adjectives and verbs, could come together into a story that she could understand. We realised, okay, we need to optimise for that moment. How do we deliver that? Whether it's simplifying the content or changing the onboarding, how do we build this experience so that our users will say, "Oh, this was perfect, this was just what I needed"?
Next, you need to be able to prioritise ruthlessly and realistically. There's two parts to that. There's understanding: here's what we really need to do, and here's the impact we'd get from doing it. And then there's the ability to say "No"; to say, "Yes, we can make this change now, but maybe that change has a lower impact, and maybe it's also too early." If we had stuck to our roadmap religiously, I think we would have had even better results sooner, but we sometimes got sidetracked.
And then lastly, and this is, I think, a very important learning that we've had at Duolingo in particular: A B tests are only as good as your product sense.
Meaning: yes, you can set everything up as an A B test. But if those changes are not based on a deep understanding of your users, and what they need to find value in your product, it's garbage in, and garbage out. And this has had a tremendous impact on how many teams at Duolingo approach their customer research. It's no longer just about rapidly A B testing as many changes as you can put out in a quarter. Instead it's about saying, "Well, first, do we actually know our users? Do we know them really well? Can we imagine a user and really, like, talk to them? And then what might be the different things that they need? Let's test those needs first." That's really new for us.
So, with all of that in mind, I thank you for your time, and if there are any questions, I'd be happy to answer them.