Data Driven Product Development Approaches

Continuous Delivery
UXDX Community Germany 2020

When developing new systems and features, we have to make decisions that might cause the whole project to fail. How can we make these decisions with confidence? How do we plan for related work? How do we measure success?
In this talk I want to give several examples of approaches that we took and reflect on how well they worked given the requirements we faced. I will cover topics such as identifying key metrics, developing towards a vision, and making decisions based on data.

This talk is about data-driven product development. I will start with a short introduction about myself. As already mentioned, my name is Flo. I'm working as a staff engineer at Spotify, dealing with authentication, authorisation and user lifecycle. Before we jump into data-driven approaches, I want to outline some key concepts that I personally think are super important to get right.

The first thing is: why even data-driven? Did you ever find yourself in a planning session where you planned certain features or certain systems that were more than, let's say, three months ahead? Then ask yourself how that went. I've been in planning sessions where I personally planned systems, or even had to create and size stories, that were more than 12 months ahead, and it went horribly. Once we reached that point 12 months later, our product looked completely different from what we had envisioned back then. So the question is: if you've encountered these situations, did you end up building what your customers really wanted? Or did you end up building what you thought in the beginning would be useful? And if you still think that you built what your customers really wanted, how did you measure this? Because in my experience it's incredibly hard to measure customer success, and there's no one-size-fits-all. It really differs from system to system and needs to be defined for each one, and probably adjusted along the way as you continue development.

So let's follow up with a short story. I usually work remotely with a team in Stockholm, but once a month I fly to Stockholm to see my teammates, meet them in person, have some one-on-one meetings, build up relationships, and build my network within the company. I usually fly from Berlin to Stockholm without thinking a lot; I mostly watch Netflix on the plane. One day, though, I asked myself: why are the windows in an airplane round? I did a bit of research, and it turns out that in the 1950s there were the Comet 1 airplanes, which were some of the first passenger jets. And they quite frequently fell apart in the air. No one really understood what the issue was, so they started banning these airplanes, saying: okay, we just cannot let them fly, because they end in disaster when they fly. So they banned the airplane, and after two weeks of investigation they couldn't find any reason why it was happening. So they said, okay, maybe we should let them fly again. They let them fly again, and directly after that the first plane fell apart again, right after takeoff. Of course, that then led to a permanent ban on flying them. But after a while they found out that the problem was the rectangular windows: small cracks spread from the corners all across the body of the airplane, and once the pressure increased, it fell apart. But the far more important finding, and the far more important development after they analyzed that, was that they introduced a black box. For all of you who don't know what a black box is: it is a small component that is really hard to destroy, which gathers data while the airplane is in the air, while the airplane is operating. Now they can use this data to constantly analyze what is going on while flying, and even analyze that data offline and make constant improvements to the airplanes.
And I think that if we had just started by planning an airplane from the beginning, saying, okay, we want to be in the air and these are the exact steps that we need to follow, and then just executed on it without testing, we would never have ended up where we are today. We would probably never have been as fast, and I would even make a bolder statement: we would probably still not be in the air if we had not experimented.

The next part I want to talk about is the importance of a vision. We've already heard quite a few times today about how you need to have a vision. But I want to be a bit more concrete here and ask: what is a vision? From my point of view, in order to formulate a vision you should ask yourself: what do I see this system doing, or this feature, or this product, whatever you are designing at the moment, in the future? Let's say two, three, four years from now. It is not about an exact feature definition; it can be fluffy, or it should even be fluffy, because if you try to be too precise, you will most likely fail. We're going to see why in a second. You usually formulate that vision in the group of people who are building the system or feature together. Then you share it with everyone who has a need to know, you get the first round of feedback, and you adjust your vision. So why should we even do this if it's not precise and it's fluffy? The reason is that it is great for aligning teams. For example, if the only vision for the first passenger jet had been "yeah, we should fly", some people would maybe have envisioned an airplane that does not transport passengers but rather goods. And others would maybe have envisioned an airplane with, I don't know, six wings. So aligning these thoughts makes a lot of sense. Another example, from my day-to-day development, is where we had an account activity service. We basically asked ourselves what this system could do. I will go into more detail on that later, because it is exactly one of the examples I am going to discuss.

So when you are developing with a vision, this is where you start: you're somewhere today and you envision a future. What will actually happen is that you will most certainly never end up exactly here. You will start iterating on your problems, you will start working, and you will notice that you either derail from your vision pretty quickly and end up somewhere here, or you stick to it for quite a long time, then derail from your vision and end up somewhere here. But you will most certainly not end up at the vision. And I think that is not the real task of a vision. The task of a vision is to align teams up front on what you envision a system to do, and thereby to set a rough timeline of what should be done.

Okay, now let's talk about the importance of data. I already mentioned that you're going to iterate and you're going to measure. Very often when I talk to friends or other people in the industry I hear: yeah, but I collect data. And my question is most of the time: really? Or do you just have some log file somewhere that no one really analyzes? Are you just logging a huge amount of stuff into some file, or do you have an actual plan? Planning how to analyze the system is, in my opinion, even more important than the actual functionality. And that immediately leads to the question: what kind of data do you need to collect? I think the minimum data for each system is system health, but that just tells you whether your system is working. What you need to think about for each system or feature that you're planning is: how do I measure customer success? How do I measure customer experience? It is super hard to measure, and it's individual to each company, team or service. For example, take this account activity system, and let's assume we envision a feature that should send out an email on login from a new device. So if there is some suspicious account activity, we send out an email. How can we measure the success of this? It is super hard, because you are triggering an event, and the resulting action might happen in another system: a user does not recognize this login, so they open the password reset flow and reset their password.

So this is one example of how you could measure customer success: how long is the time right now between a login to the account happening and the user resetting their password? That means a user initiating a password reset via your flow, not some fraud prevention system or anything else that could lead to a password reset. What you end up with is that you can measure this lead time, and that is, for example, a customer experience metric.
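The lead-time metric described here could be computed roughly as follows. This is only a minimal sketch: the event tuples, the `source` field and the function name `reset_lead_times_hours` are assumptions made for illustration, not the actual data model behind the talk.

```python
from datetime import datetime
from statistics import median


def reset_lead_times_hours(logins, resets):
    """Return hours from each new-device login to the user's next self-initiated reset.

    logins: list of (user_id, login_time) for logins from new devices.
    resets: list of (user_id, reset_time, source); only source == "user"
            counts, so resets triggered by e.g. fraud prevention are ignored.
    Both shapes are hypothetical and chosen for this sketch.
    """
    lead_times = []
    for user_id, login_time in logins:
        following = [
            reset_time
            for reset_user, reset_time, source in resets
            if reset_user == user_id and source == "user" and reset_time >= login_time
        ]
        if following:  # users who never reset contribute no lead time
            delta = min(following) - login_time
            lead_times.append(delta.total_seconds() / 3600)
    return lead_times


logins = [("alice", datetime(2020, 1, 1, 12)), ("bob", datetime(2020, 1, 2, 8))]
resets = [
    ("alice", datetime(2020, 1, 3, 12), "user"),
    ("bob", datetime(2020, 1, 2, 9), "fraud_prevention"),  # ignored: not user-initiated
    ("bob", datetime(2020, 1, 2, 20), "user"),
]
hours = reset_lead_times_hours(logins, resets)
print(hours, median(hours))
```

Filtering on who initiated the reset is the important part: without it, resets forced by other systems would make the metric look better or worse for reasons unrelated to the feature.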

Another big part in this space is data science. I think over the last five years in my job, data science became more and more important. That is probably just me growing; I don't think the space changed that much, but my understanding of it changed a lot. So to get started: you have data, you probably have a goal, you have the metrics, and now you need to formulate a hypothesis. We've already set up the example with the email and the lead time, so we say: if we enable this feature that sends emails on logins, we will decrease that lead time. You should phrase it in a way that is measurable. For example: right now it's 48 hours, or whatever time you are measuring, and in the future we plan to get this below 24 hours, meaning a 2x improvement. Then you can start developing this and you can start measuring. So the most important part here is really measure, measure, measure, and there can't be enough hypotheses around a system. I think the more you have and the more you adjust, the better it is. And of course, if you see hypotheses fail, or you fail to verify them, reflect on why and on what you can do better next time.
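A measurable hypothesis of this kind can be written down as data, so that checking it is mechanical. The following is a minimal sketch; the class name `Hypothesis` and its fields are hypothetical, and the 48-hour and 24-hour figures are the illustrative numbers from the talk, not real measurements.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    statement: str
    baseline_hours: float  # lead time measured before the change
    target_hours: float    # the measured lead time must fall below this

    def improvement_factor(self, measured_hours: float) -> float:
        """How many times shorter the measured lead time is than the baseline."""
        return self.baseline_hours / measured_hours

    def is_verified(self, measured_hours: float) -> bool:
        """The hypothesis holds only if the measurement is strictly below target."""
        return measured_hours < self.target_hours


h = Hypothesis(
    statement="Emails on new-device logins reduce the time to password reset",
    baseline_hours=48.0,
    target_hours=24.0,
)
print(h.is_verified(20.0), h.improvement_factor(24.0))
```

Writing the baseline and target down before development starts makes it harder to reinterpret the result afterwards.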

Now I want to outline two different approaches that we took in our day-to-day work. One is a really self-contained system which I already mentioned, this account activity system. Since we own the logins domain, it is only called by our own systems, and we can identify: what devices have we seen so far? What devices do we already know? What devices are new? We can define all of this within our systems. No one around us is affected, except maybe the password reset people, as they may get an increased load from this. We had no clear definition of what it should look like. We basically started with just a name, where we said: okay, this could be account activity, where we store activity per device. Then we started with a vision of where this could go. We envisioned a system that sends out emails on suspicious activity, and we also envisioned this data probably being available to the user somewhere. And we formulated hypotheses, as I've already mentioned.
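The question "which devices do we already know, and which are new?" can be sketched with an in-memory registry. This is an assumption-heavy illustration: `DeviceRegistry` and `record_login` are hypothetical names, and a real service would use a database and some device identification scheme that the talk does not describe.

```python
from collections import defaultdict


class DeviceRegistry:
    """Remembers which device identifiers have been seen for each user."""

    def __init__(self):
        self._seen = defaultdict(set)  # user_id -> set of known device ids

    def record_login(self, user_id, device_id):
        """Store the login and return True if the device is new for this user."""
        is_new = device_id not in self._seen[user_id]
        self._seen[user_id].add(device_id)
        return is_new


registry = DeviceRegistry()
print(registry.record_login("alice", "laptop"))  # first sighting of this device
print(registry.record_login("alice", "laptop"))  # same device again
```

Note that in this simplified form a user's very first login also counts as a new device; whether that should trigger an email is a product decision the sketch leaves open.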

What happened afterwards was that we formulated high-level tasks, like: okay, the system should have a database, the system should be integrated into our emailing service. Tasks on this level, not really low-level tasks. And then we just started executing. As my girlfriend always says, the best way of doing something is to simply start doing it. And that's exactly what we did here: within three hours we had a blank service in production, without any logic, that received 300 requests per second. It was not doing anything, but we could already see how many requests this system has to expect. So we were already gathering data at that point in time. Then we formulated hypotheses again, asking: okay, how many of these devices do we already know? And we slowly added the logic that we needed. For example, we usually start by testing features on employees, and this was one of them. And then, in the end, we just measured and adjusted; and adjusting here doesn't just mean adjusting the code, but also adjusting the vision and the high-level tasks that we had set. When we were three to four weeks in, we decided: okay, the system now looks solid enough to be rolled out to employees. What are the features that we definitely need to have in place before we roll this out to employees? What are the features that we need before we roll this out to our first market? And what are the features that we need before we roll this out globally? The system itself is still under development today, but as mentioned it has already drifted away from the initial vision.
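The rollout order described here (employees, then a first market, then globally) can be expressed as a simple gate. This is a minimal sketch under stated assumptions: the `Stage` enum, the `is_enabled` function and the market code `"SE"` are all hypothetical and not taken from the talk.

```python
from enum import IntEnum


class Stage(IntEnum):
    OFF = 0
    EMPLOYEES = 1
    FIRST_MARKET = 2
    GLOBAL = 3


def is_enabled(stage, is_employee, country, first_market="SE"):
    """Decide whether the feature is active for one user at the given stage.

    Each stage includes the audiences of all earlier stages.
    The default first_market value is a placeholder for illustration.
    """
    if stage >= Stage.GLOBAL:
        return True
    if stage >= Stage.FIRST_MARKET and country == first_market:
        return True
    return stage >= Stage.EMPLOYEES and is_employee


print(is_enabled(Stage.EMPLOYEES, is_employee=True, country="DE"))
print(is_enabled(Stage.FIRST_MARKET, is_employee=False, country="DE"))
```

Keeping the stage as a single ordered value makes it straightforward to move forward one step at a time and to measure after each step before widening the audience.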

The second approach is a system with external requirements. The only knowledge that we had was a rough deadline by which we needed to support this feature, and that it would be based on OpenID Connect, which is a protocol in the identity space. It is basically used to exchange identities between two internet services. For example, Google has an OpenID Connect endpoint, so you can integrate with it as an OpenID Connect client. Then you know who your caller is and can therefore create an account for that person and let that account use your platform. The requirements were really fluffy, because we basically only knew: okay, it's going to be based on OpenID Connect, but it's not really following all the standards, and there is still a lot of development going on on the other side. And we knew it would be a new login mechanism for Spotify.

So what we did is we asked ourselves: okay, where do we see this? Which teams are involved? How do they see this? And then we started with an aligned vision, meaning aligned across, I don't really remember concrete numbers, but probably around 10 to 20 teams within Spotify, so a lot of people. We brought them all together and asked: where do we envision this system, or this feature, in the future? Then we started with clean APIs to our systems, which we deployed early so that people could start developing from the other side. And we measured and adjusted again, as always.

One thing that I have left out of the whole talk so far is that sometimes you have a vision but you don't know where to actually start, and this is where discovery work comes in. Usually we try to send one or two engineers to do discovery work, meaning checking: which teams are involved? What are their requirements? What are their limitations? They then formulate these requirements in an internal request for comments, an RFC document, and everybody can share their opinion and their thoughts on a specific implementation. But again, if you already have a vision, you will naturally reduce the number of comments on an RFC, because most of the teams are already aligned: okay, this is roughly where we want to go, this is roughly where this team wants to go, and that's why this system probably looks the way it does.

So we worked on this, we measured, we adjusted, we adjusted the vision, we adjusted the discovery work, and we probably did some more discovery work during the project as it went on. And then we finally released it, and the success story here is that today we can add a new login mechanism to Spotify within a single day. This is just on the backend side, of course, but given that you have so many teams involved, doing such a big change in one day was something that made me quite proud of my team.