A Whole Team Approach To Testing In Continuous Delivery: One Tester's Journey

Continuous Delivery
UXDX Europe 2020

It’s not uncommon for teams moving towards continuous delivery to face a growing backlog of customer-reported bugs. It’s a challenge to maintain the cadence of frequent deployments. If a team has testers, they’re often expected to continue to do all the testing activities, without any thought as to how those fit into continuous delivery (CD) or continuous deployment (also CD). Teams without testing specialists often struggle with insufficient automated test coverage and inadequate exploratory testing.

In this session, Lisa shares her experiences with a team striving to deploy smaller changes more frequently. Her team’s challenges are common ones. They came up with some innovative experiments to find ways to deploy more frequently with confidence. Lisa will introduce some techniques to consider trying, including:

  • Visualizing deployment pipelines to shorten feedback loops and fit in testing activities
  • Using a test suite canvas to determine the minimum automated tests
  • Ways to analyze risks and determine next steps to mitigate them
  • Ways testing specialists contribute to team success

Hi, I'm Lisa Crispin. It's great to be able to join you here at UXDX 2020. Wonderful program, wonderful lineup of speakers and great participants and I'm honored to be here.
I'm going to share a story today of one of my journeys with a team that was trying to get to continuous delivery. And I've worked with quite a few people over the last couple of years to develop some of this material.
I've been lucky to work mostly on small cross-functional teams where we all pitched in on all the activities along that DevOps infinite loop for the past 20 years. And I've also been lucky to partner up with Janet Gregory to write some books about testing and agile development, and also to create video courses and live courses that can be offered in person or virtually. I've also done a course on test automation and DevOps for Test Automation University. If you haven't checked out Test Automation University yet, it's all free content, wonderful stuff.
If you're interested in the course Janet and I provide, you can go to agiletestingfellow.com and check that out.
I work remotely for OutSystems; my team is based in Portugal. I am in Vermont on my farm with my donkeys. So, in my spare time, I do a lot of driving and hiking with the donkeys.
I'd like to share some of my team's experiences in learning how to build quality into our products, to fit in all the testing activities, and to succeed with delivering small changes to production frequently at a sustainable pace, which is really the very definition of continuous delivery. Now, in my experience, you learn more from failure than from success, and so today I'm going to talk about a team that I was on a few years ago - a team that for several years anybody would have called a high-performing team.
Lots of awesome people on our team. Pairing a hundred percent of the time, doing test driven development, embracing all of the extreme programming practices, automating tests at all levels.
There was really a great culture of quality. Quality was highly valued. Testing was highly valued. We got the infrastructure in place and were able to move our application to be hosted in the cloud and get all our deployment pipelines going in the cloud, with a blue-green deploy where we could have two production instances and easily flip between them, so that we had more safety nets when we deployed to production.
The managers wanted to start trying to deploy more often. At the time, we were deploying just once every two weeks, I think, and sometimes that wasn't going well, for various reasons. It was a really complex code base, a single-page application with JavaScript, with Ruby on Rails in the back end, and it had to support lots of concurrent users. There was a lot of business logic in the UI. It was a pretty tough nut for automated regression tests and other things.
The idea was, "Let's try releasing twice a week so that we have smaller changes. They'd be less risky. We can easily revert our deploys now." That sounded really good, but I was kind of worried, given that I knew our history and some of the problems we'd had. But we did forge ahead.
We had literally thousands and thousands of automated regression tests at all levels. However, we still had a checklist of manual release regression checks that we were supposed to do before every deploy to production. Of course, the two testers on the team, out of 25 or 30 developers, always ended up getting stuck with this manual regression checking. I don't know if you've ever gone through a checklist like that, but as you're doing it, you start to notice things. "Oh, that doesn't look right. Now I have to go look and see if that's already on production or if it's a new bug."
Usually the things we found had nothing to do with the checklist itself but were other things that nobody had noticed, and it was a huge time suck. And of course, if we found something, we had to stop everything, get that fixed, redeploy, and start all over. It was terrible, and we had no time for really important human-centric testing activities like exploratory testing. We were really starting to get into trouble and we were, in fact, not deploying twice a week.
We were really having problems even deploying once a week or every two weeks. Kind of a mess. But this was a great team, and we were really good at reflecting on our problems together in retrospectives and experimenting with ways to fix those problems.
We decided the biggest problem was that we were not doing enough exploratory testing. We were having a lot of problems in our release candidates, and sometimes they were even getting out to production. Having the automated regression checks was nice, but that doesn't really test your new changes, and obviously, having two testers for that many developers wasn't scaling.
One of the first actions we took was that we testers said, "You know what, we're not doing this manual regression checking anymore. That's out. Nobody else wants to do it. We don't want to do it anymore." Guess what?
After we did that, no problems ever happened that would have been found on that checklist. Just saying. The development managers realized how important the exploratory testing was, and so they actually added exploratory testing skills to the skills matrix for developers, so that in order to proceed along their career path, they had to become competent at different exploratory testing skills. Then they asked us as testers to hold workshops and teach the whole team, including developers, product owners, customer support, designers, everybody, how to do exploratory testing.
And also, we started pairing with the developers as they wrote production code, so we could help them not only with their test design but also with doing some manual exploratory testing before they decided to check in their code. Another developer and I created a little exploratory testing checklist for developers. And again, this is just at the story level, where we're working on a very small story, we're doing test-driven development, automating tests for it at the unit level, the API level and, if it was a UI story, at the UI level as well, and doing this exploratory testing. The developers started to feel comfortable with these skills.
Of course, this is at a very narrow level, because our stories only took one or two days to finish. Each epic or new feature set was a whole lot of stories. So what we also started doing was writing exploratory testing charters at the epic level or feature level and putting those charters in the backlog along with the feature stories. We used Elisabeth Hendrickson's template for testing charters from her awesome book, Explore It! If you haven't read it yet, I would highly recommend it.
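The charter template in Explore It! has three slots: explore a target, with some resources, to discover some information. As a minimal sketch of how charters could be kept as backlog items, here is one way to model that in Python. The class name, the field names, and the sample charter are illustrative assumptions, not something the team in the talk is known to have used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Charter:
    """One exploratory testing charter (hypothetical model)."""
    target: str       # what to explore: a feature, epic, or area
    resources: str    # what to explore it with: tools, data, personas
    information: str  # what kind of information the session should find

    def render(self) -> str:
        # Template: Explore <target> with <resources> to discover <information>
        return (f"Explore {self.target} with {self.resources} "
                f"to discover {self.information}")


# Made-up example of an epic-level charter.
charter = Charter(
    target="bulk editing of stories",
    resources="two concurrent user sessions",
    information="conflicts when both users save the same story",
)
print(charter.render())
```

Keeping the three slots separate makes it harder to write a charter that names a target but forgets what the session is supposed to learn.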
We had all these charters in the backlog. Anybody on the team could start pitching in on doing these charters once we had enough stories done to make it possible to test at a more end-to-end level across the epic or feature. That was one piece of the puzzle. As we did that, the crazy problems that used to be found right before or after a release, unexpected impacts on other parts of the system from the new changes we were making, started to go down. We saw immediate changes with that.
One lesson that I like to emphasize, and this is something I learned from Janet Gregory: when you have a problem, make it visible, so that you can talk about it. I'm going to give a couple of different ways to help talk about fitting testing activities into continuous delivery using some visual models and frameworks. We know we have lots and lots of types of testing: security, accessibility, performance, exploratory, you name it, and testing at different levels as well. So how do we get all that done if we're trying to release every day or several times a day? I think it helps to start looking at the path to production that your new changes take, to see where that testing fits in.
This is a technique I picked up from Abby Bangser and Ashley Hunsberger a couple of years ago. For these visuals, back in the days when we could be physically co-located, we could just use index cards on a table or stickies on a wall, but this is really easy to do with online collaborative tools like Google Jamboard or Mural if your teams are still remote.
So, this is just an example of a visual. To start making improvements, start mapping out your pipeline. Even if you don't have anything automated yet, your code is still going through a bunch of steps and stages to get to production. Whether they're automated or not, things have to happen.
Visualizing that is basically like a value stream map. You could just do a value stream map for it: Where are the bottlenecks? Where are things waiting on handoffs? Where are the dependencies? Where do we see things slowing down? How can we start, step by step, addressing those things?
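The value stream questions above can also be asked of a simple written-down model of the pipeline. The following is a rough sketch only: the stage names and every number are invented for illustration, and a real map would come from the team's own measurements.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    work_minutes: float   # time spent actively doing the step
    wait_minutes: float   # time the change sits idle before the step starts
    manual: bool = False  # True for human-centered steps


# Hypothetical path to production; all values are made up.
pipeline = [
    Stage("unit tests", work_minutes=10, wait_minutes=0),
    Stage("UI regression suite", work_minutes=45, wait_minutes=5),
    Stage("exploratory testing", work_minutes=60, wait_minutes=240, manual=True),
    Stage("deploy to production", work_minutes=15, wait_minutes=30),
]


def lead_time(stages):
    """Total elapsed minutes from commit to production."""
    return sum(s.work_minutes + s.wait_minutes for s in stages)


def longest_wait(stages):
    """The stage where a change waits longest before work starts."""
    return max(stages, key=lambda s: s.wait_minutes)


def flow_efficiency(stages):
    """Share of the lead time spent on actual work rather than waiting."""
    return sum(s.work_minutes for s in stages) / lead_time(stages)


print(f"lead time: {lead_time(pipeline)} minutes")
print(f"longest wait is before: {longest_wait(pipeline).name}")
print(f"flow efficiency: {flow_efficiency(pipeline):.0%}")
```

With numbers like these, the handoff before the manual step dominates the lead time, which is the kind of question worth taking back to the team.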
In the example on the slide, I show the manual steps in yellow, because yes, those manual steps, or human-centered testing as I like to call it, are still part of our path to production. They're still part of our deployment pipeline. They're just not automated. We have to take that into account, and so what we may be able to do is have feature flags that let us keep features hidden in production until we've finished the testing and feel confident about it. So, with feature toggles, we can do dark launches, progressive rollouts, and testing in production by turning a feature on just for ourselves.
There are lots and lots of options here, because these types of testing are still really important in many domains. There are a lot of big enterprise companies that, because of their industry domain, still need to do user acceptance testing and things like accessibility testing. We still don't really have the tools to automate all of that testing. Even security testing sometimes, and definitely exploratory testing, is going to have a human component to it.
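To make the feature toggle idea concrete, here is a small sketch of a flag that supports a dark launch (on only for named users) and a progressive rollout (on for a percentage of users). The class, the flag name, and the user ids are hypothetical; real teams typically use a feature flag service or library rather than hand-rolled code like this.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    name: str
    rollout_percent: int = 0                          # 0 means dark launch
    allowed_users: set = field(default_factory=set)   # e.g. the team itself

    def _bucket(self, user_id: str) -> int:
        # Stable bucket from 0 to 99, so a user always gets the same answer.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100

    def is_enabled(self, user_id: str) -> bool:
        if user_id in self.allowed_users:
            return True
        return self._bucket(user_id) < self.rollout_percent


# Dark launch: deployed to production, visible only to the team.
flag = FeatureFlag("new-editor", rollout_percent=0, allowed_users={"team-tester"})
print(flag.is_enabled("team-tester"))  # the team can test in production
print(flag.is_enabled("customer-1"))   # everyone else still sees the old behavior
```

Raising `rollout_percent` step by step turns the same flag into a progressive rollout, and setting it back to 0 hides the feature again without a redeploy.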
Start laying these out. Have a meeting, take an hour, get your team together, and see what your pipeline looks like. Now, I'm telling you to do this, but of course I could never get my teams to get together and do this. I know it's sad, but I could do it myself, or I could do it with another tester or just one other person on my team. By doing that, I start gathering a whole lot of questions, and they're really good questions, and now I can take those questions to my team: "Hey, this test suite is always flaky," or "Hey, this stage doesn't really need to be in this production deployment pipeline because we've already tested that in another pipeline to a test environment."
I could take all these questions to the team, and we did make a lot of improvements. A lot of our problem was that our pipeline was too slow. So, we were always looking for ways to speed it up, to speed up that feedback loop, to be able to deploy to production faster in case we did have a problem and wanted to get a hotfix out.
Of course, I mentioned flaky automated tests. Think about your own team's automated test suites, if you have some. The team I've been talking about relied heavily on automated regression tests and performance tests. We had thousands and thousands of tests, and things had improved with the exploratory testing, but we were still getting some regression failures in existing functionality in production. New changes were breaking features that customers were already using. We started looking at our tests and we really couldn't trust them. We had a lot of flaky tests. We clearly didn't have the test coverage that we needed.
One of the things that helped me here was using Ashley Hunsberger's test suite canvas, which she modeled on Katrina Clokie's canvas. If you haven't read Katrina's book, ‘A Practical Guide to Testing in DevOps’, which is available on Leanpub, that's another one I highly recommend.
This canvas is just a framework to help your team talk about your automated test suites, or perhaps ones you don't have automated yet, and ask important questions about them, to make sure that these tests run reliably and run fast, and that when there are failures, somebody looks at those failures. Make sure they're tests that you can trust. I know you probably can't read this, but you can certainly download it from Ashley's GitHub repo and print it out, or not print it out and stick it on Google Drive or some other online collaborative document to get people talking. Some of my favorite questions from it are: ‘What information does this test suite provide, and to whom?’, ‘How do they get that information?’, ‘Do they get a message on Slack?’, ‘Do they get an email?’, ‘How does that work?’, ‘How will we know when a test fails, and who's going to take responsibility for making sure that failure gets addressed?’, ‘Are you doing pairing on test automation?’, ‘Are you doing code reviews on your automated test code?’
A lot of important questions here that we may not think about otherwise. So, again, even if you can't talk about this together with your whole team, thinking about it on your own or with a subset of your team is still going to bring up important questions that you can address, and you can start improving those automated test suites step by step.
Now, I keep mentioning how the team had thousands of automated tests from the unit level up to the UI level, but we were still having regression failures. We started looking at the UI-level regression tests and, guess what, they had been created by the developers who were writing the production code. They were doing a great job of writing the production code most of the time, but the test code, not so much. I mean, some of these tests didn't actually have assertions.
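To illustrate why a test without assertions gives false confidence, here is a small invented example. The function, its bug, and the tiny runner are all hypothetical and stand in for real production code and a real test framework.

```python
def cart_total(prices, discount_percent):
    """Hypothetical production code with a deliberate bug."""
    subtotal = sum(prices)
    return subtotal + subtotal * discount_percent / 100  # bug: should subtract


def check_total_without_assertion():
    # Runs the code but checks nothing, so it can only fail on an exception.
    cart_total([10, 20], discount_percent=10)


def check_total_with_assertion():
    # States the expected result, so the bug above is caught.
    assert cart_total([10, 20], discount_percent=10) == 27


def passes(check):
    """Tiny stand-in for a test runner: True if the check raises nothing."""
    try:
        check()
    except AssertionError:
        return False
    return True


print(passes(check_total_without_assertion))  # stays green despite the bug
print(passes(check_total_with_assertion))     # goes red and exposes the bug
```

Both checks exercise the same code path, but only the second one can ever report a wrong result.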
So, like, what were they testing? They were mostly happy paths. In some domains it could be okay to just have happy path tests at the unit level or the UI level because you've covered it so well lower down. But in this case, we had a lot of complex logic at the UI level, so we couldn't really test all that lower down. What did we do? We got a cross-functional group of people from the team together.
It would've been too much to try to get the whole team together. There were about 10 people: developers, product owner, designer, manager, testers, customer support.
One of the senior developers who knew the front-end architecture drew it on the board, and we started identifying the risky areas and marking those in red. Of course, this is something that's also easy to do in an online collaborative tool. This let us prioritize what we needed to make sure that our tests covered.
What do we absolutely need automated regression testing for? Then we could go back to the existing tests, see what was covered already, and see what we needed to add. We made a commitment to start using a new framework for our tests and to use a page object pattern to make better UI-level tests that would be easier to maintain. We agreed that as people wrote new code or went back and changed code, they would refactor the old tests or write new tests in a better way, so that they were more maintainable and more dependable. And again, we started seeing results as we refactored our tests and added new ones. We had visibility into what we needed.
Now, there are a lot of techniques to surface risks and assumptions that people are not really thinking about, and these are just a few examples of ways you can do it.
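As an aside on the page object pattern mentioned above: the idea is that one class per page owns the selectors and interactions, so tests read as user actions and a UI change is fixed in one place. The sketch below uses a fake browser so that it runs anywhere; the selectors, class names, and login behavior are invented, and a real suite would wrap a browser automation library instead.

```python
class FakeBrowser:
    """Stand-in for a real browser driver so the sketch runs anywhere."""

    def __init__(self):
        self.fields = {}
        self.text = {"#banner": ""}

    def fill(self, selector, value):
        self.fields[selector] = value

    def click(self, selector):
        # Pretend application behavior for the login button only.
        if selector == "#login-button":
            ok = self.fields.get("#password") == "secret"
            user = self.fields.get("#username", "")
            self.text["#banner"] = f"Welcome, {user}" if ok else "Login failed"

    def read(self, selector):
        return self.text[selector]


class LoginPage:
    """Page object: the only place that knows the login page's selectors."""

    USERNAME = "#username"
    PASSWORD = "#password"
    SUBMIT = "#login-button"
    BANNER = "#banner"

    def __init__(self, browser):
        self.browser = browser

    def log_in(self, username, password):
        self.browser.fill(self.USERNAME, username)
        self.browser.fill(self.PASSWORD, password)
        self.browser.click(self.SUBMIT)
        return self

    def banner_text(self):
        return self.browser.read(self.BANNER)


# Tests talk to the page object, never to raw selectors.
page = LoginPage(FakeBrowser())
assert page.log_in("dana", "secret").banner_text() == "Welcome, dana"
assert page.log_in("dana", "wrong").banner_text() == "Login failed"
```

If the login button's selector changes, only `LoginPage.SUBMIT` needs updating, which is what makes tests written this way easier to maintain.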
I like mind maps. I like good old-fashioned risk analysis, probability times impact. There's RiskStorming, which is available online now. You can Google that, and that's an awesome thing. One thing we do where I work at OutSystems is that, before big efforts, we do RiskStorming with the whole team.
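The probability-times-impact analysis is simple arithmetic, and a short sketch shows how it turns a discussion into a ranked list. The 1 to 5 scales, the risks, and the ratings below are all invented for illustration; a team would agree on its own scales and scores.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str
    probability: int  # 1 (unlikely) to 5 (almost certain)
    impact: int       # 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        return self.probability * self.impact


# Invented risks and ratings.
risks = [
    Risk("UI regression in a feature customers already use", 4, 4),
    Risk("external API starts returning errors", 2, 5),
    Risk("typo on the settings page", 3, 1),
]

# Highest score first: these are the risks to mitigate first.
for risk in sorted(risks, key=lambda r: r.score, reverse=True):
    print(f"{risk.score:>2}  {risk.description}")
```

The scores themselves matter less than the conversation about why one person rates a risk a 2 and another rates it a 5.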
These are really important things to help you think about how you are going to mitigate those risks. Automated tests might be one way. There are a lot of other ways; you just have to plan for that. We can do all this wonderful testing in advance, and we should do all the testing we can do in advance, but what if we have an unexpected load on the system, somebody accidentally drops a table in production or releases a configuration change and it's not good, or we're using some external API that suddenly starts returning errors? Teams now have complex distributed systems in production, and we can't even replicate those in a test environment. We can't test all those scenarios, even if we think of them, and we don't think of a lot of them. We can't put in logging and an alert if we don't know what's going to happen.
So, what do we do? This is where my team, and again, this is three or four years back, was just starting to learn about new things. At that time, observability was a pretty new thing, and we read this blog post from Cindy Sridharan, who is @copyconstruct on Twitter, and it was like, "Whoa, eye opening." We need to make sure that we capture the information we need, because what we ended up having to do all the time was, "Oh, we had a 500 error. Something crashed. We don't know why. Customers are complaining, but we can't tell in Splunk what happened. Let's add some log data and wait for it to happen again." And so these things ended up taking days or weeks.
It's crazy. But what if we instrumented our code in a really smart way, with structured events with high cardinality and high-quality data, so that we can go in and ask questions of our system in production about things we didn't expect in advance? I am still just learning about observability. I am no expert, but we immediately started to see that we need more data. We need to be looking at all kinds of data. Even looking at analytics data in Mixpanel was helpful to us. So, we started really paying a lot more attention to what was happening in production and finding ways to be able to assess our work quickly.
Building quality in is easy for me to say, and it really has to be a whole-team responsibility.
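Going back to the structured events with high-cardinality data described above, here is a minimal sketch of the idea: emit one wide event per request, with fields such as a request id and user id, so that unplanned questions can be answered later without adding log lines and waiting for the failure to recur. The field names and values are assumptions for illustration, and the events are printed as JSON lines rather than sent to any particular tool.

```python
import json
import time
import uuid


def build_event(user_id, endpoint, status, duration_ms, **extra):
    """Build one wide, structured event for a single request."""
    event = {
        "timestamp": time.time(),
        "request_id": str(uuid.uuid4()),  # high cardinality: unique per request
        "user_id": user_id,               # high cardinality: one value per user
        "endpoint": endpoint,
        "status": status,
        "duration_ms": duration_ms,
    }
    event.update(extra)  # any other context that might matter later
    return event


def emit(event):
    # One JSON object per line is easy for log tooling to parse and query.
    line = json.dumps(event, sort_keys=True)
    print(line)
    return line


# Two made-up requests, one of which failed.
events = [
    build_event("user-17", "/stories/42", 200, 35.2, build="abc123"),
    build_event("user-98", "/stories/42", 500, 912.0, build="abc123", story_count=1800),
]
for e in events:
    emit(e)

# A question nobody planned for: which users hit a 500, and on which build?
failures = [(e["user_id"], e["build"]) for e in events if e["status"] == 500]
print(failures)
```

Because every event carries its own context, the question about the 500 error can be asked after the fact instead of after the next crash.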
I think this is really what resonated so much with me when I joined my first extreme programming team back in 2000: it was about the whole team being concerned for quality and testing, and involving the customer in that discussion and in that responsibility as well. It's really easy for me to say that. It's like "Mom and apple pie."
How does it really work? Over the past several years, I've been really interested in unconscious bias for a whole range of reasons, and we do have scientific data that shows that companies with more diversity are more innovative and make more money. I have an unscientific theory that when we have a diverse group of people with different backgrounds, different skills and specialties, and different experiences, maybe it helps us offset our unconscious bias. Maybe we have different ones, or just by getting together, we can help each other notice more. And so we get this diverse group together, saying, "Okay, here's the level of quality we want. We're going to make an absolute commitment to that, and whatever gets in our way, we're going to find a way around it." I found that making that commitment is the thing you have to do. If you're just waving your hands and saying, "Oh yeah, we want quality," then as soon as you run into a roadblock, you're going to say, "Well, let's just go ahead and hack that in and we'll look later at how to automate tests for it, or we'll look later at how to instrument that code," or whatever it is.
It's really, really important to get your team together, have these conversations, make that commitment, and make it mean something. If you're going to have a whole team responsible for testing and quality, well, what about the people who don't have a lot of testing experience or testing skills? This is where we can transfer our skills in a lot of different ways. As testers, we need to focus on being consultants. If you look at modern testing, at moderntesting.org, I think those are Alan Page and Brent Jensen's Modern Testing Principles.
Janet had been kind of saying this for years too, but they've really found a good way to say that testers really need to step up and coach the team, be a consultant for the team. And this is what I've tried to do over the past several years. I can't test everything for the team, but I can help people learn how to test by pairing. I really like strong-style pairing for this, where we have the driver and navigator roles and switch them every few minutes, or ramping that up into mob programming and mob testing. I like the term ensemble programming and testing for that better nowadays. These are great ways, because with an ensemble you can get everybody you need: product owner, designer, tester, customer support, developers, architect, whoever. Get those people together working on something, and then whenever you have a question, you've probably got the person right there to answer it right away. That's the quickest feedback loop you can have, with no waiting around to find somebody or to get on Slack and get them to answer your question.
Actually, one of the ways I used this in the team I've been talking about was this. Again, the developers were pairing, and we were trying to do really small learning releases and MVPs, getting just a thin slice out and getting feedback on it. So as the developers got enough stories for a feature together, four or five, and we had a little piece ready, we would get the product owner, the designer, a customer support person, a tester and the developers in a room, and for 30 minutes we would go through all those stories. We didn't do traditional driver-navigator roles.
We switched on each story, and so everybody was making suggestions on what to test, and if we had a question, the person was right there in the room to answer it: "Oh, is this icon in the right place? Oh, is this the error handling we really want?" The product owner, the designer, everybody's there, and the customer support person is there to say, "Hey, red flag. There's something here that customers are really not going to like, or there's something they really need." And so within 30 minutes we could accept or reject all these stories. That's a great example of the power of collaborating on different types of testing.
How do you do this? Well, we have to build relationships and remember we're humans first. It's all about getting good people and helping each other do good work. Most of us are still remote, and so it's a little harder to do, but my team at my current job has a WhatsApp channel where we just share what we're doing on vacation or some great meal we just made or some crafty thing we just made, and it gets us bonding at a more human level. Offer help, ask for help. Asking for help is one of the best ways to make a friend.
Try to connect with people. I started this new job back at the end of March, just as everything was in chaos from the pandemic and everybody was working from home and wasn't used to that, and so I've taken advantage of every opportunity I can to build relationships, like having virtual coffees with people. My team has a virtual lunch every week. We just had a party last week on our monitoring and observability team because we finished our rollout of the last alarm for the monitoring. We had an alarm party. These things are all really important, and when we build these relationships, we can have a better environment.
We know, again from science, that psychological safety is a prerequisite to team success, and a lot of people think that's just unicorn-land. I know there are lots of companies that don't have this, the vast majority, but there are more and more who do. When we have leaders with a vision who know how to serve and support people and get out of their way when they should, to let them do their best work, which is what Agile has always been about, that's when the magic happens. If you've never been on one of these unicorn teams, I feel bad, and I hope you will be able to be on one, because I think if you haven't been on one, you don't really know what it feels like. And it isn't easy to get there. It's not going to happen overnight. It's going to be a journey of months or years, but if you keep committing as a team, you're going to get there. You have to focus on quality first, doing things the right way. That'll pay off later. You'll be able to go faster later. But there are a lot of things you need to learn, like the business domain, so that you can always be doing them in the amount you need to.
I really like to use the principles of continuous delivery to guide our work, and I think they're really in line with the principles of a lot of other things. It's a good way to mitigate risks and put more joy into your work. Those are things you can't do overnight. Don't try to boil the ocean; just take one small step at a time. Have your retrospectives, identify your biggest problem, and design some small experiment, with some way to measure it, that can be done in a short period of time to try to make that problem a little bit better. That's what my team did, and we eventually did get to where we could pretty comfortably deploy code to production twice a week without any fires in production.
I've given you some visual models to use and a visual conversation framework to use. Look for more that work for your team, and have these conversations. Talk about quality.
I'm always available for questions on Twitter and email. I look forward to maybe getting to talk with some of you, and I will make my slides available. I've got some resources that you can use to learn more.
Thank you very much.