Improving Website Performance by 11x

Knowledge / Inspiration


Continuous Delivery
UXDX EMEA 2022

Modern websites love JavaScript, but it is the number one performance killer today. In this talk, Taylor will go through a case study on improving the performance of the kroger.com website (one of America's largest supermarket chains) by 11x, from an order time of 3 minutes 44 seconds down to 20 seconds.

  • The value of performance
  • JavaScript budgets
  • Perceived performance
  • Design tradeoffs
  • SPA versus MPA
  • Lessons learned

Hello, this talk is about a drop-in replacement front end for kroger.com where the time from first load to checkout was 11 times faster than the existing front end. Here it is on the left, next to the existing site on the right. These are screen recordings from $40 Android phones over simulated in-store cellular data; more on that later. The front ends are racing to search for eggs, add them to the cart, and then check out. The 11 times speedup is that whole journey, not just the initial load. My demo ran in a prod bucket in front of real APIs, through Akamai, over the real internet, and it was also responsive. In contrast, this is actually one of the flattering recordings of the existing site; in a lot of takes I interacted too quickly and broke the interface before it was ready.

My front end never shipped, but it got close enough that UXDX thought its story was valuable to learn from. I certainly learned a lot from it. In fact, I learned enough that this talk will have to go quickly, so don't be afraid to use that pause button.

Website performance is like cars: to be 11 times faster you're going to need to optimize many things. Cramming a V12 into a minivan won't make it race-worthy, and swapping a site's JavaScript framework won't magically make it fast. You need to scrutinize everything from kilobytes hitting the network to pixels hitting the glass.

The core idea is simple: don't make humans wait for websites. That much is true of all performance advice. But when you try quantifying it, it all goes to hell. You've all seen stats like these; there are hundreds of them in the tech industry. We can't just make software fast because it respects users or makes the world better. We need to prove it to managers, who see performance as just another thing developers want at the expense of making money, like refactoring, 100% type coverage, or other code virtue signals. This is odd, because the math says improving performance is one of the surest returns on investment a website can get. Speed is the king feature because it improves all your other features; no feature exists until it finishes loading. At least, that's what the math says.

But if it's such a sure investment, how come we're so bad at making fast websites? Why do we keep getting slower, and out of proportion to most users' networks and devices? Why do we set new devs up to fail with React by default, when keeping React fast is not a beginner topic? These are simple questions, but they don't have simple answers, because if they did we'd have solved our performance problems by now. So instead, let's answer an easier question: what happened when I tried changing kroger.com to take seconds instead of minutes?

Kroger is the biggest grocery chain you've never heard of, because it has a ton of sub-brands, but it sits at number 17 on the US Fortune 500. So kroger.com was in the bigger leagues of web dev, where nobody shuts up about scale. It was a React single-page app that was really slow. Still is, according to PageSpeed Insights. To be fair to my former employer, Walmart's and Target's sites are also React and also really slow.

I started learning web development from influences like Rachel Andrew and Jeremy Keith, who championed reach as the web's greatest strength. As far as I was concerned, an ideal website turns away as few users as possible, regardless of their browser, device, disabilities, or network. As you might imagine, there was a culture shock when I was hired to work on a giant React single-page app. I frequently disagreed with co-workers.
One of those times was when Tailwind CSS slowed our desktop time to first paint by half a second. I complained that that kind of impact meant we shouldn't use it. I was countered with a fair question: half a second sounded bad, but what did it actually mean to us? I didn't know. So a co-worker and I founded our performance team to find out. We'd seen those stats about how faster sites improve business metrics, and we wanted those numbers, but for our site. In theory, you can find them by improving performance and then measuring how the metrics change. Unfortunately, that's a catch-22: you need a large speed improvement to get statistical significance, but you need statistically significant numbers to get approval to work on large speed improvements. So instead, use others' homework to get a ballpark number, substitute your own average order value or whatever, then extrapolate. You won't get precise numbers, but that's fine.
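That extrapolation is simple arithmetic. Here's a back-of-the-envelope sketch of it; every input below is a made-up placeholder, not a real Kroger figure, and the uplift number stands in for whatever published study you borrow.

```js
// Hypothetical extrapolation from a published "faster load => more revenue" study.
// All inputs are illustrative placeholders.
const yearlyRevenue = 1_000_000_000;  // your own top-line number
const upliftPer100ms = 0.01;          // e.g. "1% more revenue per 100 ms saved", from someone else's homework
const revenuePerMs = (yearlyRevenue * upliftPer100ms) / 100;

console.log(revenuePerMs); // 100000 — i.e. $100k of yearly revenue per millisecond in this made-up case
                           // (our real kroger.com estimate came out around $40,000 per ms)
```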
Precise numbers constantly change anyway. Our kroger.com estimate was $40,000 of yearly revenue per millisecond of load time. Then the pandemic surged our usage, since buying groceries in public was suddenly a bad idea; whatever this number is today, it's probably way higher. You should recalculate after changes like that, but even then, don't consider these numbers the limits of what performance can achieve.

Time doesn't have a linear relationship with dollars. The closer the wait time gets to zero, the more users change their behaviour. The reason this graph puts quotes around "engagement" is that your fastest performance timings are the rarest ones, as shown by the other line, and they often come from users who aren't really using the product in the first place. You know how an empty view loads much faster than a useful one? It may be better to represent that part as a weird quantum range where the data can't predict how performance improvements change metrics, especially if you move the boundary of the best possible speed. Unshackling speed for users not only changes their behaviour but opens your site up to new worlds. For example, YouTube once shipped a 90% smaller watch page and their average performance metrics tanked, because users with terrible connections could finally start watching videos. Minor performance improvements pay for themselves, but meaningful speed-ups make your site orders of magnitude more useful by finding new users who do things you can't anticipate.

But back at Kroger, we were having a hard enough time with incremental performance improvements. People said they cared, but the performance tickets always got outprioritized, speed-ups got hoarded in case a feature needed to be rammed through the bundle check, and even the smallest developer-experience improvement always seemed to outweigh what it cost our users. Maybe proving that speed equalled money wasn't enough; we also had to convince people emotionally, to show everyone how much better our site could be if it were fast. So, in a bit of bad judgment, I vowed to make the fastest possible version of kroger.com.

How fast is possible? This is the fastest web page. You may not like it, but this is what peak performance looks like. It's not a useful web page, but it does show that the fastest you can be is an HTTP response that fits in one round trip. I aimed to be as fast as however many bytes fit into a response in that first round trip. I once trusted that 14-kilobyte number you may also have heard; turns out the HTTPS handshake takes several round trips by itself, so that no longer helps. I needed a more real-world performance budget.

Luckily, I found a post about real-world performance budgets from a former Google performance lead. He advocated picking a target device and network and then finding out how much data they can handle in five seconds. For maximum relevance, I chose Kroger's best-selling phone, the Hot Pepper Poblano; its specs might have been good once. My target network was cellular data as filtered through our metal buildings; walking around with a network analyser told me it resembled WebPageTest's Slow 3G preset. The Poblano and the target network resulted in a budget of about 150 kilobytes (sketched below), which was not bad. The problem: kroger.com's third-party JavaScript totalled 367 kilobytes. My boss told me in no uncertain terms which scripts I couldn't get rid of. So after further bargaining and compromises, my front-end code had to fit into 20 kilobytes, which is less than half the size of React. But Preact is famously small. Why not try that?
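Before moving on, here's that 150-kilobyte budget arithmetic as a rough sketch. The network numbers are WebPageTest's Slow 3G preset rather than my exact in-store measurements, and the round-trip count is an assumption, so treat the output as a ballpark.

```js
// Ballpark "how many bytes fit in five seconds" math for a real-world budget.
// Numbers are assumptions, not measurements.
const budgetSeconds = 5;                 // target: usable in five seconds
const rttSeconds = 0.4;                  // Slow 3G-ish round-trip time
const setupRoundTrips = 4;               // DNS + TCP + TLS + request, roughly
const bytesPerSecond = (400 * 1000) / 8; // 400 Kbps ≈ 50 KB/s of bandwidth

const secondsLeftForTransfer = budgetSeconds - setupRoundTrips * rttSeconds;
const wireBudgetKB = (secondsLeftForTransfer * bytesPerSecond) / 1000;

console.log(Math.round(wireBudgetKB)); // ≈ 170 KB on the wire, before the phone spends
                                       // time parsing and running it — which is how
                                       // you end up nearer 150 KB in practice.
```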
A quick size estimate of Preact seemed promising. The Preact ecosystem had a client-side router and a state-manager integration I could use, leaving about five kilobytes left over. But that's not all the JavaScript a single-page app needs. There was more code I knew I would need eventually: translating UI state to JSON and back again; something like react-helmet, but not the preact-helmet package, because that one is four kilobytes for some reason; "this app just updated, please refresh" notices; something like the webpack module runtime, but hopefully not actually the webpack module runtime; re-implementing browser features lost within the page-navigation lifecycle. And if any of you have wired analytics into a single-page app, you know they don't work out of the box. I don't have size estimates for these, because by then I had already abandoned the single-page-app approach. Why? I didn't want a toy site that was fast only because it ignored a real site's responsibilities. I ranked a grocery commerce site's responsibilities as security, even over accessibility; accessibility, even over speed; and speed, even over slickness.
I refused to compromise on security or accessibility; I didn't want any speed-ups that conflicted with them. And the first conflict was security. It's not fundamentally different between multi-page apps and single-page apps; both need to protect against the same known exploits. But for multi-page apps, almost all of that code lives on the server, while single-page apps also need it on the client. Take anti-cross-site-request-forgery, where authenticity tokens are attached to HTTP requests: in multi-page apps that means a hidden input in form submissions, but single-page apps need additional JavaScript for the client-side details. Not much, but repeat that for authentication, escaping, session revocation, and other security features. I did have five kilobytes left over, so I could have spent them on security: discouraging, but not impossible to work around.

Speaking of impossible to work around: unlike security, client-side routing has accessibility problems exclusive to it. First, you must add code to restore the accessibility of built-in page navigation. Again doable, but it means some more JavaScript, usually a library, and downloaded JavaScript is downloaded JavaScript all the same. Worse, some SPA accessibility problems can't be fixed. There's a proposed standard to fix them, so once that's implemented and all assistive software has caught up, supporting them won't be a problem anymore. Don't hold your breath. But let's say we mitigate all that. Sure, it sounds difficult, but theoretically it can be done by adding client-side JavaScript. So we're back to my original problem.

Beyond the inexorable gravity of client-side JavaScript, single-page apps have other performance downsides. Memory leaks are inevitable, but they rarely matter in multi-page apps; in single-page apps, one team's leak ruins the rest of the session. JavaScript-initiated requests have lower network priority than requests from links and forms, which even affects how the operating system prioritizes your app over other programs. Lastly, server code can be measured, scaled, and optimized until you know it's fast enough; you can't do the same for client devices. This is a chart of processing speed across iPhones, flagship Androids, budget Androids, and low-end Androids, in descending order. The web's diversity makes "fast enough" impossible to know on the client side: devices are unboundedly bad, with decade-old chips and miserly RAM in new phones. The Poblano was Kroger's bestseller two years ago and it still is, even as the two cheaper categories at the bottom got significantly faster. Would you bet against a cheaper, slower phone popping up a third time? Trick question: the Poblano already scores below that fourth line.

All of that was enough for me to abandon a single-page app. So did I doom my site to feel clunky and unappealing? Chrome joined other browsers in 2019 when it shipped paint holding, which eliminates the dreaded white-page flash between navigations. I figured if I could send pages quickly, interactions would seem smooth and not jarring. If I inlined CSS and sent HTML as fast as possible, the overhead would be negligible compared to the network round trip; concatenating strings on a server really shouldn't be the bottleneck. We've been able to do that in under a few milliseconds for 20 years.

But there was one problem with quickly generating HTML. Like many large companies, kroger.com's pages were built from multiple data sources, each with its own team's speed and reliability. If these 10 data sources each take one API call, what are the odds my server can respond quickly?
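Here's the arithmetic behind the answer that follows, as a quick sketch; it assumes each data source is slow independently of the others.

```js
// Chance that at least one of N independent backend calls is slow.
const pSlowPerCall = 0.01;  // 1% of data responses are slow
const callsPerPage = 10;
const pagesPerSession = 8;

const pSlowPage = 1 - (1 - pSlowPerCall) ** callsPerPage;
const pSlowSession = 1 - (1 - pSlowPage) ** pagesPerSession;

console.log(pSlowPage.toFixed(3));    // ≈ 0.096 — roughly 1 in 10 page loads
console.log(pSlowSession.toFixed(3)); // ≈ 0.552 — better than a coin flip per session
```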
The odds are pretty bad. If 1% of all data responses are slow, then a page with 10 backend sources will be slow about 9.5% of the time. And sessions load more than just one view: if a session is eight pages, that original 1% chance compounds into a better-than-even chance of hitting at least one slow page, which is even worse than it sounds. A one-time delay has a chilling effect on the rest of the session, even if everything afterwards is fast. So I needed to prevent individual data sources from delaying the rest of the page. I suspect this problem alone may be why so many big sites choose single-page apps. But that chilling effect also suggests that a single-page app's slow up-front load doesn't really fix the problem.

We had fast websites from big companies before we had single-page apps, so there's no way this was a new problem. I vaguely remembered early performance pioneers saying browsers can display a page as it's generated, but I couldn't remember what that was called. Turns out that's because everyone calls it something different. Regardless of the name, this is a technique Google Search and Amazon have used since the 90s, and it's even more efficient in HTTP/2 and HTTP/3, so it's here to stay.

Now let me show you what HTML streaming is. These pages both show search results in five seconds, but they sure don't feel the same. Beyond the obvious perceived improvement, streaming also lets browsers get a head start on downloading page assets, doesn't block interactivity like hydration does, and doesn't need to block or break when JavaScript does. It's also more efficient for servers to generate. Clearly, I wanted HTML streaming. But how do you do it?

Today, popular JS frameworks are buzzing about streaming. At the time, I could only find older platforms like PHP or Rails that mentioned it, none of which were approved technologies at Kroger. Eventually, I found an old GitHub repo comparing templating languages that had two streaming candidates: the doubly-deprecated Dust, and something I had never heard of, but at least it wasn't Dust. Disclaimer: I now work on the Marko team.
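Since the slide code isn't reproduced in this transcript, here's a rough reconstruction of the kind of Marko template described next. It's a sketch, not the actual slide code: the component names are made up, it assumes the route handler passes two promises in as input, and Marko's own docs are the authority on the current `<await>` syntax.

```marko
<!-- search-page.marko — an illustrative sketch, not the code from the slides.
     Assumes the route handler passes in input.searchResults and
     input.recommendations as promises. -->
<main>
  <h1>Results for ${input.query}</h1>

  <!-- Streams result markup as soon as the server has it. -->
  <await(input.searchResults)>
    <@then|products|>
      <for|product| of=products>
        <product-tile product=product/>
      </for>
    </@then>
  </await>

  <!-- Recommendations are nice-to-have: give them 50 ms, and if they miss it,
       client-reorder lets them stream in later without blocking the page. -->
  <await(input.recommendations) timeout=50 client-reorder>
    <@placeholder>Loading recommendations…</@placeholder>
    <@then|recs|>
      <recommendation-row recommendations=recs/>
    </@then>
    <@catch|err|>
      <!-- Not fatal: the page is still useful without recommendations. -->
    </@catch>
  </await>
</main>
```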
Okay, so Marko could stream; that was a good start. Its client-side component runtime was half my budget, but it's optional by default, so I didn't have to use it. Where's the streaming, though? Buried in the API docs, I found it. Code like this produces search results like the ones before: it uses HTTP's built-in streaming to send a page in order, as the server generates it. But await had even more tricks up its sleeve. Let's say fetching recommended products is usually fast, but sometimes it hiccups. If you know how much money those recommendations make, you can fine-tune a timeout so that their performance cost never exceeds that revenue. But does the user really have to get nothing if a request was unlucky enough to take 51 milliseconds? The client-reorder attribute turns an await into an HTML fragment that doesn't block the rest of the page and can render out of order. This requires JavaScript, so you can weigh the trade-offs of using it versus a timeout with no fallback. Client reordering is probably a good idea on a product detail page, but on a dedicated recommendations page you'd want it to always work, even for users without JavaScript. Await had already sold me, but Marko had another killer feature for my goal: automatic component islanding. Only the components that could actually re-render dynamically on the client would add their JavaScript to the bundle.

Right. I had my goal, my theory, and a framework designed to help with both. Now I had to write the components, style the design, and build the features. Even with a solid technical foundation, I found the devil in the details. At first, I thought I'd copy the existing site, but that UI and its interactions were designed for different priorities, so I had to suck it up and redesign. I'm no designer, but I had a secret weapon: users think faster sites are better designed and easier to use. My other secret weapon is that I like CSS. The bonus of me doing both is that I could rapidly weigh pros and cons, exploring alternatives that better balanced user experience and speed, or even alternatives that improved both.

My design priorities from before still held: this website sells food, so I would ruthlessly sacrifice slickness for access. My login page on the left is no prize winner, but you've got to admit it has something over the real one on the right. Another problem: our product carousels didn't fit the Poblano's screen, and because they took up so much real estate, trying to scroll past them would scroll-trap me like a bad Google Maps embed. Their dimensions also took up so much space that they caused juddering from layout cost and GPU pressure. So I went boring and simple, relying on text's horizontal nature and some enticing "see more" links instead of infinite scrolling. Then an easy choice: no web fonts. They were 14 kilobytes; I couldn't afford them. That's not the right choice for all sites, but remember: groceries. This industry historically prefers effective typography over beautiful typography.
Unlike the decisions that felt like compromises, banning modals and their cousins was a user-experience improvement. Do you like pop-ups and their other annoying friends? They also make less sense on small screens; in particular, modals that take up nearly an entire page might as well be their own page. Lastly, they're hard to make accessible. The code required to do so really adds up: check out the sizes of some popular packages, each a reasonable choice for tooltips, modals, and toasts respectively. The alternative widgets weren't as flashy and were sometimes harder to design, but great design is all about constraints, right? That's how I justified it.

This led to a nice payoff: fast page loads can lead your design somewhere better. Our existing checkout flow had expanding accordions, intertwined error states, and tricky focus management because of the first two. To avoid all that, I broke the checkout into a series of small, quick pages. It didn't take long to code, and it was surprisingly easy. The UX results were even better: with paint holding, a full page navigation doesn't have to feel heavyweight. I could do all this easily because I was simultaneously the designer and the developer, which used to be a lot more common. Web designers were expected to know HTML and CSS, and an intuitive understanding of what's easy versus what's hard in web code can go a long way.

So after all that, want to see it for yourself? Now, is this comparison truly fair? No, because the Android app got to skip its download from the app store. Amazon also keeps trying to move off jQuery, or so I hear. Jokes aside, could my demo have kept this speed in the real world? I think so. It hadn't yet withstood ongoing feature development, but far bigger and more complex software successfully uses regression tracking to withstand that, such as web browsers themselves. Later features that could live on other pages wouldn't slow down the ones seen here, thanks to the nature of multi-page apps. And this recording also lacks performance tuning I ideally wanted, like edge rendering and faster HTTPS. Even if it got twice as slow, our $40,000-per-millisecond figure from before would estimate this kind of speed equals another $40 million of yearly revenue, assuming an 11-times speedup wouldn't change user behaviour. And you know it would.

But enough about me; let's talk about you. Do you compete with Amazon? Do you want the web to compete with native? Do you have the responsibility, or the business opportunity, to serve the most users possible, even on horrible networks and devices? At the very least, do you like the sound of adding millions in revenue? If you can say yes to any of that, then this video should represent something important: because great performance is exceptional in this industry, you can become exceptional with great performance. And if you want to be that exceptional, here's how. There are only three rules, but they could use some unravelling.

The first: aim to be noticeably faster in ways that matter to humans. Fitting into 20 kilobytes for my first load was just ensuring the rest of my grocery-buying flow was set up for success; more code loaded later as it was needed. Metrics such as time to first byte and first meaningful paint are for diagnosis, not meaningful payoffs, and a 20% sooner time to interactive is just maintenance. If you want the speed I showed under the circumstances I targeted, I can guarantee Marko's Rollup integration has it. But your needs may differ.
Don't pick a technology until it meets your goals on user hardware. I get asked whether other technologies can be as fast as what I used, and the bad news is I don't know. The good news is that if you set important goals and follow through with them on real hardware, you'll inevitably find technologies that are fast enough. To minimize throwing away work, rough estimates like the ones I've been showing are useful for avoiding dead ends. Looking at the performance characteristics of older, well-known technologies will also help guide what you can accomplish: we knew how to make fast websites decades ago, and while many things have changed, just as many haven't. You probably won't need my 20-kilobyte limit, but it's not as unrealistic as you might think; modern websites can accomplish a lot in similar amounts. Google suggested a budget of only 30 kilobytes for feature phones, for example. In fact, I doubt you risk being too ambitious. Cars aren't engineered for only dry roads under good conditions; ambitious performance goals benefit everyone, all the time.
Usable speeds in the worst-case scenario translate to even faster, more consistent experiences the rest of the time. Remember, we're doing this for humans, and now is an unprecedented time for low-income connectivity. Getting online used to require stable housing, but nowadays you can be on the real internet, with a real browser, for almost nothing. Correspondingly, smart devices and the internet have become important tools no matter your status. If you were homeless, why wouldn't you want a way to reach important contacts, apply for jobs, and talk to other people? You might also appreciate being able to at least browse products online just to cut down on time spent in a grocery store, especially in the midst of a contagious disease you can't afford to catch.

That's why it's important to verify you can serve those people by checking your assumptions about the devices they can use. Tools like Google Lighthouse are wonderful and you should use them, but if your choices aren't also informed by hardware that represents your users, you're just posturing. The Poblano's stated 1.1 gigahertz isn't fast to begin with, but it gets worse: if it ran at that speed all the time, it would produce the heat of a 40-watt incandescent bulb, and because phones don't have fans to dissipate that kind of heat, they throttle themselves when they start overheating. This happens with every phone, not just the cheap ones. And that's just the CPU. Your time to respond to user interaction may be fine in the abstract, but how does it feel combined with the delay of a cheap touchscreen? Your company probably doesn't sell phones, so you can't do what I did; if you really don't have any hints for how low your user base can go, you can at least check which cheap Androids Amazon sells a lot of. Also, the network throttling in Lighthouse and browser dev tools is better than nothing, but it's doomed to be much more optimistic than actual slow networks; you'll need something that throttles at the packet level. Packet-level throttling is more work to set up, but it's the only way to be accurate. Here are some programs that can do it; worst-case scenario, there's always some command-line program you can search Stack Overflow to figure out.

Now, if you can avoid third parties entirely, perfect: do that. It would have made my life 130 kilobytes easier. But absolutism won't help when businesses insist, and if we don't make third-party scripts our problem, they become the user's problem. It's tempting to view third-party scripts as not your department, or as a necessary evil you can't fight. But you have to fight to eke out even acceptable performance, as evidenced by my devoting 87% of my budget to code that did nothing for users. First, apply engineering principles. Marketing revenue is one thing, but decreasing its cost is surprisingly low-hanging fruit. Marketing people aren't stupid, but they have different incentives, and they're missing the incentive to care about the partnership after the contract window ends. Write down which third parties are on your site and for how long, and like all accounting, track their actual payouts, not just the projected estimates they approached you with. It can be as fancy as automated alerts or as simple as a spreadsheet. Even a little bookkeeping can avoid money sinks, for example TikTok's affiliate program. There are people with entire jobs dedicated to convincing you that their JavaScript snippet is free money.
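As a sketch of that bookkeeping, with every number and script name invented for illustration (these are not Kroger's figures), and reusing the revenue-per-millisecond estimate from earlier:

```js
// Toy ledger for third-party scripts: does each one still pay for its weight?
const revenuePerMsPerYear = 40_000; // from the earlier revenue-per-millisecond estimate

const thirdParties = [
  // added ms measured on your target device, the vendor's projected payout, and what you actually measured
  { name: "heatmaps",  addedMs: 180, projectedPerYear: 2_000_000, measuredPerYear: 150_000 },
  { name: "affiliate", addedMs: 60,  projectedPerYear: 5_000_000, measuredPerYear: 3_200_000 },
];

for (const script of thirdParties) {
  const cost = script.addedMs * revenuePerMsPerYear;
  const net = script.measuredPerYear - cost;
  console.log(`${script.name}: net ${net} per year (vendor claimed ${script.projectedPerYear})`);
}
// In this made-up ledger, "heatmaps" is a multi-million-dollar sink while "affiliate" still earns its keep.
```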
But remember that advertisers lie for a living. Loading their proposed JavaScript on the most powerful computer I had access to proved the juice wasn't worth the squeeze. TikTok isn't unusual in that regard: third-party code offers massive convenience in exchange for a lot of risk, and with increased legal scrutiny of just how profitable selling other people's data is, weighing that risk is a skill you may want to practice.

The third parties you can't avoid, you can still compromise on, to find happier mediums between marketing income and user penalties. The image-pixel alternative is usually warned to be only 90% accurate, and the payout is similarly cut, but a single non-blocking HTTP request has almost no overhead compared to yet more JavaScript. A lot of analytics providers have also started offering server-side usage, because ad blockers and other user-privacy tools have hurt their measurements. I've also heard good things about Partytown; it wasn't ready at the time, but that was two years ago.

Worse than the performance penalties of analytics is when their blind spots lead to plans based on a distorted view of the world. Measuring accurate, valid data is hard. People get degrees in statistics because statistics are tricky; much of the field is learning how to avoid lying to ourselves with numbers. When it comes to web performance, I worry that we grapple with one of the strongest cases of survivorship bias. Getting even a rough number for how much market there was for my prototype was bizarrely difficult. It wasn't that I rejected data that didn't support my hypothesis; it was that the data was either untrustworthy or completely absent. Our analytics-reported bounce rate for older Android versions varied by up to 10% between days. That kind of variance isn't just impossible to use; it indicated something was fundamentally wrong. Then I tried asking our Dynatrace folks how many HTTP requests never made it to a fully JavaScript-initialized session; that discrepancy could show how many users left before our site got its act together. Unfortunately, a week later they told me Dynatrace intentionally discards that information. This sort of thing happened a lot. Now, these instances don't prove that there were a huge number of folks trying to use our slow website and leaving frustrated, but a hypothesis that survives several attempts at refutation is at least an interesting possibility. It's wild to me how many decisions are made using analytics software that doesn't even estimate how much it can't tell you. Is it that hard to believe we might have a blind spot? Our industry trends rich, white, and male, so there's already a bias to compensate for, and if our stats come from JavaScript that must download, parse, and then upload to record a user, it's not surprising we'd miss the users we were too slow to serve. As unusual and hardline as the rules of meaningful web performance may seem,
the strangest part is how normal accomplishing them can be. To make my front end 11 times faster, I picked a software platform and then wrote business and UI code inside it that mingled back-end data with links and buttons; in other words, normal web development. I didn't even touch the back end. The actual hard part seems to be keeping consistent priorities across the platform, the component code, the design, and the product decisions. As much performance knowledge as this project required, I wonder how much more knowledge of a different kind it would take to ship it intact so it could actually improve users' lives. Luckily, if you're attending UXDX, that kind of knowledge is likely your job. I wish I had more useful advice on processes that produce performance, but that would be disingenuous: my thing never shipped, so I haven't found a process that works. Which isn't right. Since then I keep questioning why, if my performance case was so good, it didn't ship. Truthfully, I don't know; I'm not sure any of our proposals ever got specific rejections. But even if I knew, there's a more important question: the technology is here and the users are out there, so what's stopping you?