Azure Load Testing - Load test your APIs and Web Applications with ease

Load testing is an important QA step in any application development lifecycle - Azure Load Testing gives you a managed service to orchestrate this testing with ease

Speaker B:

Hello and welcome to the Let's Talk...

Speaker A:

Azure Podcast with your hosts, Sam Foot and Alan Armstrong.

Speaker B:

If you're new here, we're a pair of Azure and Office 365 focused IT security professionals. Each episode, we talk about a specific topic in the space. This week, it's episode 17 of season three. We're going to have a chat around Azure Load Testing, a managed cloud service which streamlines your load testing processes. Hey, Alan, how's your week been?

Speaker A:

Hey, Sam. It's not been too bad. Busy as always. How about you?

Speaker B:

Yeah, no, good. I'm very excited to talk about load testing in Azure. It's something that I've had the pleasure of having to orchestrate in the past, so this tool should help a lot of people out in that regard.

Speaker A:

Cool. I guess let's crack on then. So what is load testing? Why do companies need to perform it?

Speaker B:

Okay, so let's talk about the development process. Let's imagine that we're a fictitious company with a SaaS platform that we develop. Our platform might have loads of users, which is a great thing for the company, and we might have spikes in traffic over time. There might be certain times of the year where traffic gets higher. We might be an ecommerce website, so Christmas or Black Friday really impacts the number of users interacting with our service. More consistent load is kind of easier to understand; what we need to understand is how our application reacts in those spiky situations.

So let's say we go live with our Black Friday deal and it's a really big success, and 10,000 people hit your website in one go. Your website might be driven by APIs, and those APIs might then start to struggle. We've obviously got Azure and the cloud to help us with resource. So instead of a more traditional approach, where we would deploy our own web server and size it for what we think it could handle on our most spiky days, we could ramp up our resources a couple of days ahead of a Black Friday deal going live. Then we know we've got more vertical resource: more processing power, more memory, better networking and things like that. It's not just the web server side, either. You have loads of other connected systems - caching layers, database servers, other APIs you might pull from, maybe third party APIs that you integrate with. All of those things might get overworked and overstressed and cause a degradation in your application's performance.

So load testing, the simple version of it, is effectively pretending to throw loads of users at your application over a certain time period. A typical load testing platform lets you record your steps as a test plan: go to this page, log in, then access pages X, Y and Z, with some example user data in there. Once that test has run, you analyze the results. Traditionally that might be an actual application that you run on one of your developers' machines, maybe even locally. In the past you might also have been able to get away with running it directly on your web server, or bringing up another server to run it against. And that's still possible to do.

Azure Load Testing wraps around Apache JMeter. You can install JMeter on any machine that supports Java - version eight, I think - record your test plan, and run it against a system. That's exactly what you're trying to do in this scenario: work out how your application is going to react given the infrastructure it's on, but also the code that you've written, which is usually more of the challenge. As user numbers increase and the app gets larger or more complex, you can start to hit weird performance edge cases that you don't see until you get really high usage: memory leaks, garbage collection, the releasing of memory. You might find that your application is slow to release memory, so you effectively use up all the memory on your instance, and then your instance just completely grinds to a halt.
So you would load test ahead of time to try and work out some of those spikes. The other side of it is that as you make changes to your application - let's say I ask you, Alan, to build a new shopping cart for our ecommerce website - we might want to understand what performance impact that's going to have on our users, but also on our web infrastructure; I'll call it web infrastructure because it's not just servers. Has Alan accidentally caused an unknown performance regression because, for some reason, he's not scoped a database query correctly or something like that? A really common human mistake. But how do you catch those regressions before you get to those critical points in your seasonality? Because we don't want to find out about that three minutes after the Black Friday deal's gone live, do we? We want to understand it ahead of time. So as part of your QA process, if you're building what I'll call web systems - API-driven experiences, more traditional web applications, anything that effectively uses web requests - you want to understand ahead of time how that load impacts your environment. Okay, cool.

Speaker A:

Just to sort of summarize, from what I understood of it: we're running load tests against the application or APIs to understand whether the current infrastructure can take that load, as well as testing the application's efficiency, I guess, when it's under load - whether, like you said with memory and things like that, it's releasing it in an appropriate time. I'm thinking of some scenarios as well. You said Black Friday; I'm thinking of things like, in the UK, the day when you find out which school your children are going to, and there's a mad rush from everyone in the UK trying to find out what school they're going to - that's kind of a once a year thing as well. And maybe even, thinking about it, when a new NFT needs minting, maybe there's a load test there or something like that. I mean, that's going slightly less corporate.

Speaker B:

But yeah, it also doesn't need to be a specific time, because you could have an application where you've just got a high number of daily active users - you might just have a really popular service - or conversely, you might be trying to run something on really small SKUs to save money. Think about burstable instances, which have CPU credits that you burn and regenerate over a given time. At 50 daily active users it might not be spiky and it might be absolutely fine, but once you get to 75 daily active users, you might start to see your credits decrease and basically go to zero. You could exhaust all your credits, and what happens at that point is your instance becomes something like one tenth of the performance - you get like 0.1 of a vCore or something like that, I can't remember the exact number. So you don't ever want to exhaust your credits. Well, you shouldn't really be using burstable in production, but startups do, so you might be trying to balance that as well. Because what we see a lot of is people over-provisioning instances in Azure, because they're still of the mindset of: I'm not going to scale this instance. But in my past I've worked with startups that are cash strapped - every pound you can save counts, that's the way they usually start. If we could load test and prove to the business that we could support 20 daily active users with only the lowest D series instance, as an example, then that's a net positive for the business, and you just scale as you grow.

Speaker A:

So I guess this would also help with testing out your configuration of virtual machine scale sets, if you're using IaaS at that point to build your back end, or maybe even front end - so you can see that the scaling is happening quickly enough, or even scaling back in within a reasonable time as well.

Speaker B:

Yeah, well, you can effectively target any web system that you've got connectivity to, right? So it doesn't really matter whether the underlying platform is PaaS or IaaS - I just use those two examples because that's where you control the code, basically. Usually these performance regressions are caused by humans, right? A lot of the time it's incorrectly optimized caching or database calls, or other independent systems, or inefficient processing and things like that. You can have systems that run background jobs - you could have a background job scheduler that you accidentally start running on your web servers instead of your background server, right? That could regress, and then you could find that at weird times of the day it starts to have a performance impact. Because there are so many variables when you're building software, those regressions could come from anywhere.

Speaker A:

Cool. So do you see organizations doing this all the time, doing load testing? Or do you think there are reasons for not doing it - is it too complicated to do, too time consuming, things like that?

Speaker B:

I'm going to say the challenge with any part of the software development lifecycle is that it has to be justified as to why it should be included, if you're running a really efficient process. I've used the startup example, but even larger organizations have bootstrapped teams inside of them that act like startups - teams inside really large organizations that are given very little resource to innovate and to build, because that's a very efficient way to build. Lots of people are trying to build a minimum viable product and iterate from there. So what can sometimes happen is, because the orchestration of load testing is a challenge, you can get ad hoc load testing happening, but then you're not getting continuous load testing that's happening all the time, looking for these regressions. So any way that you can accelerate the integration of load testing into the development workflow is going to give it a higher likelihood of actually being adopted.

Speaker A:

Cool. Okay, thanks for the overview of the sort of load testing process. So can you give us a high-level overview of the features of Azure Load Testing?

Speaker B:

Okay, so as I mentioned previously, Azure Load Testing is a fully managed testing service, operated by Microsoft in Azure on your behalf. There are two ways to configure it. I'll talk about the quick test way first: you can effectively give it a URL to load test, and then - I won't go into the specific settings - you tell it how many users you want, the ramp-up period, how long you want to test for, those sorts of metrics. Like: I want 50 users to hit this over an X period of time, and I want it to ramp up like this. Do I want all 50 to arrive at once, or do I want them to come in every 5 seconds? Really dependent on your use case. And really simply, you can just give it a single URL. You can't chain URLs together, but you can have a test for each URL if you wanted to, so you could have a test for each key API call in your application. It's a very simple URL-based test, and it'll give you the response times and latency figures for that specific URL.

Taking that one step further - and this is how it integrates with an existing product, Apache JMeter - you create a script out of the back end of JMeter, upload it into the load testing service, and it will run through that script. I'm really simplifying it, but the script gives you the steps that it's going to go through: I'm going to hit web page one, web page two, API call three, API call four. It goes through those steps and measures the response time and the latency of each one for you. In JMeter, you can also bring in CSV files with extra data to parameterize your load testing, so you can populate different calls with different data; you can upload those CSV files and reference them. You can upload plugins too - there are JMeter plugins, which are .jar binary files, little Java components basically, that can be referenced in your scripts. So if you've got JMeter plugins that you've developed, or you're using somebody else's, that's also supported. There is ARM support, so you can orchestrate it that way, but it can all be done in the portal - click, click, click, really simplistic for you there.

Then you can layer on top of that. You upload your script and you're ready to go, sort of thing. You then get to pick what your test criteria are. You could say something like: I don't want the response time of any of my web calls to be more than a second from my API service. Forget the user's internet connection, because this is as close to the service as you can possibly get, right? We're not talking about transit time to the actual user; what we're asking is, can the API - or the website, web service, API - react to me in under a second? You could say, actually, for a good experience in my app, my users will wait 3 seconds, so you could optimize for kind of a worst case if you wanted to. But if you wanted really high performance, you might say, actually, I want everything to return in 100 milliseconds.
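As an illustration of the criteria Sam describes here, a JMeter-based test in Azure Load Testing can carry its pass/fail rules in a YAML test configuration file. This is a minimal sketch; the test identifier, file names and thresholds are hypothetical:

```yaml
# Minimal Azure Load Testing configuration (illustrative values).
version: v0.1
testId: homepage-load-test          # hypothetical test identifier
displayName: Homepage load test
description: Virtual users ramping up against the home page
testPlan: homepage.jmx              # JMeter script exported from the JMeter GUI
engineInstances: 1                  # number of test engines running the script
failureCriteria:
  - avg(response_time_ms) > 1000    # fail if the average response exceeds 1 second
  - percentage(error) > 5           # fail if more than 5% of requests error
```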
And you can do that, and you can put some fancy maths around it. You can say things like: tell me when the worst one is over that threshold, or average the runs over a certain amount - take all of the web requests and average them out, get the mean or the median across them, to decide whether it's a pass or a fail. It might be that some are 2 seconds and some are half a second, and as long as the general user experience is fine, that passes. So you can do things like that.

What you can also do is pass secrets and environment variables into that script. You can define all of those, hard coded, in the portal if you want. And you can also integrate it with a Key Vault: you can give the Azure Load Testing service a managed identity, use that for RBAC on a Key Vault, and bring those secrets in, so the people operating the load testing service don't need direct access to those secrets if you want to keep that away from them, which is great. You can also test private endpoints with it by attaching it to a VNet, and there's a service tag for firewall rules in order to let that traffic in, which is very good to see. There's an RBAC model inside the Azure Load Testing service as well, so you can delegate permissions to your team. Everything is Microsoft managed and encrypted throughout the whole process, but you can provide your own customer-managed keys as well, if that's a compliance requirement for your industry. And then the other part you configure on the actual test itself is the number of engines you want to run - basically how many threads you want to run at one time. You can configure that in quite a detailed way.

So once you've done all that, you're sort of ready to go - and when I say all that, I have also talked about extras like VNets and bringing in secrets; to get a simple JMeter script running, it's a few clicks, it's relatively simple. Let's just say for the moment that you run the test manually in the portal. What's great is the analytics and the dashboards that come out of the other side of it. You get the graphs which show you all of the outputs: what's your average response time, what's your P95 response time across all of your calls. And you can see that as the test runs through, because in theory you could be running hundreds of user tests here - it might not be until the 40th user that you start to have issues, so you can start to see how that load affects things. But what I think is really important is that you get the raw output from the requests and the responses - the latency and the response time of the requests you've sent - and then you can overlay server-side metrics on those graphs as well and bring them into the analytics. Let's say you're on IaaS: you could bring in the number of credits you've got left on your burstable instances, you could bring in your CPU usage - and we've now got the available memory bytes metric in preview for IaaS, haven't we? Things like that. You can bring those metrics in and overlay them on the same timeline. So you could watch your available memory dropping as you add more users.
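Secrets, environment variables and data files plug into that same configuration file. A sketch of what that might look like, assuming a hypothetical Key Vault called myvault and a CSV file the JMeter script references:

```yaml
# Extending the test configuration with data files, environment
# variables and Key Vault-backed secrets (all names hypothetical).
version: v0.1
testId: checkout-load-test
testPlan: checkout.jmx
engineInstances: 2                  # scale out the load generators
configurationFiles:
  - users.csv                       # CSV data the JMeter script parameterizes from
env:
  - name: webapp_host               # plain environment variable for the script
    value: myapp-staging.azurewebsites.net
secrets:
  - name: api_token                 # resolved at run time via the service's managed identity
    value: https://myvault.vault.azure.net/secrets/api-token
```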
So then you could say: well, we didn't exhaust all our memory, so maybe we didn't hit garbage collection yet - or you can see how quickly garbage collection kicks in for .NET processes to release that memory over time as we've used it. That is immensely powerful, because you're seeing - well, not in real time, because you get the test result afterwards - but you can see the timeline of activity, which then relates to server resources on the other side of it. And I believe you can pull basically any metric from Azure Monitor in over that time period as well, so that's really powerful.

The other part of that is you can then compare test runs. You've got test run A - let's say you're testing burstable instances, so you make that run. Then you flip your IaaS to the next burstable SKU, or a completely different SKU, maybe you change it to F series - whatever you do, you could then rerun your test and look at the differences between those two test runs. Did adding more performance to certain parts of your application actually net your users more performance in real-world use? Traditionally, if you ran a load testing tool locally, you'd get an output - maybe a CSV, an XML file or an HTML report - but comparing runs against each other is a lot more challenging to achieve.

And the last bit I want to talk about - because I've talked for a long time about features - is the DevOps integration, because it's probably one of the key parts: being able to add this into your DevOps pipeline, for both GitHub and Azure DevOps Pipelines. That's the way you say it, isn't it, Azure DevOps Pipelines? Yeah, sorry, I just call them Pipelines, I can never remember the full name. But you can integrate this into your DevOps CI/CD process. So you could, in theory, reject a pull request that fails your test criteria, in an automated fashion. You don't have to trigger the tests manually; you can trigger them when your pull request deploys to your staging environment or a development environment - bring that up in your process and then test it with the Azure Load Testing service to make sure that you haven't regressed.
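For the CI/CD side, Microsoft publishes a GitHub Action and an Azure DevOps task for the service. A hedged sketch of a workflow that runs a load test on every pull request - the resource names, file paths and secret name are assumptions:

```yaml
# Hypothetical GitHub Actions workflow: run an Azure Load Testing
# test against a staging deployment whenever a pull request opens.
name: load-test-pr
on: pull_request
jobs:
  load-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}     # service principal credentials
      - name: Run load test
        uses: azure/load-testing@v1
        with:
          loadTestConfigFile: loadtest/homepage.yaml  # test config like the one shown earlier
          loadTestResource: my-loadtest-resource      # Azure Load Testing resource name
          resourceGroup: my-rg
```

If the failureCriteria in the configuration aren't met, the action fails, which in turn fails the pull request check.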

Speaker A:

Wow. I mean, it sounds insane how large this product is. Yeah, it's crazy. And it's good to see you can use things like Key Vault to keep the secrets. I'm guessing, where you're talking about the metrics and things like that, you get them from, in effect, Insights, I guess, from your IaaS per se, if it's Azure Monitor and VM...

Speaker B:

Metrics, VM...

Speaker A:

Insights.

Speaker B:

Yeah.

Speaker A:

Does that mean that, to collect the data from the test, it stores it in Azure Monitor, do you know? Is that how it stores that data, or is it just in the service?

Speaker B:

I'm not sure what the underlying service is that stores it. My thought process is it's Azure Monitor with Application Insights, a layer on top of Monitor. I'm guessing it's that with a custom solution on top of it, to be totally honest with you.

Speaker A:

Okay, that's cool. Okay, so you mentioned it utilizes JMeter scripts. Is that another product? Because I've never heard of that.

Speaker B:

Yes. So JMeter is a solution that's part of the Apache Foundation - it's an open source load testing application. So if you're currently utilizing JMeter, then this service is going to be pretty perfect for you. You export your scripts, and you might have to do a bit of configuration to bind to secrets and values and things like that, but effectively you keep your core JMeter scripts. Microsoft isn't building - hasn't built - a new way to run these tests. So you could still run your JMeter scripts locally if you wanted to, because if you're a developer running them locally on your machine, you probably know what good looks like there. If you've got a really powerful machine and your average response time is 80 milliseconds locally - maybe even less than that, a couple of milliseconds - and then you make a code change and it's suddenly 2 seconds, you've potentially regressed. It might not be massively long in absolute terms, but as a percentage, you've regressed a huge amount.

But just to quickly talk about JMeter: it's a Java app, and you can run it in two modes, GUI and CLI mode. The GUI is really where you define your test plans. You can also use a proxy inside a web browser to actually record the steps that you go through, which is pretty powerful - so you can either build the plan manually or record it. Then you save your JMeter script and run it with the CLI; you don't run the actual test with the GUI. Out of the back end of a run you get a CSV, XML or HTML report that you can then use, and you can integrate a backend listener to get real-time results as well. But it's all pretty manual from that perspective. You could also run it yourself - you could build your own CI runner, you could have it on your own infrastructure. You do have to do a bit of configuration for JMeter, because it can utilize a lot of memory, so there are heap allocations that you have to think about, and the infrastructure that you run it on. The Azure Load Testing service just completely takes that away: you upload your script and it deals with it. It's pretty much as simple as that.
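For reference, the local CLI run Sam mentions looks roughly like this (file paths illustrative); the same .jmx file is what you would upload to Azure Load Testing:

```sh
# Run a JMeter test plan in non-GUI mode, log results to CSV,
# and generate an HTML dashboard report at the end of the run.
jmeter -n -t testplan.jmx -l results.csv -e -o ./report
```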

Speaker A:

Basically, yeah. Okay, so running it locally you can do, but maybe if you're testing thousands or hundreds of thousands of users hitting your application or service, that might be slightly difficult to run locally - unless you've got pools of servers to run the runners and things like that.

Speaker B:

You would have to provision that hardware. Right, yeah. And make sure it's sized correctly with the network that's required for it as well.

Speaker A:

Yeah, absolutely. But Azure Load Testing is basically doing that for you. When we get onto pricing, it'll be that you pay for the test period, kind of thing - which probably ties us into: what is the pricing, how is it billed?

Speaker B:

Okay, so there's a sort of - I don't know what we call it - a base cost; is that the best way to explain it? You've got a monthly cost for the service. Didn't we have this last episode with Azure Firewall? Yeah, Azure Firewall is a standard price per month, isn't it, just to have the service. I'm going to do dollars, because that's a little bit easier for a lot of people to understand. So to provision a load testing resource is $10 per month, which isn't particularly expensive, which is great, and that includes 50 virtual user hours per month.

Okay, so let's talk about virtual user hours and what they are. We could potentially run a load test with a number of virtual users, right? Because just testing one user accessing a system might not be that great a load test. So I'll use the example of 1,000 virtual users, and let's say each one of our tests was a minute long, and we ran 30 of them - easy maths, one every day for a month. You would then be charged for 500 virtual user hours, because each one of those virtual users would accumulate approximately 30 minutes over the month, and 1,000 users times half an hour gets you your 500 virtual user hours. We get 50 included as part of the service, and on top of that you pay per virtual user hour, segmented into ten-second chunks - they add up all the 10-second increments to work out your virtual user hours. The calculation is the number of users being simulated, times the duration of the simulation, measured down to every 10 seconds - so if your runs were 1 second each, you would lose out a little, because the minimum billed time is 10 seconds - and then you convert that into hours, basically, by dividing the minutes by 60.

So if we work that out on top of the $10 per month: for those thousand users, we take off the 50 hours that are included - 500 minus 50 takes us down to 450, simple maths, but I'm doing it on a calculator to make it easy - then we multiply 450 by 0.15, because it's fifteen cents per hour, and that costs $67.50. Plus the $10 for the month gets you to $77.50 for a month to simulate 1,000 users doing 30 minutes of testing. But you don't manage the infrastructure - Microsoft does all of that for you - and you get all the benefit of being able to easily do the VNet private endpoints, getting all the metrics out of the back end of it, analyzing those metrics. So in theory you're getting better value and features from the service on top of just the actual testing.

And that's $77 a month for 1,000 users using your application for 30 minutes a month - your load test might not even last anywhere near that long. You might not run tests every single time you deploy; you might just have a CI/CD process that runs a load test once every two weeks. You might not do it on every pull request if you don't want to, so you can tune it from that perspective. You might say, okay, we'll do a benchmark run every two weeks, and then we could have a two-week regression window; but as long as we make sure we run a load test before Black Friday, we can be pretty safe in the knowledge that we haven't screwed up and regressed on our performance. So you have complete control over the frequency and the duration that you run those tests.
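The arithmetic in Sam's example, written out (prices as quoted in the episode):

```python
# Back-of-the-envelope Azure Load Testing cost for 1,000 virtual users
# running one-minute tests daily for a month (prices as quoted above).
BASE_COST = 10.00       # USD per month for the load testing resource
INCLUDED_VUH = 50       # virtual user hours included each month
PRICE_PER_VUH = 0.15    # USD per additional virtual user hour

virtual_users = 1000
minutes_per_user = 30   # 30 one-minute runs over the month

vuh = virtual_users * minutes_per_user / 60     # 500.0 virtual user hours
overage = max(0.0, vuh - INCLUDED_VUH)          # 450.0 billable hours
total = BASE_COST + overage * PRICE_PER_VUH
print(f"{vuh:.0f} VUH, {overage:.0f} billable, total ${total:.2f}")
# -> 500 VUH, 450 billable, total $77.50
```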

Speaker A:

It kind of sounds like a no brainer, in the sense that if you've got to build compute somewhere to do this testing - potentially it could be in Azure, it could be IaaS - if you're going to run JMeter there yourself to run against your app, even just your time to set it up and everything is going to cost you more than just uploading your script into the service and letting it run. And like you said, it's all the integrations as well, into DevOps. If your load testing solution is on premise - not necessarily DevOps on premise, but your load testing solution - it's how you then connect things like that. You're in effect saying this is almost one-click connections, kind of thing. I'm sure it's a little bit more than that, but generally it's going to be simplified.

Speaker B:

Yeah. And if you think about the infrastructure that's required as well, and the orchestration and management of that infrastructure, taking your focus away from your core effort of actually developing software - in my opinion, if you're happy with JMeter and it meets your requirements, this works. And also, you might have billing already set up in Azure. It might be easy for you to adopt this service if you're already in the Azure world - it's not necessarily a different contract with another vendor; you could just add it onto your Azure spend, because you might be on pay as you go anyway, right? So it might be relatively easy for you to get sign-off, because there's no new agreement that needs to be signed off by the business.

Speaker A:

And actually, just thinking about it, you said about the management of it - this service is, in effect, auto-patched, et cetera, things like that. If you think about it, if you had a server, wherever it was, maybe it isn't patched all the time, but from a security perspective the business might require it to be up to date.

Speaker B:

You'd need to stand up and look after a server for it, wouldn't you? Or something similar.

Speaker A:

Yeah, exactly. And everything else.

Speaker B:

I mean, it's not $77 a month once you add all the different things up, yeah. And in some scenarios you might have a really bespoke test that you want to run, and this might not work for you, so you might have to go down that route anyway. But as a removal of the barrier to entry, it's on a plate for you: upload your JMeter script, make sure it works, bind it to some secrets, point it at what you want to test, and crack on.

Speaker A:

Well, that's another point, isn't it? Even building your load testing environment - with this, you spin it up, try it, shut it down if you need to. Or like you said, you don't have to build that whole infrastructure at all.

Speaker B:

You can have more than one load testing resource with different types of tests. You could have a smaller one that runs every day, and then a more intensive one that runs at the weekends, or once a week, or once every two weeks, or once a month. There are many different options - it's still got the flexibility there for you to achieve that.

Speaker A:

Cool. Okay, so we kind of talked about it earlier, and we brought up the DevOps integration as well. I guess we covered it a little bit, but why would we integrate it into that process?

Speaker B:

So, if your team is utilizing a DevOps CI/CD driven deployment methodology - and that's not the case for all teams, and I'm not going to advocate for or against that approach - but if you are utilizing it, then you can integrate this into GitHub and Azure DevOps Pipelines. And effectively what happens is, let's say Alan wants to commit some new code to the main branch - Alan wants to push some code to production. Alan would open a pull request, and maybe that pull request might be reviewed by me. Sorry, Alan, but I'm going to review your code. So I would go in and do a human review of Alan's code: I'd check it for syntax and best practice, and maybe give some input on how it could be structured or changed. But for me to then go off, run it up in my environment, and do a load test on it - I might just end up not doing that; I might skip that process. With Azure Load Testing, as soon as Alan says, hey Sam, here's a pull request, I want to submit some code to production, a load test can just spin up once the code is deployed - because you would deploy it to your staging environment, maybe, or your development environment. After we've deployed, it might run some checks to make sure it's deployed correctly, and then we might run a load test on it, because Alan wants to push to production. That process is completely automatic. And then inside GitHub and Azure DevOps Pipelines, we can see the status and the results that come out of the back of that test. So it can tell us - hey, sorry Alan, I'm just using you as an example because I'm the one talking - the response time on the code Alan's just written has increased response times by 50%. Maybe that's now a failing condition, so then maybe you've got to go back to the drawing board a little bit and try to identify what's going on there.
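On the Azure DevOps side, the equivalent gate is a pipeline task. A minimal sketch, assuming a service connection and resource names that are purely illustrative:

```yaml
# Hypothetical Azure DevOps pipeline step: run the load test after
# the staging deployment and fail the run if criteria are breached.
steps:
  - task: AzureLoadTest@1
    inputs:
      azureSubscription: 'my-service-connection'   # ARM service connection
      loadTestConfigFile: 'loadtest/checkout.yaml'
      loadTestResource: 'my-loadtest-resource'
      resourceGroup: 'my-rg'
```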

Speaker A:

Okay, great. Yeah, you can definitely see that. Again, it's adding the value of automation in there.

Speaker B:

It's not subjective either, is it? It's completely objective: the average response time before was 100 milliseconds, and now it's 500 milliseconds. So you might want to understand that. You keep all those results, and you can go back to them in the future as well, to work out where the regression actually happened. Maybe our gate was 1 second, 1,000 milliseconds. Maybe commit one was 100 milliseconds, the next commit was 500 milliseconds, and the next commit was 1,200 milliseconds - 1.2 seconds. We can show the iteration of that regression over time as well. So then you could go back and go: right, okay, we're going to do a hot fix to commit ABCDEF, which was committed by Sam, because Alan bumped it by 100 milliseconds but then Sam bumped it by 700 milliseconds. So we'll prioritize Sam's code fix and see if we can optimize that, effectively to bring our Azure spend down, right? Because if you have to vertically or horizontally scale, you're going to be spending more money. So it's also imperative that the solutions you build are well architected - not just well architected in Azure, I suppose, but well architected from a software perspective, a programming perspective. And this is validation of that.

Speaker A:

Yeah, definitely. And I guess it also gives you some context if you were doing that pull request, because it either declines it instantly, or at least gives some more context that there's a problem somewhere, for you to maybe hunt around more in the code to understand why you're getting this.

Speaker B:

Well, you can see in the pull request what changed. Something in the pull request or something in the infrastructure must have changed for that to happen.

Speaker A:

Yeah. Okay, so what's the typical process for identifying bottlenecks?

Speaker B:

Okay, so let's take a really simple example, right? Because there are many different technologies you can bring in to optimize performance, caching being one of them. Let's say we've got a web application that interfaces with a database. Once you log in, it goes to the database to retrieve your user profile - that's about the simplest process I can think of at the moment. So you log in and you retrieve your user profile. Now let's take that user profile call: you call the API, which then runs a query in the database to return the data. Really simple, no caching involved, just straight to the database and back - probably the simplest web application.

Let's say that user profile call starts to become slow. Maybe we've added more data to the user profile - now we're returning the user's profile picture, we've included all their hobbies and their interests and what they like to eat and all those different things - and maybe that's starting to slow the call down. So we might be trying to identify which part of the stack is causing the issue, because it could be network transfer to the user: say you're downloading the photo and the photo is 40 megabytes; that could be a slow transfer down to the front end, couldn't it? It could be the web server - it might have to do some processing on the fly on your profile, which slows it down when lots of people access it; maybe the web server is running out of memory. And then there's also the database layer, because we've got to communicate from the web server to the database and all the way back to the client - maybe that query is just too heavy now for the resources in the database. So we could run a load test: 1,000 users, adding a new user every 5 seconds, going to grab their user profiles, and we could watch it potentially degrade over time.

Now, let's say you're in a sticky situation. It's not Friday night, because we don't deploy on Friday nights; it's the Thursday night before Black Friday. And even before that, actually, let's talk about better practice. What you might want to ask is: if we do have a surge in traffic, how could we react at that time? Could we go from B1s to B4s, for instance, to cope with that traffic - maybe scale up ahead of time? If I went up a SKU in Azure, would I get twice the performance? Maybe you're trying to work out that performance characteristic. You could run a load test at B1, run a load test at B2, then B4, and see whether your application's behavior changes on your web server. And it might not - it might be exactly the same. So you might say, okay, for 1,000 users I don't need to scale my web servers. But you might use DTUs for your databases - I think we've done an episode on DTUs and databases; I'm sure we talked about DTUs at one point. Go back and find that episode, I can't remember which it is. Not you, Alan - whoever's listening. Let's say you're at 100 DTUs, just as an example. You might run a test to ask: if I upgraded to 200 DTUs, would I get twice as much performance? So you could have that performance characteristic as part of your documentation.
Maybe before Black Friday we go to 400 DTUs and take a little bit of an extra hit on cost, just to know we've got a huge amount of headroom, in case marketing and sales do a really good job and get more people than anticipated to our product pages. And, as I've spoken about before, we can compare all those load tests together, which really is powerful, because you can see those performance characteristics change. I've seen people blindly vertically scale, and even horizontally scale, trying to go: oh, how can I make this faster? Because sometimes those performance regressions happen at the worst times, and you want to be having that conversation weeks before your event that's coming up. The load testing service is going to help you accelerate that process internally.

Speaker A:

Wow, okay, cool. Okay, so we're probably coming to the end of our episode, because we've covered quite a lot in this one. Anything else you want to add?

Speaker B:

I don't think so. I think if you're part of a development team and you currently use your own sort of hacked-together load testing setup, or you don't use load testing at all at the moment, definitely check out Azure Load Testing if you've got a web application, because it could really help accelerate that part for you.

Speaker A:

Cool. And I don't think we've done an episode on Azure SQL - 46 episodes and we've not done one. We've talked about DTUs, though, I'm sure we have.

Speaker B:

Sure we have. Yeah, exactly.

Speaker A:

I've just quickly run through and looked at them.

Speaker B:

Thanks, Alan. Yeah, that's wrapped up for me. Alan, what's the next episode that we're gonna cover?

Speaker A:

So we're gonna talk about Windows 365. This kind of ties into Azure Virtual Desktop, one of the episodes we did - I think it was the end of last season, maybe, I can't remember, there have been so many.

Speaker B:

So, yeah, we're going to talk about...

Speaker A:

Windows 365. We're going to talk about what it is, the benefits of it and use cases, as well as the SKUs for it, because there are quite a few - sort of Business, Enterprise, and this new one that's come out recently, Frontline. So yeah, we'll talk about those and the use cases, as well as some comparison against Azure Virtual Desktop, I think - when to use either one.

Speaker B:

Yeah, I know that decision making for organizations can be, I'm going to say, challenging - working out the differences between the two. So yeah, it'd be very interesting to get your thoughts on that. So if you've enjoyed this episode, please do consider subscribing if you'd like to listen to more of this sort of content in the future. We have many more topics that we'd like to cover, and your listens and support are what will continue to fuel the podcast going forward.

Speaker A:

We also have the ability for you to give some feedback. Did you enjoy this episode, did you disagree with our thoughts, or did we miss something? Please use the form in the show notes on our website to send us a message, or you can give us a voice message via the widget in the bottom right corner of our site. We'd love to hear from you.

Speaker B:

Great. Thanks very much, Alan, for your time, and I'll catch you guys in the next one.

Speaker A:

Yeah, thanks everyone.

Speaker B:

Cheers. Bye.