Podcast audio transcript

DrupalEasy Podcast S17E2 - Janez Urevc - Gander

[0:00] Music.

[0:05] And welcome back to the DrupalEasy Podcast, and welcome to season 17, episode number two. In today's episode, I'll be talking with Janez from Tag1 Consulting. I will let him pronounce his own last name as we talk about Gander, the open source automated performance testing framework for Drupal. This episode is sponsored by Drupal Career Online, DrupalEasy's world-class, best-practice-focused, long-form Drupal training course for beginners. If you or someone you know is looking to professionalize their Drupal skills, definitely check it out at drupaleasy.com/dco. The next semester begins September 4th. Welcome, Janez, to the DrupalEasy Podcast. We were just joking a second ago that I am not even going to attempt to say your last name, so go ahead and introduce yourself real quick. Hi. Thank you for having me. I'm Janez Urevc, also known as slashrsm on drupal.org. Oh, thank you. I was gonna introduce you by your drupal.org username as well. So you've been knocking around the Drupal community for about as long as I have, about 15 years. Yeah, it is. You're currently the strategic growth and innovation manager at Tag1 Consulting. So, how did you get into Drupal? What have you been doing for the past 15 years?

[1:32] Yeah. I got into Drupal because, when I was still a student, I was talking to the head of our local student organization and he was asking me if I knew anyone that builds websites. I didn't, but I was studying computer science and I said, sure, yeah, I can do it for you. And then that didn't happen. But then a few months later another friend calls me and he's like, oh, Rock told me that you are building websites, and I'm now working for this tennis tournament and we need a website.

[2:01] And I was like, yeah, sure. And then when we signed the contract, I had to figure out how to do it. So, what technology did you ultimately use? Did you go right to Drupal or did you build that...? Um, I teamed up with another friend and we went with Drupal, and I've been doing this pretty much ever since. First, it was more like a hobby or a side thing, and then it grew into a full career. Then I worked for the biggest Slovenian daily newspaper, where we built a few of their websites on Drupal, Drupal 7 at the time. And after that, I joined Examiner.com, which used to be the largest Drupal website on the internet back then. They launched on Drupal 7 when Drupal 7 was still in alpha, I believe. I joined later, but they were really early on Drupal 7. So where along that journey does the kind of performance optimization stuff come in for you? I've always been interested in performance, and even at the Slovenian publishing company I did quite some performance optimizations, because they were quite sensitive about performance. And then especially after joining Examiner, with such a scale as Examiner had, performance was a first-class citizen on a daily basis.

[3:29] That's where it really started. We're gonna talk about Gander here in a minute or so, which is kind of automated performance testing. Is that a good way of...? Yeah, so we call it automated performance testing for Drupal. Open source automated performance testing for Drupal. Right, OK. So what did you, and other folks in your position, use to measure performance for Drupal sites before Gander?

[3:56] Like, what were the common tools 5, 10 years ago? It was pretty much, I think, done manually for the most part. Definitely not automated. Sometimes we would use a profiler, where you would get certain numbers out, or things like the Devel module would give you some stats, and maybe you could use the inspector or developer tools in your browser to get some metrics, but it was all pretty much manual. All right. So let's talk about Gander, or let's kind of get into it a little bit. Before we do, I do want to mention, for folks who have never heard of Gander or maybe just, you know, heard of it in passing: what makes Gander pretty cool, in my opinion, is that in order to write a performance test, it's not all that different from writing any other functional JavaScript PHPUnit class, correct? Or PHPUnit test. There are base classes specific to performance testing that are in Drupal core since, what, 10-dot...? 10.2. 10.2, thank you. Are they actually called performance testing, or...? Oh, performance data. There's a base class called...

[5:13] Yeah, the performance... now we're getting technical, but the PerformanceData class is where the data gets collected, and the base class for the test is called PerformanceTestBase. Thank you. Yeah, I wanted to mention that upfront because I kind of wanted to take the mystery out of, like, how do you do any of this stuff? And for folks who already know how to write, you know, PHPUnit tests, especially functional JavaScript tests, this is not that big of a leap from there. Exactly. And also infrastructure-wise, as we will discuss later, there are two complementary ways in which you can use these performance tests, but especially the first, simpler way of using it doesn't even require any other infrastructure. So if you already have your infrastructure to run functional JavaScript tests, whether it's GitHub Actions or GitLab CI or whatever it may be, it will already be able to run performance tests as well. Right. And I think that that's one of the really cool things about Gander, in my opinion, the fact that

[6:19] most people who are already writing functional JavaScript tests probably already know, you know, 90% of what they need to know in order to write a performance test. Is that fair, you think? Yeah, that's completely correct. All right. Now that we've kind of set the table for that: why Gander? What was the genesis for the idea? Were there any other options on the table as far as implementing, you know, an automated performance testing framework for Drupal? So kind of take us through the genesis. Yeah. So it all started about two years ago, when we were introduced to the Google Chrome team. One of their goals is to help make the internet faster, which is a very mundane goal, right?

[7:07] But they understood very quickly that the biggest bang for the buck would come from collaborating with the frameworks, the CMSs that run the internet. The most obvious one would be WordPress, which they also work with; they are also building their performance testing framework, and we are collaborating. And then they also identified Drupal as a major CMS, and they approached us and we started exploring how we could collaborate. In the first round, we added lazy loading of images and media embeds into Drupal core. That was added in 9.5, and there are some things in 10. So now you have lazy loading of images by default in core. This came directly out of this collaboration between Tag1 and the Google Chrome team. And then after achieving that, we put a few more ideas on the table and brainstormed what the next steps could be, and we identified automated performance testing as something that could have a really big impact long term. We understood that the investment would be larger than just, like, improving the image formatter, and that it would take longer to see the results. But it was also identified as something that is quite innovative and could have a really big impact long term.

[8:36] So that's how the idea about Gander started. And on the other hand, one of the performance engineers and also a core subsystem maintainer, Nathaniel Catchpole, known as catch, opened an issue in the Drupal core issue queue 15 years ago proposing the idea of adding automated performance testing to Drupal. And now we were finally able to revive this issue. And it was also catch who led the development and designed the architecture of Gander. So it all finally came together after 15 years. That's amazing. You know, I shared a rundown with you prior to the recording, and I was gonna ask you why automated performance testing, but I'm not even sure we need to answer that question. I think it's probably pretty obvious why

[9:29] automated performance testing is better than kind of one-offs here and there. You know, it's all about the historical data you can collect and watch over time. Are there any other reasons? I mean, non-obvious reasons, let's say. It also brings standardization to the table. Before Gander, performance testing was done just occasionally, and every tester was using some other way of testing: slightly different steps, a slightly different configuration, whatever it may be. So the results were never really reliable. And now we have standardized tests that always run in the same environment, and this brings reliable results to the table. Apples to apples, oranges to oranges. Exactly.

[10:20] And I like to say, we introduced automated testing to Drupal, I don't even remember how long ago, but it's probably way more than a decade now. Imagine developing Drupal without automated tests in 2024. It would be a different product. Yeah. And I think that we will come to a similar point with automated testing, especially because we already have a testing culture in the community. All right. So take a step back, because I'm just thinking about something here. If these base classes are already in Drupal core and they've been there for quite some time, what is Gander in that case? What actually, not physically, but, you know, file-wise or tool-wise, is it? When someone says we're using Gander, I mean, that's more than just the base classes, correct? Yes. Obviously, base classes are

[11:22] a very important part of Gander. But then, as we will discuss later with the two ways of using Gander, the more advanced way is to regularly run tests, collect data, and then send this data to a dashboard where you can create graphs and compare different branches and changes over time. Ideally, this would let you spot any anomalies. If you committed something that really slowed down a part that is being tested, you should see it on the graph. And to do that side of things, we use OpenTelemetry, and then you could wire any

[12:05] dashboard that speaks OpenTelemetry into this picture. We are using Grafana, and we at Tag1 are also hosting Grafana for Drupal core, which is at grafana.tag1.io. I'll have that link in the show notes for everybody as well. So part of Gander is also, like, default configuration for Grafana, default graph configurations. All that is open source, all that is on GitHub. So if you want to start running tests and doing this longer-term monitoring, you can use the configuration that is already provided as a starting point, or just use it as is. Then again, we have a DDEV add-on that lets you spin up the environment to do this with Grafana and everything locally. So it's not just the base class. The base class is the central part of it all, but there are other things around it that complement it, that make it possible to do what we're doing, and we try to make it as easy as possible for people to start.

[13:11] And at the end of the day, Gander sounds way better than saying open source automated performance testing framework for Drupal over and over again. I think that naming, and marketing if you will, is important, and having a precise, short name for something is beneficial. So this is one of the things that confused me and, you know, still does, because I haven't actually written a Gander test at this point. Without using the DDEV add-on or any of the Grafana stuff, you can still write tests, as we talked about, using those base classes. When you do that,

[13:52] obviously, you're not getting the data over time, you're just kind of getting a snapshot. But how do you actually get access to that data? How does that data surface itself? When you're writing the test, you decide which part of your testing procedure you want to collect performance metrics for, because there can be steps in your test procedure that are preparation steps, and you don't want to measure performance data on those steps. So, for example, you're setting up content types and, you know, all that stuff in your setup method. Exactly, or if you're trying to warm up caches, for example. So if you have a test that should run with warm caches, you do some steps beforehand to warm up the caches. And then, at the point when you're ready, you put the test steps that you want to measure performance data for into a closure.

[14:43] And then after those steps have run, you get back this performance data object that you mentioned earlier, and there you have all the performance data that was collected during those steps. And then you can start checking if the numbers that are coming back are the numbers that you're expecting. There are some metrics that are deterministic, which are great for assertions. One example would be the number of database queries. In a standardized environment it never changes: unless you change something in the code, the number of database queries shouldn't change, and you can check that. You can do your steps and say, OK, I expect this number of database queries to fire during these steps that I've just taken.
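[Editor's sketch] The closure-based measurement described above can be illustrated in a language-neutral way. The following is a minimal Python sketch, not Drupal's actual API: the names `PerformanceData`, `FakeDatabase`, and `collect_performance_data` are hypothetical stand-ins chosen for illustration. It shows preparation steps running outside the measured closure, and an exact assertion on a deterministic metric (the query count).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PerformanceData:
    """Hypothetical container for metrics gathered during the measured steps."""
    query_count: int = 0


class FakeDatabase:
    """Stand-in for a real database connection; it only counts queries."""

    def __init__(self) -> None:
        self.queries_run = 0

    def query(self, sql: str) -> None:
        self.queries_run += 1


def collect_performance_data(db: FakeDatabase, steps: Callable[[], None]) -> PerformanceData:
    """Run `steps` and report only the queries issued while it ran."""
    before = db.queries_run
    steps()
    return PerformanceData(query_count=db.queries_run - before)


db = FakeDatabase()

# Preparation (for example, warming caches) happens outside the closure,
# so it is not counted in the collected data.
db.query("SELECT * FROM cache_warmup")


def measured_steps() -> None:
    # Only the work inside this closure is measured.
    db.query("SELECT * FROM node WHERE nid = 1")
    db.query("SELECT * FROM users WHERE uid = 1")


data = collect_performance_data(db, measured_steps)

# Deterministic metric: an exact assertion is safe here.
assert data.query_count == 2
```

If a later code change adds a third query inside the measured steps, the final assertion fails, which is the signal the speakers describe.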

[15:29] And if that changes, your test can fail. This is what we call performance assertions. Another example of a deterministic metric would be the number and type of cache requests. If you have a page with hot caches and you are not changing anything, you wouldn't expect any cache invalidations, you wouldn't expect any cache sets to happen; you would only expect cache gets to happen on that page. So this is something that you could test as well. And then we also have some, I like to call them nondeterministic metrics, that will always vary a little bit. We have a few Core Web Vitals in there. We didn't speak about Core Web Vitals, but these are the main metrics that the Google Chrome team uses, so that's why it was important for us and them to have them in there. But it's also great because it gives you an accurate picture of back-end and front-end performance. So we are doing those. We also have time to first byte, for example. And this kind of metric,

[16:34] no matter how standardized your environment is, will always vary a little bit. So it's hard to assert on those. If one time you have a time to first byte of 350 milliseconds, it could be 320 the next time, so it's hard to assert on a specific number. We generally recommend not using those for assertions, and using the ones that are more deterministic instead. But if you really wanted to, you could probably assert on a range. It could still produce false negatives or false positives, but if you tweak the range depending on the environment where you are running this, then you might be OK doing that. But just be aware. I'm so glad you answered the question that way, because I was 80, 85% sure that that was the answer based on the blog posts and stuff that I read. So I really appreciate you answering the question that way. And just to summarize: the deterministic values, like the number of queries and cache sets and cache gets, are good for assertions.
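[Editor's sketch] The distinction between exact assertions and range assertions can be shown with a small Python example. This is only an illustration under stated assumptions: the helper `assert_within_range`, the cache operation counts, and the 250 to 450 millisecond bounds are all made up for the sketch, and the two time-to-first-byte values are the ones mentioned in the conversation.

```python
def assert_within_range(name: str, value: float, low: float, high: float) -> None:
    """Fail if a nondeterministic metric falls outside a tolerated range."""
    if not (low <= value <= high):
        raise AssertionError(f"{name} = {value} is outside [{low}, {high}]")


# Deterministic metrics: cache operations on a page with hot caches
# (hypothetical numbers). Exact assertions are appropriate here.
cache_operations = {"get": 4, "set": 0, "invalidate": 0}
assert cache_operations["set"] == 0
assert cache_operations["invalidate"] == 0
assert cache_operations["get"] > 0

# Nondeterministic metric: time to first byte varies between runs,
# so only a range can be asserted. The bounds are an assumption and
# would need tuning for each environment.
for ttfb_ms in (350.0, 320.0):
    assert_within_range("time to first byte (ms)", ttfb_ms, 250.0, 450.0)
```

As noted in the conversation, a range that is too tight produces spurious failures and one that is too loose hides regressions, which is why such metrics are usually better suited to long-term graphs than to assertions.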

[17:41] The nondeterministic values are good for kind of just the long-term monitoring, and that's where Grafana comes in. So, fantastic. All right. So going back to the deterministic values, the assertions: there are some great code samples, and we'll have some links in the show notes as far as getting started with Gander, just some really, you know, 10-to-12-line code samples that are really easy to understand, super clear, with assertions for a query count or cache set or cache get counts. So my question is, when writing the test, how do you know what the right answer should be? When you're writing that assertion, saying, you know, the query count should be 12, where do I go to determine that? Oh, it should be 12, 12 is the right answer. So I think for me that's the missing piece of the process when writing these tests. It really depends. Oh, that's the worst way to start off an answer.

[18:51] Yeah. OK. I think that, like, if you are writing a test

[18:58] to cover some functionality, and it's not about covering a regression that you just fixed, you would generally, I think, write a test, see what the result is in the test, and then just put that number in. And then, as you are developing your project, your module, your custom site, whatever it is, your test will fail if this number changes, obviously. Then you go in and check what's changed, you decide whether this change was desirable or not, and you change the expected numbers based on that. And this is exactly what we've seen happening in the Drupal core issue queue. When people are writing tests, they write them to pass with the current numbers. And then there was another issue that touched some queries and increased the number of queries, and the test failed on that issue. The person that was working on that other issue was not even aware of Gander, and that's how they learned about it. And then it's up to you to decide whether the change was something that you wanted or something that was a potential regression. So yeah, I think that's the answer. You just write a test, see what numbers come out, and then continue from there.

[20:23] So it sounds like using Gander for performance testing is really all about changes, regardless of whether you're using something like Grafana to look at changes over long periods of time. Even if you're not collecting all that data, just by nature of the way the assertions are written, right? Like, you know, this bit of code should have X number of queries. If that test fails, that means the number of queries has changed from, you know, the previous version of the code. So it's kind of like over time with two data points, you know, a before-and-after type of situation. So I think it's kind of important to note that this isn't a tool to figure out why something bad is happening performance-wise. It's more of a tool to recognize, hey, something has changed and it's probably not good. Is that a good way of looking at it? Yeah, I would agree with that. It definitely won't tell you what is wrong, right? It's not gonna tell you that you've got some block that has a max-age of zero or something. That's, you know... Yeah. Yeah. Like,

[21:42] of course, no, it won't point you to that specific block. But if you are testing the page where this block appears, then having this block at max-age zero would probably change something in the queries that are being run or in the cache requests that are being issued. So then you would see that something else is going on. And to your point, we at Tag1 also do a lot of projects called performance audits, where clients that have performance issues come to us and ask us to look into why the performance problems are happening and to provide recommendations. And we see the same type of problems on almost every audit.

[22:40] And the two biggest ones are cache misuse and Views misuse. I've been banging my head against the wall for the last two-plus years trying to come up with a solution to fix that, because if something is appearing over and over again, then it's probably a systemic problem. Nothing has been done on that front yet, but I was thinking, like, maybe we could have default or helper tests for Gander that would just make sure that, on a page with hot caches, you are not doing anything like you mentioned, having a max-age of zero or, you know, doing invalidation on every page load or something like that. But then again, people would need to know that this is a problem, and they would need to use this test that we could provide. And it could be very simple; it could be a test where you just

[23:47] provide a list of pages, and then the test figures out how to make the caches hot and checks for these things. I think it would be possible to do it generically, without a lot of setup for the end user. But you would still need to know that it exists and be actively using it to benefit from it. I was even thinking, like, maybe what's needed is a better understanding of the cache system, and I was thinking about doing, like, uh,

[24:15] like, a course about it or a series of videos trying to teach people how caching works in Drupal 8, 9, 10. But it's not that these resources are not out there. There are plenty of blog posts, plenty of recorded sessions from DrupalCons where this is all explained. So I'm not really sure how to approach this to really make an impact. I almost appreciate the segue, because as part of our long-form module development course, I spend 2 to 3 weeks on caching towards the end of the course.

[24:50] And, I mean, look, it turned out to be not only difficult to learn but also difficult to teach. And from what I see, what happens more often than not is you learn a little bit about caching, enough to be dangerous, so to speak.

[25:05] And then you're working on a project and there is a performance issue that's related to caching. And again, in my experience, most of the time the solution is not as nuanced as it should be. The solution is: that thing is the problem, we gotta cache the heck out of that thing. And then, you know, either it's over-cached or caching is completely turned off, right? It's almost binary at that point, rather than, you know, a more nuanced approach of setting that caching somewhere with cache tags, somewhere intelligently, where the data stays fresh but it still has some level of caching. I mean, it is not an easy thing to teach or learn. I'm kind of at the point where I catch myself saying a lot of this comes down to experience, right? I can teach you the tools, the concepts, how to turn the levers and dials. But you have to have some experience of turning those levers and dials to really get a feel for it. And also a thirst for knowledge. I think that in our industry it's really important to be constantly learning, and this is one of those things where this becomes really important. So if you are working on your project yourself, then this is very important. If you are looking for service providers that will build something for you,

[26:31] it's really important to find people that really understand what they're doing, because a lot of these problems can stay below the radar for a very long time.

[26:42] And if performance is neglected long enough, it becomes so intertwined with the project that it becomes really hard to fix. Or it could just be written off as "our server is slow," right? Yeah. And if the server is slow, then you can solve this by throwing more muscle at it. Well, I'm not saying that's always accurate though, you know,

[27:11] because a lot of these performance problems that are really impactful are not linear types of problems; they are exponential. And with this kind of problem, you won't solve it by throwing more hardware at it, right? All right. I want to take a step sideways. Maybe I want to go back to the long-term kind of data collection, seeing the trends over time, Grafana and the other tools. So for someone to take advantage of that, that's gonna require a server with the right software that the performance tests are actually sending data to, and where that data is stored. So that's where an additional cost is gonna come in. Or you could self-host it somewhere, or, as you mentioned, there's a DDEV add-on, so you can kind of, you know, see that on your local. On DDEV it works; it's useful, but not that useful. So, what's the process? I mean, to me that seems like that's a sysadmin

[28:10] type of situation, where we need someone who knows how to set up servers and get software installed, and then what, we just point the test to that server? Or how does that work? Yeah, exactly. That's completely accurate.

[28:23] Wow, that's amazing. I got that right. Any performance test that you use for assertions, you can also pick as a test for which you will be collecting data and sending it to the dashboard. Is that like a flag in the test or something? It's really not. When you run the tests, you have an environment variable with the endpoint where the data should be sent, and this is the only difference. If you run PHPUnit without that environment variable, it will only run the tests. If you add this environment variable, it will send the data. And then usually you could either just run a single test, or you have a special group of tests and you tell PHPUnit to just run that group, and this is the group that is collecting data and sending it to a dashboard. From there on, you need an OpenTelemetry endpoint that receives the data, and then you can send it anywhere from there.
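[Editor's sketch] The environment-variable switch described above can be sketched as follows. This is a hedged Python illustration, not the real implementation: the variable name `PERF_COLLECTOR_ENDPOINT` and the function names are hypothetical, and the code only reports what it would do instead of exporting anything to an OpenTelemetry endpoint.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name, used only for this illustration.
ENDPOINT_VARIABLE = "PERF_COLLECTOR_ENDPOINT"


def collector_endpoint(environ: Mapping[str, str]) -> Optional[str]:
    """Return the configured endpoint, or None when the variable is unset or empty."""
    value = environ.get(ENDPOINT_VARIABLE, "").strip()
    return value or None


def finish_test_run(metrics: dict, environ: Mapping[str, str] = os.environ) -> str:
    """Decide what happens with collected metrics after the assertions have run."""
    endpoint = collector_endpoint(environ)
    if endpoint is None:
        # Without the variable, the test run only performs its assertions.
        return "assertions only"
    # A real implementation would export `metrics` to the endpoint here.
    return f"would send {len(metrics)} metrics to {endpoint}"


# Same test, two modes, selected purely by the environment.
print(finish_test_run({"query_count": 12}, environ={}))
print(finish_test_run({"query_count": 12}, environ={ENDPOINT_VARIABLE: "http://localhost:4318"}))
```

The design point is that the test code itself does not change between the two modes; only the environment it runs in does.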

[29:27] But in the, you know, standard way, you have Tempo for traces and Prometheus for metrics, which is then all displayed in Grafana. And you have to set up, as you said, it's more like a sysadmin kind of thing, you have to set up all these services to do what they're doing so that your graphs appear wherever you're hosting this. You can self-host it, because these are all open source projects,

[29:54] which is, yeah, there is some cost, there is some work involved. Grafana also offers Grafana Cloud, which has a free tier. To be honest, we never really tried to set it up, but I spent some time researching how it works, and I think that

[30:14] the limitations that they have in terms of amount of data should be fine for most of our use cases, because Grafana is primarily meant to do instrumentation of your live servers, and there you usually collect quite a lot of data. Compared to that use case, we are collecting a very small amount of data. So I don't think that any Drupal site or Drupal module would exceed the limits that they have. I think that the main downside is that the data retention that they have on the free tier is just 15 days, and that, as far as I understand, there is no anonymous access. So for the Grafana dashboard that we are hosting for Drupal core,

[31:03] you're perfectly able to access it anonymously. If you were using the Grafana Cloud free tier, I think that only the registered users that have access would be able to see it. But that is, I guess, definitely OK for a custom website, and might be perfectly OK for a contributed module as well. And then another thing that is a little bit special is that, for the metrics to be reliable long term, it's probably a really good idea to have a dedicated test runner, so that you are running on the same hardware all the time. Because if you are running your tests on different machines on different runs, then the metrics will vary just, you know, based on how powerful the machine is that this specific run is being run on. Well, the deterministic values should be the same. Yeah, but the nondeterministic ones will definitely change, right? And in the dashboard, it's the nondeterministic ones that are the most valuable. Yeah, that's a lot of the Core Web Vitals stuff.

[32:16] Correct? Yeah. Yeah. And time to first byte, then, right? Those things. Fantastic. Well, this was great, Janez. I really appreciate your time. I encourage folks, if you have any interest in this stuff at all, go to, I have the URL right here, tag1consulting.com/gander. All kinds of documentation and links. There's a great video from...

[32:37] Was it a DrupalCon where this was introduced? I forget. There's kind of a featured video. Yeah, we had a talk at DrupalCon Portland this year, and after DrupalCon we also had a webinar with the Drupal Association, which is also recorded and available on YouTube. And definitely, I mean, even if you're on the fence, check out the, I believe it's on the drupal.org docs page, which again is linked from that same tag1consulting.com/gander page, and just check out the code snippets, because if you have any experience writing PHPUnit tests, I guarantee you that

[33:13] any kind of fear, or, you know, fear is not the right word, but once you see the sample code, you'll be like, oh, I know how to do that. That's pretty easy. So, yeah. Yeah, I agree. And as I said, we're trying to make it as easy as possible. It really is. There's a lot around it, you know, we've talked for over a half hour here, but really the core of this is just writing those tests. And, you know, when I saw those snippets, I'm like, oh, this is not that hard at all. So, yeah. If you're used to writing tests, it's a non-issue. Exactly. It should just come naturally. All right, Janez, thank you very much for your time. And yeah, let us know if there are any major updates to Gander in the future, and, you know, we'll keep an eye on things. Will do. Thank you for having me, and yeah, looking forward to talking to you in the future, hopefully.

[34:11] Thank you for listening to the DrupalEasy Podcast. Don't forget to check out all of our long-form Drupal training courses at drupaleasy.com and stay tuned for the next episode of the DrupalEasy Podcast. See.

[34:23] Music.

August 21, 2024