How It's Tested | Ep. #7, Game Testing with Michael Le Bail of Homa Games
Eden Full-Goh: Hi, Michael. Thanks so much for joining me on the How It's Tested podcast. It's really great to have you here.
Michael Le Bail: Hi. Thanks for having me. Really happy to be here.
Eden: Yeah. I was particularly intrigued by your background because you work at Homa Games, where you're now the CTO, and what's particularly notable about Homa is that you guys have the most downloaded game in the world and, of course, an impressive portfolio of 80+ games. I'd love to hear an overview of the company, your team and your role, just so our listeners can get familiar, and then dive in and learn more about everything.
Michael: Yeah. For sure. Homa has been quite a long adventure for me. I joined back in 2018, starting out as a Lead Engineer under the previous CTO. Since then I've been fortunate enough to see it grow from this very small company of maybe 10 people, hiring only interns, to what it is today with 230 people. So it's been quite a journey.
So just to summarize what Homa does: indeed, as you said, we've had a lot of success this year with the most downloaded game in the first quarter, which is an unbelievable achievement for the team. We started out essentially as a game publisher, and this is still what we are today, and we're also a game studio. Essentially, at Homa we build technologies.
This is what we're known for, and these technologies allow game developers from around the world, internal and external, to promote their games and make their games reach the top of the charts. Essentially we have technologies that allow developers to analyze the market, see what the competition is doing, analyze trends, things like this. Then afterwards, test their games.
We actually take a lot of risk on our side, and we spend a lot of money making sure that every developer is able to test their games. Usually that costs a bit of money, like $200 per test, and this gives us marketability insight into how marketable the game is. So the better the stats, the more potential the game has, and the more time we want to invest in it.
And so, let's say every month we test about 200 games, something like this, a bit more, and we really want to work with the game developers and games which have the most potential. Here, essentially, we go into another phase which we call Iteration, which is really about making the game slightly better in different aspects: in terms of advertising, in terms of monetization, and also the main game itself and its content.
That allows us to push the game to its maximum potential, and this is where we see which ones will make it to the top charts, and which ones have limited potential and essentially hit a ceiling. So here what we'll do is scale little by little. In those first tests, the hundreds of tests that we do where we spend a few hundred dollars, we test with about 100, 200, 400 users.
Something like this. This allows us to get enough information to figure out the ones which are interesting, but then we scale up little by little, week by week, and there we reach budgets of thousands of dollars, if not eventually millions. This allows us to see, hey, this game does very well in the US, it can scale in other countries like India or Brazil and things like this.
Eventually, we see how far up the charts it can go and hopefully, of course, it can be a top hit. This is really done with all the tech I referred to: advertising tech, monetization tech, and our A/B testing tool as well, which we can talk about a bit later. So far, at Homa we've built different kinds of games. We're quite known for hyper-casual, but we're building a lot of different puzzle games today, even tactical RPGs.
It's quite a diverse portfolio, and I'm really looking forward to the successes we're going to have in the future. That's the story of Homa; we're just at the beginning though.
Eden: That's awesome. It's really interesting to hear about your process for experimentation and just casting a wide net to then figure out which games are the highest converting, the highest potential. How is your engineering team or product team structured? Is it structured game by game or are teams in pods to work on games? How long is a game in development before it's ready for that initial phase of experimentation or testing with the... You mentioned a core few hundred users initially.
Michael: Yeah. I'll answer the last one first. It can go very quickly. Essentially our fastest launches have gone from the first line of code to, let's say, the top 10 in the charts in four weeks, in our best cases. Very, very quick. Sometimes we build higher-quality games where we want to invest a bit more in the content before we actually promote it at scale. But it really depends; our goal is to go as quickly as possible.
There is always a risk, especially in the early days when we weren't as big, to be copied and to have the idea stolen from us. So we really had to be obviously very data driven, but also very fast in execution. That was always a core value that we still have today. In terms of actual structure, it depends on what you're talking about. Basically we have two different structures.
There's one for the tech tools and tech products that we can talk about, and there we're organized in different squads. What we call squads, what you call pods; I think they're called guilds in some other companies as well. These are teams of, let's say, three to five developers working alongside a tech lead, a product designer and a product manager to build the best tools and products out there, which can be used by our internal teams, maybe our UA (user acquisition) team, who need very specific internal tools to get a game to the top of the charts.
Or it's going to be game developers who need tools to, again, analyze the market or do some A/B testing, for example. These are the product-tech tools, and there we work in a typical Sprint fashion with two-week sprints. Then on the other side you have the game structure, which is more on the operational side of the company; as for me, I'm more in tech and products rather than operations. There it's really game by game.
These can vary; in most cases we work with external studios, and these studios can vary widely from two developers to teams of 25 people. It's really case by case, depending on how big the game is and how many resources are assigned to it. These are the two structures, and then on our side we usually have a publishing manager.
We're going to have a data analyst who's going to work on the game, and if the game goes into the post-publishing phase we also assign a live ops manager who is going to manage the game's life cycle. Let's say there's a specific campaign to launch for Christmas, for example; then we'll launch some very specific features for that, and that will be taken over by the live ops team.
Then throughout the lifetime of the game, we also have at least one user acquisition manager who is going to manage the advertising of the game, and also a monetization manager who's going to figure out the best strategies to monetize a game, whether it's via ads or in-app purchases, for example. Then we also have a great creative team who are creating ads constantly, just trying to find new ideas on how to make the game appealing.
Of course, when a new feature or new content appears in the game, it gives the creative team more room to test new ideas. It's a very interesting process: let's say most of the ads they create are actually going to fail and not be very successful. But every now and then you find this win which just changes the game, and we've had a lot of successes here based sometimes on one specific ad which just changes everything.
Somehow it makes the game incredibly marketable, so that's the structure. It's really very different from tech products which are more long term builds versus these game products which can be either very quick or more long term. Just to give you an idea of timelines, I said the shorter ones were about four weeks from development to launch.
And I mentioned a tactical RPG, for example; this is where we're investing heavily for the future of Homa, and there the timelines are more like two years. So it's really different kinds of projects that are handled by the team.
Eden: Got it. Yeah, that's really interesting to know, that different teams at the company are operating on different engineering timelines, different paces of development. I'm curious, for both the tech platform that you built and individual games: you touched on A/B testing earlier, which I'd love to spend some time talking about.
But switching for a second to general engineering best practices and regression testing, functional testing, integration testing: what are some of the best practices or philosophies that you want your team and the culture to have? Is that different on the tech platform team versus individual squads? Curious about your perspective there.
Michael: Yeah. So on that front, and I can talk about the tech products that we have, we follow pretty much the typical testing pyramid, if you will, that's found these days, I think, in most companies.
We really try to make developers build as many unit tests as possible. We have a few tools which also monitor our coverage, just to make sure that new code is always tested as much as possible. I would say that's the base, and it's probably the most important.
Super useful, by the way, and now there are some new tools like GitHub Copilot that are allowing us to do this more easily. Then of course we try to build as many integration tests as possible on our servers, just to make sure that our endpoints are returning the right responses and that everything keeps functioning long term with no regressions, and of course a few end-to-end tests and component tests as well, which cover a lot of things that have to do with the frontend.
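For a flavor of that base layer, here is a minimal unit test in the spirit he describes, assuming a Jest-style runner; the function, names and thresholds are invented for illustration, not Homa's code. A coverage gate such as Jest's coverageThreshold option is one common way to enforce the "new code is always tested" rule he mentions.

```typescript
// Hypothetical game-logic helper plus two unit tests (Jest-style globals).
export function interstitialsPerSession(level: number): number {
  if (level < 3) return 0; // no ads during onboarding
  return Math.min(1 + Math.floor(level / 10), 4); // capped per session
}

test('onboarding levels show no interstitials', () => {
  expect(interstitialsPerSession(1)).toBe(0);
});

test('ad frequency is capped for late-game players', () => {
  expect(interstitialsPerSession(120)).toBe(4);
});
```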
So we have a main web platform called Homa Lab, and this is the main platform that's used internally. It has a lot of modules, and some people have access to some and not others. It's also used externally by hundreds of game developers that we collaborate with already. And there, until pretty much the beginning of 2021, we were unfortunately testing a lot of things manually, with a lot of unit tests, yes, depending on the project. But really we were doing a lot of regression tests manually, with a long list of things to test on every single product release, so this of course was very restrictive, and the tester, whether it was myself or someone else, was always the bottleneck to deploying faster. We changed our philosophy drastically when we started hiring QA engineers, and Devinder, who joined as a QA engineer in January 2021, basically changed the game for us. It allowed us to really ramp up our strategy, first of all, and reduce manual actions.
We're very heavily into automating as much as possible so that we can have the deployment cycles that we have today, which is deploying to production, basically every one to two days and that's really what our goal is. Would you like to know a bit more about the details of maybe the process?
Eden: Certainly. Yeah, that sounds great.
Michael: Yeah. So basically we have a custom Sprint process at the end of the day, and once the code from the developers is done, ready and validated as a PR, we use a tool called Okteto, from a Spanish startup, which allows us to create virtual environments, which is exactly what you need. Maybe it's just a very simple frontend task, adding a button or I don't know what else, but the moment the PR is created, an environment is generated automatically which allows anyone to test. Here it would first be testable manually; a product person or a QA person can just come in and test the feature if it's complex. But more importantly, on this environment we also run some end-to-end tests and some integration tests directly, so that before even merging any code into any stage or production environment, we can already test on that specific branch and PR.
That's been incredibly useful, because we basically catch errors very, very early on. Of course, afterwards we enter a cycle of feedback. Once these things are validated, they go to stage, and in stage we still have some issues here and there, of course, as most companies do. But it turns out that most of the code that's deployed is basically stable and already passes all our tests before we even have to launch anything. In most cases we don't even talk to QA; we'll go from stage to production. Essentially it's quite a smooth process today. I think the challenge here is more about adding the right end-to-end tests so that we can avoid any bad regressions in the future. The QA team is taking over with the tech leads to make sure that all the essential features in any one of the platform's modules are covered. Those are really the details.
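To make the per-PR environment step described above concrete: one way to aim an existing Cypress suite at whatever URL CI just provisioned for the branch is an environment-driven base URL, as in this sketch (the PREVIEW_URL variable and the fallback address are assumptions for illustration, not Homa's setup):

```typescript
// cypress.config.ts: point the same suite at the ephemeral PR environment
// in CI (e.g. an Okteto preview URL) and at staging everywhere else.
import { defineConfig } from 'cypress';

export default defineConfig({
  e2e: {
    baseUrl: process.env.PREVIEW_URL ?? 'https://staging.example.com',
  },
});
```

The same tests can then run unchanged against the branch, stage and production.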
The QA team is always pushing for things to be tested as much as possible at the unit-testing level, or maybe the integration level as well. They're really promoting the right way to do things. Then of course we have some crucial end-to-end tests which allow us to be very confident when we deploy to production every few days. We have a few other types of testing as well, like load testing, which allows us to really test our servers and make sure they can handle the millions of requests that come to us every day.
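As a rough illustration of that load-testing idea, here is a crude sketch in plain TypeScript (Node 18+ for the global fetch); the endpoint and payload are invented, and a real setup would typically use a dedicated tool such as k6 or Gatling:

```typescript
// Fire increasingly large batches of analytics-style events at a test
// endpoint and report how many succeed and how long the batch took.
const TARGET = process.env.TARGET_URL ?? 'https://staging.example.com/event';

async function fireBatch(size: number): Promise<void> {
  const started = Date.now();
  const results = await Promise.allSettled(
    Array.from({ length: size }, () =>
      fetch(TARGET, {
        method: 'POST',
        headers: { 'content-type': 'application/json' },
        body: JSON.stringify({ event: 'level_complete', ts: Date.now() }),
      }).then((res) => {
        if (!res.ok) throw new Error(`HTTP ${res.status}`);
      })
    )
  );
  const ok = results.filter((r) => r.status === 'fulfilled').length;
  console.log(`${ok}/${size} requests succeeded in ${Date.now() - started}ms`);
}

// Ramp up step by step, much as the marketability tests scale budgets.
(async () => {
  for (const size of [100, 500, 1000]) await fireBatch(size);
})();
```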
Since we have our own analytics system, you can imagine that with so many millions of players around the world using our games, every single action they take in a game sends an event to our servers, so of course the servers need to be very robust. Yeah, load testing is essential. Just to go back to the web platform and give a few names for the tools: we use mainly Cypress and Cucumber for end-to-end tests and integration tests.
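For a flavor of what such an end-to-end test looks like, here is a small Cypress sketch; the routes, selectors and module names are invented, and the visual-snapshot line assumes the @percy/cypress plugin rather than anything confirmed about Homa's suite:

```typescript
// cypress/e2e/homa-lab-smoke.cy.ts: log in and reach one module.
describe('Homa Lab smoke test', () => {
  it('logs in and shows the A/B testing module', () => {
    cy.visit('/login');
    cy.get('[data-test=email]').type('qa@example.com');
    cy.get('[data-test=password]').type(Cypress.env('QA_PASSWORD'), { log: false });
    cy.get('[data-test=submit]').click();
    cy.contains('A/B Testing').should('be.visible');
    // With @percy/cypress installed, a visual check is one extra line:
    // cy.percySnapshot('ab-testing-module');
  });
});
```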
Then we're starting to use, as I said, GitHub Copilot for writing some unit tests as well. But that's more or less it. We're experimenting with Percy these days to compare different UIs and see if this could work for us. It's at an early stage at this point, but it's already working well. To go quickly over some other topics, we also have some automated mobile tests which run on the different tools that we build on the mobile side.
Essentially there we have a physical machine in the office, an Android device, which allows us to deploy. We automatically deploy all kinds of new versions of the SDKs that can be installed in our games in the future, and we first run them through our testing pipeline, our automated pipeline. There, everything is fully automated: we fetch the new SDK versions, let's say for the Facebook SDK we want to take the new version, we fetch it automatically from the stores and put it in the game. Then we go through what we call our nightly tests. We have tests which run every night just to make sure that the game doesn't crash, that ads are displayed, for example, that the obvious things are tested, and that all the SDKs are initialized, before the game is even sent to the stores, basically. Those are the two main areas of testing that we have in terms of product tech tools.
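A rough sketch of what one of those nightly "boots and doesn't crash" checks could look like, assuming an Appium server driving the office device via WebdriverIO; the build path and package id are hypothetical, and this illustrates the idea rather than Homa's actual pipeline:

```typescript
// Launch the freshly built APK on the attached Android device and verify
// the game is still in the foreground after its SDKs have had time to init.
import { remote } from 'webdriverio';

async function nightlySmokeCheck(): Promise<void> {
  const driver = await remote({
    hostname: 'localhost',
    port: 4723, // default Appium server port
    capabilities: {
      platformName: 'Android',
      'appium:automationName': 'UiAutomator2',
      'appium:app': '/builds/nightly/game-with-updated-sdks.apk', // hypothetical path
    },
  });
  try {
    await driver.pause(15_000); // give the game and its SDKs time to initialize
    const pkg = await driver.getCurrentPackage(); // Android-only Appium command
    if (pkg !== 'com.example.game') {
      throw new Error(`game left the foreground, now showing ${pkg}`);
    }
  } finally {
    await driver.deleteSession();
  }
}

nightlySmokeCheck().catch((err) => {
  console.error('nightly smoke check failed:', err);
  process.exit(1);
});
```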
Eden: That's really impressive. I think, yeah, it's really cool to see you have a lot of best practices that have been really solidified for the infrastructure and everything you have set up for the Homa Lab platform. It sounds like you try to apply as many of these best practices as possible to the mobile side of things as well. One question I have as a follow-up there: you mentioned you use Percy and Cypress on the web side.
What's your take on the mobile side of testing frameworks and solutions? Do you feel like there's as much support on the mobile side, and how do you balance making sure that you have coverage on some of the mobile things as well? Because at least in my experience, I've seen that sometimes the mobile-side tools are just less established. What do you think about that?
Michael: Definitely. I haven't talked about game testing yet because that's really an area where we're struggling to automate things; I think the tools are pretty new, and it's nothing compared to what we have on the web with tools like Cypress, for sure. We were able to automate a few things, first of all with automation on the CI side, and then the tests that we talked about: checking that ads are working, are displayed well, and things like this.
Definitely there's no proper tool that we see; we're constantly experimenting a bit. All our games are in Unity, so there are some tools here and there that we're currently experimenting with, but at the end of the day, when we need to test so many different games, and as you mentioned we have 80 games in our portfolio, most of the testing today is still going to be manual.
We're kind of waiting and constantly doing a bit of R&D, testing new tools, just to see what could be used. But for sure, we're definitely open to suggestions today if you have any. We've struggled on that front with automation.
Eden: I guess theoretically, even if there was a tool that was able to do automation, how would you go through that decision-making process? Okay, you have 80 games; how would you prioritize which games, the longest-standing ones, the ones with the most users, to figure out... Because you mentioned you have tools that measure code coverage. I don't know if you have other internal metrics that you guys are tracking. Are certain games a higher priority to keep in good shape than other games? How do you think about prioritizing those 80 games?
Michael: Yeah, for sure. The priority is relatively simple: it's revenue at the end of the day. Any game that makes more revenue than another is going to be a priority. Usually the games that we launched recently are the ones making the most and are therefore the highest priority. There are always going to be some cases as well where some games are maybe not making the most revenue but have a high number of crashes, so we use tools like Crashlytics to identify the issue and why the games crash.
That's quite interesting to investigate all the time, and we try to stay, I think, under a 5% crash rate, for example. It's always quite a good tool because it really allows us to dig deep into the actual error and get a sense of where to start the investigation. The priority is relatively simple, and that's why we've been able to manage with a QA team which is still relatively small but growing. We're still doing a lot of manual tests.
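The triage rule he describes is simple enough to sketch; the fields and thresholds below are illustrative, with the 5% figure taken from the conversation rather than from any real configuration:

```typescript
// Order games for QA attention: crash-heavy games jump the queue,
// everything else is ranked by revenue.
interface GameHealth {
  name: string;
  dailyRevenue: number; // USD
  crashRate: number;    // crashed sessions / sessions, e.g. from Crashlytics
}

function qaPriority(games: GameHealth[]): GameHealth[] {
  const CRASH_CEILING = 0.05; // the "stay under 5%" target mentioned above
  return [...games].sort((a, b) => {
    const aUrgent = a.crashRate > CRASH_CEILING ? 1 : 0;
    const bUrgent = b.crashRate > CRASH_CEILING ? 1 : 0;
    if (aUrgent !== bUrgent) return bUrgent - aUrgent; // crashy games first
    return b.dailyRevenue - a.dailyRevenue;            // then by revenue
  });
}
```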
Eden: I see. Yeah, I guess that's really interesting to know that you have this QA team, a fairly small team, and they're able to support 80 games. I'm assuming it means you have a schedule or a strategy. How often is the QA team touching each game? Is it a couple times a month, or how does that work?
Michael: When you're in the process of iterating a lot on a game, it's basically daily. The games which are not yet published on the store, or not at scale let's say, are basically tested every day, because there's a new version, new content, new advertising networks integrated into the games, so we need to make sure there are no regressions.
So for those new games, it's constant. Then there are the older games which have been in our portfolio for a while; among those 80 games, there are a lot of games that we don't touch, they're just there, and maybe in the future we can do some automation on some of these games which don't bring in much revenue. But essentially they're not really touched. The focus is really going to be on about 10 games, maybe, where updates are coming every week, and that's really the priority.
Eden: I see. One of the interesting things that I've thought through about testing games, or even trying to build automation test cases, is that I know with games there can be so many different paths that you as a user would go through. Do you feel like, with most of the games that you would potentially consider automation for, there is a deterministic flow that you could build some basic end-to-end tests for? Either, yeah, in-app purchases or onboarding? Even if you had automation, would you be able to automate in a deterministic way? How does the role of manual versus automation testing evolve over time, in your opinion?
Michael: I think there are two avenues to explore here. One has to do with common components, which are shared across the games. For example, we're releasing our Homa Login and player profile very soon, and it's going to be available in all our games, and it's going to look the same in all our games as well. That's a huge advantage, because we'll be able to automate something which is the same across different games and devices.
That will be a huge plus, and this will hopefully allow us to start this common automated testing with the QA team, even on the game side. We're doing more and more of these common things; we're going to add leaderboards to all our games in the near future as well, and these features will be testable, I think, with automation.
The second avenue is really if one of our games becomes, let's say, the number one priority and is really bringing in the majority of the revenue one day, because we're really building the company towards these games which are absolutely massive. There, it will make sense to invest some time into automation specifically for that game.
Maybe it can't be reused anywhere else, but it will ensure that a player can play through the first level of the game, and maybe test a few other things: different features, the shops, making sure that IAPs can be bought, that ads are displayed. I think all of these things can be done, because we're ready to invest the time specifically in one game. I think these are the two things that we need to prioritize in the future, and they're going to come soon, for sure.
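As a sketch of the first avenue, the same suite of common-component checks could be mapped over every build that embeds them; everything below is invented for illustration, with runCommonChecks standing in for a helper that would open a device session (as in the nightly sketch earlier) and exercise Homa Login, the leaderboard, an IAP flow and an ad placement:

```typescript
// One shared suite, many games: possible because the common components
// look and behave the same in every title. All names are hypothetical.
interface GameBuild { apk: string; pkg: string }

// Hypothetical helper: would drive an Appium session and run the
// shared assertions against the given build.
async function runCommonChecks(build: GameBuild): Promise<void> {
  console.log(`running common-component checks against ${build.pkg}`);
  // ... device session + shared Login/leaderboard/IAP/ad assertions ...
}

const TOP_GAMES: GameBuild[] = [
  { apk: '/builds/game-a.apk', pkg: 'com.example.gamea' },
  { apk: '/builds/game-b.apk', pkg: 'com.example.gameb' },
];

(async () => {
  for (const build of TOP_GAMES) {
    await runCommonChecks(build); // identical checks, identical UI
  }
})();
```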
Eden: You had mentioned earlier that you've tried some of the tools, and there are also new tools coming out on the market. I don't know, is it actually possible to do some of the automation that you're looking for using Appium or, for example, I know React Native developers will sometimes use Detox as a testing framework? Is there something particular about the implementation of Unity that makes it harder for your team to do automation? I'm just curious why some of the solutions are constrained right now for your team.
Michael: I don't know the details of which tools we're trying out these days, because I think we're trying them out here and there with different QA teams, so I don't think I have any insights to provide there. But I would say that, if anything, Unity would probably provide a better base for testing, because at least it adds a layer of commonality between all our products. Hopefully, if there's a testing framework which works 100% on Unity, we'll be able to make full use of it, compared to maybe other gaming companies which have games on different engines, which actually makes things more difficult.
Eden: Yeah. I want to switch gears for a second, to talk a bit about... You had mentioned your company and your teams are impressively doing a lot of A/B testing and experimentation. I'm assuming part of the reason why this is possible is because a lot of these games are leveraging the Homa Lab platform and everything is integrated with each other. But I would love to hear a little bit more about that if you have some time.
Michael: Sure, yeah. A/B testing has always been one of those early Homa products that we decided to build internally, for various reasons. This goes hand in hand today with the fact that we have our own internal analytics as well. Over the years this allowed us to, first of all, perfect the product and really make it into something very promising today, and it's used by all the teams, whether it's the live ops team making incremental changes in a game, or even external studios trying different game configurations and really checking which version of the game can bring the most value to the players and revenue to the company.
Today it already allows us to do a lot, and this is really where we're at in this space: once we have enough users, going from the hundreds and scaling up the budgets so we can get to these thousands of users, then it becomes very interesting. You can really change anything.

When you have small numbers of users, you need to be able to make changes that have a big impact, so that they can be noticed.
For example, you can completely change the game: maybe completely change level one to make it incredibly difficult or incredibly easy, or change the camera angles, things that are very drastic and affect the player's perception of the game. Even with 100 users or a bit more, we have a statistical analysis which allows us to detect if one version of the game is better than another.
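For a sense of the statistics involved, here is a minimal two-proportion z-test, one textbook way to compare, say, day-1 retention between two variants; this is generic math with made-up numbers, not Homa's actual analysis:

```typescript
// z-test for the difference between two retention proportions.
// |z| > 1.96 corresponds roughly to significance at the 95% level.
function zTest(retainedA: number, nA: number, retainedB: number, nB: number): number {
  const pA = retainedA / nA;
  const pB = retainedB / nB;
  const pooled = (retainedA + retainedB) / (nA + nB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / nA + 1 / nB));
  return (pA - pB) / se;
}

// A drastic change (40% vs 58% day-1 retention) is already detectable
// with about 100 users per variant:
const z = zTest(40, 100, 58, 100);
console.log(z.toFixed(2), Math.abs(z) > 1.96 ? 'significant' : 'not significant');
```

This is also why, as he notes next, small incremental changes need far more users: the smaller the effect, the larger the sample required before the statistic clears the threshold.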
All of this is done from the comfort of our chairs: just by editing parameters and values and launching tests in Homa Lab, we can affect new users and even existing users playing our games, to give them different experiences. So at a low user scale, it's important to make huge changes. But of course, as the game grows, the changes you make become small increments, and while these changes maybe have less impact on the overall performance and the revenue generated, you're going to need a bit more users to detect the changes that matter.
But still, if you get enough users, and maybe one small change will need 10,000 users per variant or something like this, it will still allow us to make incremental changes and be 100% data driven in every single decision that we make, just to make the game better overall. So very small things here and there: modifying the colors of the game, modifying the frequency of ads, modifying the level difficulty, things like this. All of this can be completely dynamic, so even though two people have the same version of the game, the same game, they could have two completely different experiences of it, tailored to their own behavior in the game.
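Conceptually, that dynamic behavior means each client fetches its assigned parameters at startup, something like the sketch below; it is written in TypeScript for consistency with the other examples even though the games themselves are Unity, and the endpoint and field names are invented:

```typescript
// Hypothetical remote-config client: the A/B service assigns the player
// to a variant and returns that variant's parameter values.
interface GameConfig {
  variant: string;
  adFrequency: number;     // interstitials per session
  levelDifficulty: number; // multiplier applied to level tuning
}

const DEFAULTS: GameConfig = { variant: 'control', adFrequency: 2, levelDifficulty: 1 };

async function loadConfig(playerId: string): Promise<GameConfig> {
  try {
    const res = await fetch(`https://config.example.com/v1/params?player=${playerId}`);
    if (!res.ok) return DEFAULTS; // fail safe: never block the game on the test system
    return (await res.json()) as GameConfig;
  } catch {
    return DEFAULTS; // offline players get the control experience
  }
}
```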
Eden: That's very cool. I would imagine your team also has a pretty rigorous structure for managing and scheduling which groups of users get which variants. Can you maybe give me an example, if you don't mind, of an A/B test that surprised you? Maybe the result was unexpected or unintuitive, but it ended up driving some really interesting opportunities for a game or for the company?
Michael: There are always cases. I think it's always interesting to try and understand why, because sometimes, of course, the results are as expected, but sometimes you need to really understand why the results are the way they are. We've had a few cases in the past where balancing the frequency of ads against the retention of the users has been very interesting to look at. This has happened for every single game.
Managing this is interesting because, of course, if you show too many ads, the player is just going to leave, and if you show too few, the game is not going to make revenue, but maybe the user will stay a bit longer. I think we've had a lot of cases where that balance has been interesting.
Early on, I remember some ridiculous ones, like a test where we were expecting fantastic results from the change, but it was completely crashing the game. We lost a bit of money that week because we didn't react fast enough to turn off the test. But, yeah, there's definitely a few cases, and it's always interesting to investigate; you really have to go into the detail.
You have to take a look, okay, the big picture is revenue. Then you go into the details of, okay, how many rewarded videos are shown? What levels are the players stopping at? Are they stopping at level 15 or 20? Is there a difference of culture between different countries? Because maybe one level or character appeals more to some country or another. It's very diverse and requires a lot of time from live ops managers and data analysts to really understand these things.
Eden: That's really impressive, just to know that there are a lot of different roles and responsibilities involved, beyond just engineering, when you think about designing A/B tests, and the thought and intention that goes into it. I feel like we could spend a lot more time talking through this topic, I have so many other questions. Maybe that can honestly be its whole own other episode, but I know we're running out of time here. Really appreciate the time that you've given today. Any final perspective on either testing or software engineering that you'd like to share with our listeners?
Michael: Yeah. Automation is obviously the key. I think there are still some companies out there which are manually testing everything, with teams of hundreds of QA engineers, or QA testers, I should say. So I think the main learning of these past few years is that hiring competent QA engineers is definitely a huge plus for the company, so that you can scale, essentially.
Another one is that not only do you need to hire QA engineers, but you also need to make developers responsible for their own code. Devinder, our lead QA, always likes to say that his goal is to become completely irrelevant in six months. Even though that's not going to be the case, the goal is really to have as many of the developers as possible take ownership of technologies like Cypress and Cucumber, so that they can code the end-to-end tests themselves without relying on a third party or another team like QA.
So definitely, not only go for automation by hiring QA engineers, but also make sure that developers can take ownership of this technology as well. Those are really the key learnings at Homa over the past couple of years.
Eden: Yeah. I'm excited about the way that the industry, on both web and mobile, is going to continue to evolve over the next few years. One of my hopes, for at least the mobile side of the industry, is that we will get to a point where trying to automate for one of your games is as simple as writing a Cypress test. I think there's a lot of complexity in the tech stack that makes that harder, but I'm excited about the number of tools that are popping up, and the way that people perceive automated, manual and physical testing is changing.
That's been a big part of what we're trying to do at Mobot as well. But yeah, I'd love to, at some point, have another conversation with you about this because I think the industry is moving so fast, and especially as you were touching on GitHub Copilot, generative AI. I really think things are going to change a lot in this next year. It's going to be super cool.
Michael: Definitely. I think AI is going to change the game; that's also why I'm waiting a bit, I would say, for some of these tools to come out, so that we can change our way of doing things and go for as much automation as possible. So I'm very excited to see what's coming out, and we're exploring generative AI on quite a lot of different topics today. It's actually one of my main priorities for the next two quarters.
It's really interesting; we're also organizing an internal hackathon to see what the best ideas are that can come from the developers on the team regarding generative AI. So maybe there's going to be something related to testing; we'll see in a couple of weeks. Yeah, very curious to see what happens, and happy to hear any suggestions in the future. I will be listening to this podcast to learn a bit more.
Eden: For sure. Thank you so much, Michael, for joining me today. I really enjoyed our conversation, and I learned a lot about the incredible growth of your company and the infrastructure and tooling that you've put in place to support 80 games. It's not just 80 games individually being pushed to the App Store; there's a lot of thought, strategy and best practice that goes into deploying those day after day. So yeah, I really appreciate the conversation and look forward to keeping in touch.
Michael: Yeah. Same here, thanks a lot for having me.