How It's Tested | Ep. #4, Session-Based Testing with Jacob Stevens
Eden Full-Goh: Hey, Jacob. Thank you so much for joining me here on the How It's Tested podcast. How's it going?
Jacob Stevens: I'm doing great. It's great to be here. I really love this idea and I'm really glad to be part of it.
Eden: Yeah, I really love having conversations with testing professionals like you because you have a lot of great experience. I remember when I was looking at your profile earlier, just being really impressed by the diverse number of industries, tech stacks, testing tools that you've touched on in your career and so I'm really excited for a deep conversation today.
Jacob: Okay, cool. Well, thanks. I'm definitely flattered about that. It's a bit serendipitous, it's not necessarily by design. I think it does come with the territory a little bit, if you work for a managed service firm as I did, or maybe a consulting firm or some people actually prefer the life of a contractor and try to do things as they bounce around. And so it just grew organically, though, for me.
Eden: Yeah. Just for some of our audience members to get to know you a little bit better, I would love to hear in your words what your career and your trajectory in QA and testing has been like because you've been in the industry for 20+ years. You've been an individual contributor, you've been a manager on different teams, and even 10, 20 years ago, the way that the world saw products, the way we saw mobile versus web and testing is very different now than it was back then. I would love for you to give us a brief overview of where your career has taken you so far and what you're looking for next as well.
Jacob: Okay, yeah. Well, so I kind of fell into it. I got into video game testing because I actually had a background in audio engineering, and so I started testing audio-video codecs at Microsoft after testing video games and... I don't know, I guess I was a little bit aimless back then. But I certainly gravitated to the inquisitive nature of testing. It's all falsifying hypotheses and stuff, and I'm definitely one of those types of nerds.
If you don't mind a really brief story, actually after a few years of just doing that as a test engineer, I went and picked up some .NET certifications, thought maybe I'd move into application development as such. So I had some of the technical coding skills as well at that time, but I really found myself very discouraged. I was probably a four-to-five-year professional and really thinking, what is next for me?
I tell you, there was a day I just resolved, "You know, testing is this interesting thing, but I think it's not really the career, because there's so much expectation that I'll move on into development." I really didn't have that same focus. Testing is an orthogonal skillset, I think, even though it's also a very common career stepping stone for many people, and that's perfectly appropriate.
It just wasn't for me, so I had this darkest hour moment. I was unemployed, and I thought, "I'm going to have to figure out my next step, but I have to continue to bring in income, so let me just send out my resume one more time." I saw a job, applied for it, and the dude calls me within four minutes. He turned into my mentor; his name is Jon Bach.
His brother, James, is a fairly well known entity in the testing world, and together they developed the session-based test management methodology, and the whole profession just came alive for me at that time. So I guess maybe the takeaway from that for people is, if you feel like you're not progressing as rapidly as you would like to, or there are some gaps that are impeding your ability to find places where you're going to fit really well, it's one of those don't-give-up things, because it definitely was a pivotal moment for, obviously, my life and not just my testing career.
Eden: Yeah. For folks who aren't familiar, what is session based testing as a methodology?
Unpacking Session-based Testing
Jacob: Excellent. So really the thing I love about session based is it just amplifies the part that I would say matters the most which is really bug hunting. There's a lot of things that we do as testers, as product evaluators, as just quality assurance double checkers. There's a quality gating process that goes on at some organizations and you may be a part of that. The planning effort and everything can be very, very robust. I mean, if you're testing a medical device, you can't kill anybody, you've got some legal liability, so you have to be very robust and very, very thorough and very forward thinking.
Well, the idea behind sessions is that it's really difficult to be sufficiently thorough ahead of time before you actually have the product in your hands. It's kind of a way to do rapid testing, rapid results, and then it's time boxed.
Right? So that's why it's session-based test management. The idea is you might just do a one hour session, maybe even a 30 minute session, or 90 minutes, or even 180, and that time box is really the essence of session-based testing. So rather than having a test case, you would actually have a charter. I have not used Mobot as a product yet, but let's just say, for example, the charter idea is: "I am going to itemize all of the different user types that have been identified, and for each one I'm going to assume the role of that user, and I'm going to make sure that all authentication is fine. I'm going to sign up and register, I'm going to sign in, sign out, sign in from multiple devices, I'm going to let my authentication time out, forget my password."
Things of that nature. "I'm going to stop paying, and then see what happens." Then you just go and do it. It's really unguided testing, but I don't want that to sound as undisciplined as it might to some people. It's really just about the fact that even at a very robust, mature organization, just running through a couple hundred test cases, the reality is you've probably performed checks or evaluations or tests of the product numbering into the thousands by the time you've run through 200.
So it's really a way to have a lightweight footprint of your labor from day to day, where you're not spending so much time doing all this pre-planning; you're going to do it on the fly. Of course that can meander too far, so we time box it, and then there's a debriefing, usually with a lead or a product owner or something, where you identify some additional charters and get those going. It's like a very formalized structure around exploratory testing.
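To make the charter idea concrete, here is a minimal sketch, not from the episode and with illustrative field names only, of what a session record in session-based test management might capture: the charter, the time box, and the notes and bugs that feed the debrief.

```typescript
// Hypothetical sketch of a session-based test management (SBTM) session record.
// Field names are illustrative, not taken from any specific tool.
interface SessionCharter {
  mission: string;   // e.g. "Explore authentication for each identified user type"
  areas: string[];   // product areas the charter targets
}

interface TestSession {
  charter: SessionCharter;
  tester: string;
  timeBoxMinutes: 30 | 60 | 90 | 120 | 180; // sessions are explicitly time-boxed
  notes: string[];            // free-form observations recorded during the session
  bugs: string[];             // issues found, to be triaged at the debrief
  followUpCharters: string[]; // new charters identified during the debrief
}

const authSession: TestSession = {
  charter: {
    mission:
      "Assume each user role; exercise sign-up, sign-in/out, multi-device sign-in, timeout, forgot password",
    areas: ["authentication", "account lifecycle"],
  },
  tester: "exploratory-tester-1",
  timeBoxMinutes: 90,
  notes: [],
  bugs: [],
  followUpCharters: [],
};
```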
Eden: Yeah. It's interesting because we encounter that question a lot at Mobot as well, in my day to day talking to our customers. It's like, "When is a good time to do regression test cases?" Where you have these very regimented, very clearly defined checklists that you're running through. But then when is it a good time to do what you're describing, exploratory testing and session-based testing, where you're really empathizing with the perspective of the user and really thinking about what their interaction and their user experience is like going through a particular product? There is definitely, I think, a time and place for both types of testing. I'm curious, in your experience, where do the regression test cases fit, banging on the same sign in, sign out, authentication flow, going through the same thing over and over again? When is a good time to use traditional regression testing, and when do you normally deploy more session-based testing strategies?
Regression Testing vs Session-based Testing
Jacob: Great question. I think of course it sounds like a cop out answer to say, "Well, it depends," and cite the context. But I think two of the parameters that really help to drive that a lot are going to be the maturity of the product, but then also maybe the number one thing really is just going to be the release cycle. I suppose that lends towards your methodology, Agile versus Waterfall, things of that nature.
But if you're really adept at Agile or SAFe then, if you're a web application, you might be putting something out potentially every single week. Places like Amazon can release on demand because they've got the microservice architecture. There's all these individualized components that go through their whole vetting process and deployment quality process, and once something is ready they can even be releasing multiple times a day, if you're Facebook or Amazon.
So that's going to have a lot to do with it, and so as a manager I have always tied it into the whole deployment process, if that makes any sense. From there, if we want to get more specific then it's what kind of details? You had mentioned some interest about the mobile application work that I was most recently doing as a Senior Director of Mobile Application Development. So if you don't mind, I can use that as a contrasting example.
Eden: Yeah, I would love to.
Jacob: Okay, good. So it was mobile application development, but what makes it a little bit unique is it was a white label program. So this was for Springbig last year, and Springbig is a marketing automation and kind of a CRM platform for consumers, based on loyalty management where they earn points and then they get rewards for being loyal to your retail outlet.
We just happened to specialize in cannabis for a few reasons that are not important here. So the thing about having a white label program is ostensibly the codebase is not going to change nearly as much and therefore you've got to figure that the risk is not going to be quite as palpable and so you're going to have some options on there.
So the question about regression in that sense was a lot of people will just stick to what's the core functionality, what are some of the most important defects that we cannot allow to regress, which is where the term comes from, right? And things of that nature. So we had that, and what I recognized is when we're doing that over and over and over again and the codebase is not changing very much, we're really not doing very much to mitigate risk at all and it's kind of a reasonably heavy effort, kind of high cost to do that repeatedly.
Even when we augmented our manual testers with tools such as Postman and the Collection Runner, where you can hit a whole bunch of APIs and make sure that all the returned values are correct and all the expected failures are correct, it's still a heavy lift, and I don't like that those things would get stale. So what I did is look through our TCM and sort by the most recent result, and we had about 14,000 test cases in the TCM.
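The team ran those API checks through Postman's Collection Runner; as a rough standalone equivalent, a check like the following (the endpoint, payload shape, and expected failure are all hypothetical) asserts both the happy path and an expected rejection.

```typescript
// Rough TypeScript equivalent of the kind of check a Postman collection encodes.
// The /loyalty/points endpoint and response shape are invented for illustration.
async function checkLoyaltyPointsApi(baseUrl: string, token: string): Promise<void> {
  // Happy path: a valid token should return the member's point balance.
  const ok = await fetch(`${baseUrl}/loyalty/points`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (ok.status !== 200) throw new Error(`expected 200, got ${ok.status}`);
  const body = (await ok.json()) as { points?: number };
  if (typeof body.points !== "number") throw new Error("missing numeric 'points' field");

  // Expected failure: a request without a token should be rejected, not silently accepted.
  const unauthorized = await fetch(`${baseUrl}/loyalty/points`);
  if (unauthorized.status !== 401) throw new Error(`expected 401, got ${unauthorized.status}`);
}
```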
The reality is we'll be testing new features, there's a lot of exploratory testing going on there, those things will get hit. But with regression just being so stale and just recursing through the same suite, you know, those 14,000... It's almost useless, we may as well throw it away if we're not going to hit it at least once a year or something. I recognized the risk was still there, but what do we do?
So I came up with three categories for our regression cycle. We'd have the core, and that's going to get hit every single time. Then we would have release specific cases and, oh, by the way, this was somewhat holistic as an approach because we had a web platform that was serving the mobile apps, and we also had web clients as well.
So architecturally, nowadays a lot of frontend frameworks are going to be, say like with Flutter, yeah, you're going to write a bunch of mobile applications but you can have the web app, the web client as well. So that's what we had, so you've got a core suite, then you've got something release specific dependent on what's going out the door on the web platform or the mobile side if there's some new feature or maybe the microservice architecture that's supporting it all.
Then, like I said, those old cases with the oldest last run. It was a little bit too much to do everything, so what I did is I had two cycles. We'd have the core suite and we'd do that for a month or two, then we'd have just a quick meeting and go through the things in the middle that are still important: it's a good regression case, but we're not hitting it enough, so let's cycle those through.
So we tried to refresh a third of the test cases every couple of months. Then the same thing with the oldest ones. It's like, okay, what are the oldest 50 that no one has ever touched? If they look like they're no longer valid, test them and confirm that, then delete or deprecate the test case. Then the next time we have a new 50, because those are no longer at the bottom. Anyway, hopefully that wasn't too convoluted or hard to follow for the audience, but that's what we did.
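A minimal sketch of the mechanical part of that refresh idea, assuming a hypothetical test case shape rather than any specific TCM's data model: sort the backlog by last-run date and pull the stalest 50 into the next cycle.

```typescript
// Illustrative sketch: cycle the stalest test cases back into the regression suite.
// The TestCase shape and the batch size of 50 mirror the approach described above,
// but none of this reflects a real TCM's data model.
interface TestCase {
  id: string;
  title: string;
  category: "core" | "release-specific" | "backlog";
  lastRun: Date | null; // null means it has never been executed
}

function nextRefreshBatch(cases: TestCase[], batchSize = 50): TestCase[] {
  return cases
    .filter((c) => c.category === "backlog")
    .sort((a, b) => {
      // Never-run cases are the stalest of all, so they sort first.
      const aTime = a.lastRun ? a.lastRun.getTime() : -Infinity;
      const bTime = b.lastRun ? b.lastRun.getTime() : -Infinity;
      return aTime - bTime;
    })
    .slice(0, batchSize);
}
```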
Eden: I think that's a really healthy and strategic approach to testing. This is something that we encounter a lot at Mobot as well. Often when we engage with engineering teams or QA or product teams, they're just like, "Yeah." Like you were saying, "We've got 1,400 test cases." And it's not worth it unless you test everything.
But then when you actually drill into, intellectually, what are the features that make the most money, that generate the most revenue or result in the highest user conversions, or the most highly trafficked parts of your application, or the most business critical, the most regulated part if it's a financial services application, or a digital health or a medical device like you were saying.
I think you really have to take a thoughtful approach to QA, of course if everyone had all the time and money in the world we would test everything all the time. You would get a robot to test everything all the time. But I think that's actually a really thoughtful point, is you have to think through why are you testing this thing? Why does this particular use case, this test case, matter?
That's also where human testing and manual testing continues to play a role, because even with all the regression test cases in the world, you're not going to be able to cover all of the scenarios that a human would do. There's just too much creativity and ingenuity in that part of the exploratory session based testing process that we should continue to capture and a good, healthy QA team is going to use a combination of automated and manual testing tools.
Jacob: Yeah. No I totally agree. I think, Eden, obviously you have to be very familiar with the fact that there's constantly this question of how much test coverage is sufficient? It's very difficult to quantify it, but I think maybe the biggest complication is outside of us, nobody else cares. They just want to know when is it done? Call it done. "It's your responsibility, if you miss something it's on you so you tell me if you're done. If you're not done, we'll give you more time."
But then the clock is ticking and they're getting impatient. And so you're managing that, managing the expectations, and for anyone who wants to understand why that needs to be substantiated, it's good to evangelize that out so that people understand. But for the most part, they don't, and so whether it's an individual exploratory test session or an individual test case or just managing it at the managerial level, it's just this thing that we constantly have to deal with.
Eden: Yeah. I guess in your experience, given all of the different mobile apps that you were helping to oversee at Springbig and making sure everything shipped on time, I'm curious, all of these different white label mobile apps, were they all on the same release schedule? Did you test all of them, every time one went out the door? Or how does that work?
Customer Success Management
Jacob: No, they were not. Okay, so it's a bit interesting. Here we get into some of the concepts of customer success management, I think. Actually, if you don't mind, I think that is one of the brilliant parts of Mobot's model, that it's not just a bunch of robots doing automated testing for you; we, as a client of Mobot, would get a CSM. So I think that's fantastic because as my career has grown and progressed, I've interfaced with whoever the customer proxies are more and more and more.
Which, I think as a QA guy, as a tester, is certainly very healthy, all the way down to every single individual that might be assigned to a team. But certainly at the management level, I just think it's absolutely imperative. So with Springbig, we had CSMs for our clients and then we had product owners in a traditional, Agile, quasi-Scrum kind of situation, so who's the one that makes the product decisions?
Well, that's the product owner or the product manager, whatever you want to call that individual. They make those decisions and, as long as they sign off, basically it's like a multi-party checklist. Test signs off, product signs off, and then the deployment team signs off because we verified that the PRs have been merged without conflicts, all of the unit tests are still green, all the tests in the pipeline are still green, and the extended end-to-end tests in the pipeline are green.
There's a number of other things. Did we change the database schema? Do we need to notify anybody? Is this safe to deploy? We would deploy in the morning when hardly anybody was using the product, but there were some items that were just that much more risky, and we'd actually do those at like 2:00 or 3:00 in the morning with people on call, prepared to roll back as they monitored things through their AWS dashboards and all those kinds of things.
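As a sketch only, with none of the specifics taken from Springbig's actual pipeline, that multi-party sign-off can be modeled as a simple checklist that must be fully satisfied before a deploy proceeds.

```typescript
// Hypothetical model of the multi-party release sign-off described above.
// Field names are illustrative, not a real deployment system's schema.
interface ReleaseChecklist {
  testSignOff: boolean;
  productSignOff: boolean;
  prsMergedWithoutConflicts: boolean;
  unitTestsGreen: boolean;
  pipelineEndToEndTestsGreen: boolean;
  databaseSchemaChanged: boolean;
  schemaChangeNotificationsSent: boolean; // only relevant when the schema changed
  highRisk: boolean; // high-risk items go out in the early-morning window with on-call cover
}

function readyToDeploy(c: ReleaseChecklist): boolean {
  // Schema changes are acceptable only once the affected parties have been notified.
  const schemaOk = !c.databaseSchemaChanged || c.schemaChangeNotificationsSent;
  return (
    c.testSignOff &&
    c.productSignOff &&
    c.prsMergedWithoutConflicts &&
    c.unitTestsGreen &&
    c.pipelineEndToEndTestsGreen &&
    schemaOk
  );
}
```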
Okay. So it ended up being driven by CSMs more than a product owner. What we would do, we ultimately had a really big backlog of excited customers that were waiting for their own individual branded app. These were cannabis retailers. I think unfortunately before I got involved, we as an organization began to sell it before it was really ready. But there was a lot of excitement.
If I had to do things over again, I think maybe the biggest mistake I probably ever made was not running it up the flagpole and saying, "Look, can we find a way to compensate our customers and let them know. Let's unrelease this, let's give ourselves a month or three to really get on solid footing." But otherwise we were just behind the 8 ball on a constant basis, which is no fun for anybody but also strikingly common.
We're all used to it, so we've all run that way. We continually added features and rearchitected and refactored and extended the codebase that would build a white label program, and then we would have their individual assets for the branding. But those were almost just individual mini projects where there was, yeah, less actual work to go into it, less risk that's legitimate, but it was still going to be a heavy lift because there's going to be dozens of them and we had to manage this divergence of expectation and needs.
It became very important to have a very intimate relationship with all the CSMs, because if we allowed every customer to take ownership of their app, which is fairly reasonable for them to expect, then we would be doing ourselves an immense disservice because now it's just going to be divergent features, divergent versioning, and it would be really bad.
So anyway, I'm trying to pull that back into what that meant for quality, but really that was why we had to get so nuanced with our regression cycles, because we had one to three that were about to go out the door. Really if they go out the door, that's more controlled by Apple and Android than us. From there, it ended up being more like a traditional release process of going through everything. Anyway, I hope that answers the question.
Eden: I see. So even though there were over 100 white label apps in the portfolio, you were covering two or three major ones every single sprint and trying to make sure that those apps were ready for release, is what you're saying?
Jacob: Yeah, yeah. Exactly, that's right. So we had some versioning of the white label architecture so once those were out the door and customers were using them and we knew that we had to continue to support it in a way at least where we weren't going to be breaking it, as we were making changes to the microservices and the web applications and stuff. That was a whole other piece of that multidimensional regression sweep that we had to do on a regular basis.
Eden: What was the ratio of automation to manual testing for these white label apps that you were supporting? I think I saw that you had mentioned the apps were React Native, so does that mean you were using a test framework like Detox, or have you played with a bunch of different ones?
Automation vs. Manual Testing
Jacob: Let's see, we had Appium. But honestly, our automation was way more robust on the web application side with Cypress, and so that was probably going to be another step. I'm no longer there with Springbig, we had to restructure and the economy is down for everyone. It is what it is. Yeah, so I've used Appium before at a couple of places, and I like to pair it with TestNG because of the test parameterization capabilities that TestNG has.
So I had similar designs here at Springbig for our regression suite. But really, it was just very rudimentary, just sign on, do a couple things and make sure a couple of pages were there. It was just at the beginning stages when I took off, so I'm not sure where they're going to take it from here.
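Jacob's stack paired Appium with TestNG; purely as an illustration of that "sign on, do a couple of things, make sure a couple of pages are there" level of coverage, here is what such a smoke test might look like in Detox, the React Native framework Eden mentioned, with hypothetical test IDs.

```typescript
// Illustrative Detox smoke test (Jest runner); the app flow and all testIDs are hypothetical.
import { device, element, by, expect } from "detox";

describe("sign-in smoke test", () => {
  beforeAll(async () => {
    await device.launchApp({ newInstance: true });
  });

  it("signs in and lands on the rewards screen", async () => {
    await element(by.id("email-input")).typeText("member@example.com");
    await element(by.id("password-input")).typeText("correct horse battery staple");
    await element(by.id("sign-in-button")).tap();

    // "Make sure a couple of pages are there": the home and rewards screens render.
    await expect(element(by.id("home-screen"))).toBeVisible();
    await element(by.id("rewards-tab")).tap();
    await expect(element(by.id("rewards-screen"))).toBeVisible();
  });
});
```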
Eden: That's a pretty normal experience, that at least we've seen when we talk to different engineering teams and testing teams that we encounter. It's that there is an intention to build out test automation. There are these great frameworks out there, but the reality is that the tech stack is complicated, it's not as straightforward as with web apps, and so there is a manual testing effort that's supplementing or just doing most of the testing until the automation is up and running.
I guess I'm curious from your perspective, since throughout your career you've built and automated testing for web and mobile apps, what is it in particular that makes mobile so much more challenging than web automation?
Jacob: Good question. I guess the first thing off the top of my head is the reality that most of your scenarios have these valid permutations which are pretty frequently ignored. I know that because, looking through the spec sheet and stuff on Mobot, you guys are pretty well built for this stuff. So push notifications popping up, app switching, taking the app to the background and restoring it. You're dealing with not only two platforms, which, to be honest, is not a major problem.
But you're just dealing with, at a scenario level, every scenario is a user on a phone. So everyone is just so adept at their own way of interacting with their phone, right? And some people might be heavy users of Siri and voice commands and things of that nature. You can look at the core functionality, you can look at the core use cases of an app and its features and how well it behaves. But it's just not able to be as isolated as a web app or a desktop app.
With a desktop app and stuff, it's like, okay, the operating system has multithreading, you can task switch back and forth and if anything goes really weird there, for the most part, I think there's a reasonable presumption that it's not our fault, it's not our problem. Let Microsoft and Apple worry about those things. But I think we've all seen mobile apps get really weird in those scenarios and so I think that you can't shed the responsibility the same way on a mobile platform.
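One of those interrupt-style permutations, backgrounding the app mid-flow and restoring it, can be sketched in the same Detox style; again, the cart scenario and the test IDs are hypothetical.

```typescript
// Illustrative sketch of an "interrupt" permutation: background the app mid-flow
// and make sure state survives when it comes back. testIDs are hypothetical.
import { device, element, by, expect } from "detox";

it("keeps the cart intact across backgrounding", async () => {
  await element(by.id("add-to-cart-button")).tap();

  await device.sendToHome();                       // simulate the user switching away
  await device.launchApp({ newInstance: false });  // return to the existing instance

  await expect(element(by.id("cart-badge"))).toHaveText("1");
});
```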
Eden: For sure. So it sounds like what has ended up happening is when you were managing teams of people on your testing team to work against some of these challenges, you had to pretty much build, it sounds like, a very robust session based or a manual testing strategy to account for all these different scenarios. How did you organize your teams effectively to make sure that every tester, every analyst was marching along a coherent QA strategy? Whether it was in alignment with the web product or the mobile product? How do you actually resource a testing team that way?
Jacob: Yeah, good question. So on the mobile side of things, really a lot of the exploratory work would be done on new feature testing. I've definitely felt the strain of that logistically for a white label program. It's like, "Oh, we've got dozens and dozens, upward of 100 apps to support. It's only going to continue to grow."
And so if you're testing a new feature and to what extent are you actually going to give really strong regression coverage in an exploratory fashion over the core features and structure of the app that we already have? So that was really a pain spot for me, but I think that's the nature of the whole beast of being a white label program. You're just going to have to tackle that.
To be honest, I never really tackled that as well as I wanted to. But then on the full stack web app side of things, as I mentioned, that was definitely a long term, multiyear evolution to get to that level of regression nuance with multiple cycles inside of multiple categories. We would also cycle the team for who would do regression testing, because it can still be a heavy lift and we had a dozen people, so it's like, "Hey, can we interrupt our teams less and help them to stay focused on what they're assigned to do, what they love to do?"
Most people want to look at the new stuff; regression testing can get pretty boring, and keeping your vigilance up on it is definitely a challenge. So we would just build up a schedule of four to six weeks and rotate through people, so basically about once a month they're going to be on the horn to plug half a day in on regression so we can sign off, deploy, and then they're back to work as usual.
Eden: Do you actually think it's going to be possible to get to a point where mobile automation testing frameworks are going to be as effective and efficient to use as what you were using with Cypress for web automation? Do you think we're going to get to that point with mobile?
Will Mobile Testing Catch up with Web Testing?
Jacob: In a way I feel like we have, but in a way I feel like it's not sufficient and it probably never will be. Again, I have to say I think that's one of the strong advantages that you're bringing with Mobot with your own driver, but then having that full stack capability which is the fullest of stacks, right? Because it's all the way through the touch kit, so you can't do any better than that.
I guess one thing that we really haven't touched on that's unique in the mobile world is the tiers of devices. Device compatibility, OS version, compatibility matrix is a beast compared to the web world, compared to... Yeah, I was doing desktop testing and then web 1.0 testing as you mentioned, ages ago, decades ago. But in the mobile world, it's like, okay, a lot of people are going to support tier one and tier two of devices.
Honestly, I would usually just let BrowserStack define that for me. I'd say, "Okay, guys. We're going to do tier one, 50% of tier two, and cycle through them, alternate back and forth to broaden your coverage." But very quickly, if you go any deeper than that, the cost and time just become absolutely prohibitive; hardly anybody does it unless you're huge like Facebook or Twitter or something. But the risk just grows and grows and grows as well.
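As an illustration of that tiering policy, with hypothetical device names and the alternating 50% sampling of tier two, the coverage rule is easy to express as data.

```typescript
// Hypothetical device-tier policy: run every tier-one device each cycle,
// and alternate through half of tier two to broaden coverage over time.
// The device list is invented for illustration, not BrowserStack's actual tiers.
interface DeviceTarget {
  name: string;
  osVersion: string;
}

const tierOne: DeviceTarget[] = [
  { name: "iPhone 14", osVersion: "iOS 16" },
  { name: "Pixel 7", osVersion: "Android 13" },
];

const tierTwo: DeviceTarget[] = [
  { name: "iPhone SE (2nd gen)", osVersion: "iOS 15" },
  { name: "Galaxy A53", osVersion: "Android 12" },
  { name: "Pixel 5", osVersion: "Android 12" },
  { name: "iPhone 11", osVersion: "iOS 15" },
];

// Alternate halves of tier two on odd/even regression cycles.
function devicesForCycle(cycle: number): DeviceTarget[] {
  const half = tierTwo.filter((_, i) => i % 2 === cycle % 2);
  return [...tierOne, ...half];
}
```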
So I think the most practical reality is the vast majority of product owners and organizations out there just crowd source to their users. Very few people actually care about device specific bugs until someone logs a ticket, like, "Please, I can't get the damn thing to work. Can you do something about it?" I mean, if they're a paying customer, boom, you got CSMs or whatever and that's going to get rectified.
I really feel like the whole industry has just not sufficiently solved that because the deeper you go, the more expensive it gets and it's just how much discipline do you have to actually chase that level of quality? But I love what you guys are offering in Mobot to be able to tackle that because it's really untapped everywhere else, I think.
Eden: Yeah. I think historically it has been so challenging to make sure you corral all of the right devices that match your demographic, your users, your customers and how they're interacting with your mobile product. I think part of our thesis at Mobot is we think this is a problem that's going to get more exciting, more painful to try to wrangle.
But I think that's also what's exciting: it's easier than ever for someone to build their own OS, for someone to build their own mobile phone. Right now it's iOS and Android, but 10 years ago Windows Mobile and BlackBerry were all the rage. I kind of wonder, what are the next five, ten years going to look like? I think there's going to be more opportunity to test.
Jacob: Right, who knows? Absolutely, that's such a great point. I mean, a lot of people are going to presume that, "oh yeah, over time things are going to solidify and it's going to get more stable. It's going to get a little bit easier." Those are some of the ideas with, yes, React Native is taking over the world, React has taken over the world, JavaScript has taken over the world, and that's fantastic.
But those are also a testimony to the fact that every couple of years there's some new hotness, some new JavaScript framework that comes up and just takes over everything. I love that you can write once and deploy to two platforms with React Native. That's great. You're also compromising on a number of other fronts, because if you're going to have TypeScript that compiles into React Native, which actually compiles into C, the security is just not there.
Well, it's not robust. It's still very, very young. It's great to expect that a roadmap is going to solidify, but then look at what everyone's experience has been with USB. Now people are pretty much used to USB-C, but of course there were like a million products that were manufactured with those USB-A plugs, just assuming that nothing was ever going to change, and Apple will force people's hands on all sorts of new hardware standards.
Lots of people are like, "No. Okay, I understand it's better, but it's kind of like this problem..." Then, like you're saying, before you know it, those old standards are just gone, so really, none of us can ever rest on our laurels. We've got to stay on our toes the whole time.
Eden: I guess the final question from me is, what do you think a testing professional, an engineering professional, can do to stay ahead of all of these evolving technologies? These frameworks, the different automation tools coming out all the time. In your experience, what would you recommend for our audience to stay agile?
Staying Ahead of Evolving Technology
Jacob: That is a great question. I wish I had a really good answer, because I can't think of anything better than just do your best to stay abreast. But also, probably know where you're going or where you want to go. I think there's probably some room for some of us to be jack-of-all-trades generalists, but over time... For example, the cloud. I was in Seattle for 24 years, and I saw the cloud mature, but I was doing mobile the whole time.
So I am not a cloud expert. I don't even know if I'm a mobile expert, but technology is just too big to take on everything and to be an expert and just to have the bleeding edge skills and to do the things that are the most in demand. So I think chances are, as you progress, you're going to find your niche. It might be web, it might be cloud, it might be mobile, it might be whatever, might be games.
They each come with their own little dynamics and stuff, and so that's going to continue to change. But at least it's manageable, staying up to date is manageable. But if you're resisting that, I don't know, I guess I might ask why. If you're looking for a job and like, "Oh man, they always need all these cloud technologies." But then you're looking for another job and, "Oh man, they need to have some strong mobile experience." Well, I mean that's just what people do, specialization. It ain't just for the insects, I think it's for us as well.
Eden: Yeah. I'm excited. I think in the next few years, the way we think about test case management, test case design, automation, physical testing, and when humans should play a role in the testing process is going to change. I think there are so many great tools and solutions coming onto the market as the different OS versions and Apple and Android are evolving too.
I'm just along for the ride and really excited to see what comes next, because I think more than ever before there are new ideas and a lot of fresh innovation coming out. For a while in the last decade, there wasn't a lot going on in the testing space. I think there were a couple of big players in testing solutions, but now I'm hearing about new opportunities and new ideas all the time. Especially with what's been coming out with GPT, I'm curious to see how that will impact our industry and make both automated and manual testing easier.
Jacob: Yeah. Same here. I love it.
Eden: Well, thank you so much, Jacob, for taking the time to join me on our podcast. I really enjoyed our conversation and learning more about your team and the processes that you built at Springbig and throughout your career. You've had a long and very fruitful career, and I'm sure there's much more to come. But yeah, it was really appreciated, you sharing your expertise with us.
Jacob: Thank you so much. The pleasure is all mine.