Creativity, Music and AI: Bryan Jones and the complexities of testing AI systems
Bryan Jones
2i
This episode was recorded LIVE at the EuroSTAR Conference in Stockholm.
Thank you to our Community Hosts for Season One, Russell Craxford from the UK Department of Work and Pensions, and Gek Yeo of AIPCA.
In this episode you will hear from Russell Craxford from the UK Department of Work and Pensions with guest Bryan Jones, a passionate QA professional of 35 years and people-centred Quality Leader working for 2i as a QAT Architect, as well as a podcast host and popular conference presenter. Bryan Jones talks about music and AI, the complexities of testing AI systems, the new jobs it is opening up, and the need for new approaches and skills to handle its probabilistic nature.
We hope you enjoy.
Episode 3 Transcript – Bryan Jones and the complexities of testing AI systems
Joseph: Welcome to the Eurostar Community Podcast. For over 30 years, we have been bringing the global testing community together to share knowledge, connect, and grow. Check out eurostarconferences.com for our in-person conferences and access to online testing resources. Thank you for listening to this podcast.
Our first season was recorded live at the Eurostar conference in Stockholm.
In this episode, you will hear from Russell Craxford from the UK Department of Work and Pensions, with guest Bryan Jones, passionate QA professional and people-centred quality leader working for 2i as QAT Architect, as well as podcast host and popular conference presenter. Bryan Jones talks about the complexities of testing AI systems and the need for new approaches and skills to handle their probabilistic nature.
Also, he talks about music. We hope you enjoy.
Russell Craxford: Hello and welcome to another episode of the Eurostar podcast. I’m your host today. I’m Russell Craxford. With me, I have a special guest.
Bryan Jones: Hi, I’m Bryan Jones. I’m currently the QAT Architect for 2i, a testing consultancy in Scotland. I’ve been in testing for about 35 years and I’ve done every possible job you can imagine in testing, from the lowest of the low to running a test practice as a director of test practice for Sopra Steria.
Russell Craxford: Brilliant. Thank you for joining us Bryan. What do you think the future of testing is looking like?
Where do you think we’re going in the years ahead?
Bryan Jones: That’s a good question. There are several directions you can go in with this one. I think we’ve been talking about the same things for the last 20 years, so I think we’re probably going to carry on talking about the same things for the next 20 years. A lot of that is around things like: how do we test better, how do we shift left and do more earlier, how do we shift right and start absorbing more from the opposite side of things.
These are all conversations we’ve been having for a long time. Yep. And I don’t think that’s going to change.
Russell Craxford: You don’t think we’ll solve testing?
Bryan Jones: No.
Russell Craxford: Fair enough.
Bryan Jones: Definitely not. And we also talk a lot about how do we get a seat at the top table? How do we talk to the C-level executives? How do we get them to listen and to invest in us?
Russell Craxford: And to actually value quality. Okay.
Bryan Jones: They value their own values, but they don’t realise that quality is what enables that value. Yep. I think those are going to be conversations that we’re still having.
Russell Craxford: Makes sense.
Bryan Jones: In 20 years’ time. On the flip side of that, technology is changing. Yep. So there are going to be new tools, there are going to be new approaches, and of course everybody’s talking about AI.
Russell Craxford: I’ve heard of it somewhere, sometime or other, yeah, this is a topic that’s been on the radar I think, yeah.
Bryan Jones: Yes, I spent an hour yesterday talking about that on the stage at EuroSTAR. Ah, yes.
Russell Craxford: Well, tell us a little bit about your talk then. What were the key messages, the takeaways perhaps, or what was the kind of purpose? And I won’t ask you to repeat the whole talk, don’t worry.
Bryan Jones: It’s basically around giving a framework for how to approach testing AI-based systems.
Russell Craxford: Okay.
Bryan Jones: So at the moment most conversations are around generative AI and large language models and how to use those in testing.
Russell Craxford: Got you, so Copilot and using them to assist your testing, that type of thing.
Bryan Jones: Yes, or embedded in tools; most of the tool vendors here have all got AI features.
Russell Craxford: Self healing tests or analytics of tests and things, yeah.
Bryan Jones: Exactly, so I’ll leave that to more erudite folks and to the tool vendors to talk about.
Russell Craxford: Fair.
Bryan Jones: I’m more interested in this: how do we actually test those AI-based systems themselves? How do we prove a degree of trust in them?
Russell Craxford: Okay, yep.
Bryan Jones: There’s a whole load of techniques you can use, and a whole load of challenges around AI, so I go through all of those and talk about the techniques you can use.
And the test levels themselves don’t actually look that different. But we do have to have a different set of skills and a different way of thinking about them.
Russell Craxford: Yeah, that’s it. I’m not an expert in the AI space, I have to admit, but a lot of it in my head so far, from what I’ve seen, is shifting away from binary true-or-false logic, where you expect a line of code to follow a truth statement, an outcome, an exact thing. A lot of the AI things are more statistical, probability-type things you’ve got to learn about and understand. Certainly in the places I’ve seen it, you’re working in much more of a data science kind of knowledge space, not necessarily an exact space: understanding data, statistical analysis, all those worlds, which I find really fascinating.
I love data. I’m a bit weird, but aren’t all testers? Many of us are weird. So I guess AI is one of those things, and we’re seeing more people putting AI into their solutions and their products. So as testers we’ve got to start learning. As you said, the levels of it are pretty similar, probably, but the models we use, the techniques we use, will be an evolution of the current ones.
Bryan Jones: Yes, I mean I’ve presented this talk a few times, and each time I’ve rewritten it before presenting it the next time, because things have moved on. There are so many academic papers coming out on statistical analysis and data techniques, and also on the AI techniques themselves, so you need to keep up to date with that.
Yeah. To actually understand how things are changing and to stay at the forefront of it, which isn’t easy. There’s a lot of it, an awful lot of it.
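[Editor’s note: the shift Russell and Bryan describe, from exact assertions to statistical ones, can be sketched in a few lines of Python. The `classify` function here is a hypothetical stand-in for a probabilistic model, not a tool either speaker mentions.]

```python
import random

random.seed(0)  # fixed seed so this sketch is repeatable

def classify(text):
    """Hypothetical stand-in for a probabilistic classifier:
    right roughly 90% of the time, like a real model might be."""
    return "spam" if random.random() < 0.9 else "ham"

# Deterministic style: assert one exact output -- brittle for AI systems.
#   assert classify("buy now!!!") == "spam"

# Statistical style: run many trials and assert an accuracy threshold.
trials = 1000
correct = sum(classify("buy now!!!") == "spam" for _ in range(trials))
accuracy = correct / trials
assert accuracy >= 0.85, f"accuracy {accuracy:.2f} below threshold"
```

Instead of demanding a single exact answer, the test runs many trials and asserts a threshold, which is the probabilistic mindset the conversation is pointing at.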
Russell Craxford: I mean, we’re in, I guess, a technology discovery phase, aren’t we still, for AI? We’re trying to work out the patterns that are most effective in the long run.
We’re experimenting a bit at the moment. We’re finding what works, what doesn’t work. We’ve got probably a bit of blind trust with it. I think there’s a little too much blind trust at the moment.
Bryan Jones: Yeah. I’m a sceptic at heart, as a tester. You’ve seen, I guess, examples of AI tools from the big vendors that have gone into production and been pulled due to issues around bias, or self-learning models not having the right constraints.
Russell Craxford: If our big companies out there are still learning, then obviously you’d expect the smaller companies to be too, because they don’t have the bandwidth, the money, the knowledge base to draw upon; they don’t have the scientists, the researchers. I’m going to use Microsoft as an example: they have a large community of scientific people around them, all sorts of different people helping and guiding.
Bryan Jones: I think there’s a different way of thinking about it that’s necessary, in terms of how you actually release to the public as well. Microsoft made a big mistake with their chatbot Tay. That went from “humans are super cool” to complete full-on Nazi mode in 16 hours on Twitter. Yeah. And it’s that sort of understanding of how these things are going to be used by the users, and how that interaction is going to change them.
And the adversarial nature of the fact that they will change the data on the input; they will change the behaviour on the output. Yes. And you can’t predict that up front, but you have to anticipate that it’s going to happen.
Russell Craxford: Yeah, I guess you’ve got to set some parameters, some boundaries. You’ve got to try and think of the best, and, I guess, respond fast when things do go wrong.
You’ve got to have the feedback loops, the observability, the monitoring to see if something’s going awry. And ideally, you said 16 hours from hero to zero, you’ve got to be able to spot that. As a big company you’ve got operational risk there, and it’s hard: we’re used to spotting things going wrong over a period of time. But yeah, as soon as people start realising it’s leaning a certain direction...
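[Editor’s note: the feedback loops and monitoring Russell mentions can be sketched as a sliding-window check. `DriftMonitor` is a hypothetical illustration, assuming some upstream filter has already flagged individual outputs as problematic.]

```python
from collections import deque

class DriftMonitor:
    """Minimal sketch: track the rate of flagged (e.g. toxic) outputs
    over a sliding window and alert when it drifts past a limit."""

    def __init__(self, window=100, limit=0.05):
        self.window = deque(maxlen=window)
        self.limit = limit

    def record(self, flagged):
        # Store each output as 1 (flagged) or 0 (fine).
        self.window.append(1 if flagged else 0)

    @property
    def rate(self):
        return sum(self.window) / len(self.window) if self.window else 0.0

    def drifting(self):
        # Only judge once the window is full, so early noise doesn't alert.
        return len(self.window) == self.window.maxlen and self.rate > self.limit

monitor = DriftMonitor(window=100, limit=0.05)
for i in range(100):
    monitor.record(flagged=(i % 10 == 0))  # 10% flagged: above the 5% limit
assert monitor.drifting()
```

The point is the shape of the check, not the numbers: an always-on observation loop that catches a Tay-style slide in hours rather than after the headlines.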
I assume you get a gold rush type syndrome, which I’ve just made up on the spot, but you get people rushing towards it.
Bryan Jones: That’s a good name though. I like that one.
Russell Craxford: I’ll use that one. Yeah, exactly. Thank you. So, you know, creativity on the spot here. But you rush towards the outcome, the idea of seeing it.
You want to get the gold nugget. You want to see it fail. You almost start trying to actually direct it to fail. Yes. So if you see cash coming out of a cash point for free, everyone wants to get that cash. And it’s the same sort of thing.
People like to see failure. Yeah. To see it. So you see it nudging towards: oh, look, it can do this thing it shouldn’t do. Then you’re going to get, what’s it called, a sort of viral sensation kind of thing going on. And that, in a second, is going to snowball the behaviours of AI systems. It’s going to completely introduce bias.
Bryan Jones: And social media amplifies that. Great. Yeah.
Russell Craxford: Certain platforms have different biases built into them, into the communication. I think I’ve seen some experiments with AI being used to analyse customer emails, to help figure out how distressed people might be, or who you need to respond to first, you know, prioritisation, which is quite cool.
Bryan Jones: Emotional indexing.
Russell Craxford: Yeah, which is really quite cool. But I think one of the ones I was hearing about had actually just been built around the average IQ, the average literacy level. So it didn’t take into account that while people might average at that, your range of data is huge.
Right. So when it got people who couldn’t spell, or who used really big words and things like that, it muddled up the outcomes. So it was proven to be fantastic at the average IQ, average reading and writing level, but not across the range of society. And we’ve got to start thinking about ranges more, and variety, I think.
Bryan Jones: And that comes back to something I was talking about in my talk, around the data testing: the diversity and representativeness, the bias and the fairness, and the data splitting. If you’re going to have a big data set, you want to make sure that data set covers all the possibilities, everything that’s going on there, all the different categories. And you also need the categorisation to be such that it can actually accurately spot what’s going on.
So taking your example, words have nuances. Are those different nuances between the different words categorised correctly? Because just looking it up in a thesaurus is not going to help.
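[Editor’s note: the representativeness check Bryan describes can be sketched as a simple coverage report over dataset labels. `coverage_report` and the label values are hypothetical illustrations, not anything from his talk.]

```python
from collections import Counter

def coverage_report(labels, min_share=0.05):
    """Sketch of a representativeness check: flag any category whose
    share of the dataset falls below a minimum threshold."""
    counts = Counter(labels)
    total = len(labels)
    return {cat: round(n / total, 3)
            for cat, n in counts.items() if n / total < min_share}

# Hypothetical writing-style labels for an email dataset, echoing
# Russell's example: the "average" style dominates the data.
labels = ["formal"] * 90 + ["informal"] * 8 + ["non-native"] * 2
under = coverage_report(labels, min_share=0.05)
# "non-native" makes up only 2% of the data, so it is flagged
assert under == {"non-native": 0.02}
```

A real pipeline would go further (stratified splits, fairness metrics), but even this trivial report surfaces the gap Russell describes: a model trained on this data never properly sees the under-represented group.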
Russell Craxford: Yeah, context is lost to a degree. It is interesting though that these technologies and tools, we’re still discovering how we get the best value out of them, I think.
Yes, very much. And if we’re still discovering the value, then we’re definitely still discovering how we test, how we assess, how we look at it, how we give feedback. I think you said the future is obviously to keep going down the same tracks: technologies, leadership, management. I think my worry with AI is persuading people of the value of quality, because they see it as a kind of given.
A lot of models seem to be off the shelf, does that make sense? So there’s an implicit trust there, which I think we do have for some software. You know, we buy a vendor tool, we expect it to work. When we integrate it into our stacks, usually we do some checks, but we don’t retest Microsoft Office every time we install it on our computers.
We trust. We have to, to a level, but I wonder if the maturity of the AI domain deserves that level of trust yet, because I’m sceptical.
Bryan Jones: In my opinion, in some areas more than others. So, for instance, the large language models that are commercially available: there are well-known problems with those. You can pay for an API to interface to one and integrate it into whatever your system is, but I don’t think we should be trusting of those models yet.
I don’t think they’ve earned that level of trust. Whereas the library elements you can get for Python, for instance, like SciPy and PyTorch, they are really well tested, and they stand well in isolation. It’s when you start integrating them with other stuff that the problems happen.
Russell Craxford: Well, yeah, it brings up this whole maturity of integration. I was doing an experiment with AI not so long ago where we were integrating multiple types of models together to get an outcome, and each model may have been perfect in its isolated, controlled state, but you’re creating a new, uncontrolled state when you start borrowing them and doing different things with them, which is hard to anticipate.
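[Editor’s note: Russell’s point, that models which pass their own tests can still fail when chained together, can be sketched with two toy stages. Both functions are hypothetical stand-ins, not the models from his experiment.]

```python
def detect_language(text):
    """Stage 1 (hypothetical): naive language detector -- fine in isolation."""
    return "fr" if "bonjour" in text.lower() else "en"

def summarise(text, lang):
    """Stage 2 (hypothetical): summariser that only supports English."""
    if lang != "en":
        raise ValueError(f"unsupported language: {lang}")
    return text.split(".")[0]

# Each stage passes its own unit test...
assert detect_language("Bonjour tout le monde") == "fr"
assert summarise("Short sentence. More detail.", "en") == "Short sentence"

# ...but wiring them together surfaces a failure mode neither unit
# test covers: stage 1 can emit a value stage 2 cannot accept.
def pipeline(text):
    try:
        return summarise(text, detect_language(text))
    except ValueError:
        return None  # the integration-level gap we have to test for

assert pipeline("Bonjour tout le monde") is None
```

The integrated pipeline needs its own tests: component-level confidence, as Bryan says of well-tested libraries, does not carry over to the combined, uncontrolled state.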
Bryan Jones: You’ve been listening to my talk haven’t you?
Russell Craxford: I actually didn’t get to it, sadly enough; I was probably recording a podcast at the time. But yeah, I think AI is a bit of a buzzword, and a lot of us, I know, are getting a little sick of it to a degree. But I do think a lot of it is starting to mature. Our understanding is maturing of how it will benefit us and how it won’t. In my head it started off, back in 2023, as “it’s going to solve everything”, and then I think the maturity is growing: actually, it will help best with these areas.
Bryan Jones: Yeah, and that’s the key word help. Yeah, it’s an assistance. It’s not going to replace anybody. And if anything, I think there’s going to be more testing to be done.
Russell Craxford: I’ve likened it, I think in other episodes too, to when mobile phones came out. It creates a whole new space of technology, of risk, of different things, of tools, techniques, things that we need to understand and deal with.
And AI has done pretty much the same. It’s created a new space, and as you said, someone’s got to test the AI. Yeah. So in a sense, it’s created a new industry. Whilst AI may help us be more efficient in some areas, it may build our products better, it may go into our products, it’s a whole new thing.
So that leads me on to another question, which is testers and people. We are people; we can’t know everything. So how do you think people need to cope with advancing their knowledge around AI, versus everything else out there that testers need to understand, or are expected by the C level, shall we say, to understand?
Bryan Jones: I think we need an appreciation of AI and its challenges. We need an appreciation of data science and statistical analysis. But the basic underlying skills of the tester are still the same.
Russell Craxford: Yeah.
Bryan Jones: Focus, curiosity, critical thinking, systems thinking, models, and communication. It’s still exactly the same. We need to ask those awkward questions.
We need to dig into anomalies, and help the data scientists be better at testing their own data, getting their data right in the first place, the same as we should be doing for requirements gathering, the same as we should for design, the same as we should for development. It’s exactly the same set of skills.
We just need that overlay of an understanding.
Russell Craxford: So a bit of, yeah, context of the domain, a little bit of AI, to understand it better.
Bryan Jones: Exactly.
Russell Craxford: We’ve gone down the AI route a little bit. Obviously, your talk was about AI. Do we want to go down the music route? Are there any other angles you want to talk about, Bryan?
Bryan Jones: I don’t know. I think on the AI route with the music, that’s starting to get really interesting, legally speaking.
Russell Craxford: Okay.
Bryan Jones: Because certain platforms are actually allowing AI generated music.
But the AI cannot copyright it.
Russell Craxford: Okay.
Bryan Jones: Because the AI isn’t an entity, so therefore it cannot hold copyright on the music.
Things are starting to get really interesting. Some of the big companies that use AI systems are starting to think: hang on a sec, we’re going to lose our intellectual property that the AI is generating, because we can’t copyright it.
Russell Craxford: Yeah, because copyright laws and other things are assumed to be around the process of humans generating information.
Whereas now we actually have artificial intelligence, or robotics and things, generating stuff, and that concept doesn’t hold true in most of our systems and societies. Yeah, so...
Bryan Jones: I think that’s why AI is not going to take over the music industry either.
Russell Craxford: That’s a shame.
No, I don’t mean that, honestly. Again, I’m guessing it’s going to help and assist with different things. You know, if someone wants to generate a backing track, a drumbeat or something like that, theoretically it would be much easier. But, you know, you can go to libraries now to get some of these tracks and things like that.
I’m guessing it just improves the efficiency of finding what you’re looking for, or cleaning up backing tracks, getting rid of background noise perhaps.
Bryan Jones: There are lots of AI-assisted tools starting to come out now for helping with the mixing, helping with the editing, with balancing the sounds off against each other, and, like you say, taking out the noises that you don’t want.
It’s also getting built into, for instance, my guitar. The app on that, you can play a tune and it’ll go: the tempo’s this, the key’s this, this is the chord sequence, and here’s a drum pattern for it.
Oh, and do you want the bassline? Here’s a bassline that’ll go with it.
Russell Craxford: That’s pretty handy.
Bryan Jones: So it’s pretty cool in that respect.
Russell Craxford: But again, you still have the artist, you have the choice, I guess. Yes. You know, do you want to use that? Is that the best example? It gives you a hint, it gives you a steer, but it’s still the human’s choice which way they want to do it.
It probably doesn’t have to be, you could probably get it to automate it all, but actually that’s still the skill, the art, the human aspect of these things, that you come in and you choose, well actually no, that’s not what I want to do. That’s not quite what I had in mind.
Bryan Jones: At the moment, it still can’t do creativity.
Russell Craxford: Okay.
Bryan Jones: But then, we start getting into a discussion around the neurology of creativity and how that actually works. Yeah. And the fact that we’ve got a library of information in our heads, and we pull it down and we twist it and then we get something new out of it. So, theoretically, we could get to the point where it’s being “creative”, in inverted commas, but it depends...
Russell Craxford: On definitions, I guess, yeah.
Yes. Yeah, I’ve seen some AI art and things like that produced in the past, but it’s algorithmic-based: follow-a-pattern, look-at-500-paintings-and-create-something-similar type models, yeah.
Bryan Jones: And again, it’s like, how do you judge the quality of that art? And again, that’s going to come down to humans.
Yeah. Whether it’s good or not.
Russell Craxford: Exactly, at the moment, most of the judgement of these things is still the human factor at the end of it; it’s the human that says: I like this, I dislike this. And yeah, it’ll be interesting to see how these things shape and change, because I think it is here to stay.
It’s going to evolve itself and it’s going to find niches in our society.
Bryan Jones: Yeah. And that sets me off on another rant.
Russell Craxford: Go for it.
Bryan Jones: What’s good and what’s not good. I mean, we’ve been digging into this for over two and a half thousand years now.
Russell Craxford: Yeah.
Bryan Jones: The Greek philosophers couldn’t agree on it. And we’re still arguing over it.
You’d ask anybody in here and you’re going to get dozens of different definitions of what quality is.
Russell Craxford: Oh, yes.
That’s one of the fantastic questions to ask the tester. Define quality.
Bryan Jones: Yeah, well, my definition is a sort of cross between Jerry Weinberg and Robert Pirsig, who wrote Zen and the Art of Motorcycle Maintenance. It’s caring about delivering value to someone who cares about the value you are delivering to them.
Russell Craxford: Okay.
So I recognise the Weinberg in that. I think I’m more aware of the Bach and Bolton take on Weinberg, the “delivering value to someone who matters” one. But as you said, it all varies, and it’s all nuanced a little bit to a degree; it’s all about trying to work out who the stakeholder is that’s important, and then it’s along those lines. Everyone has a nuanced view of it based upon their experience.
Bryan Jones: Yeah. And for me, it’s that you’ve got to care about it.
Russell Craxford: Caring is important though, definitely. And as I said, that’s partly why I like the “who matters” part, because it’s about figuring out what matters, who it matters to, why it matters. Lovely tester questions. I love trying to figure out, you know, what’s important.
I’ve done the testing life cycle. I’ve done the role where I’ve discovered bugs and realised they don’t matter. You have to question: why did I even run that test? That test is just going to show something that doesn’t matter; it probably wasn’t worthwhile executing or conducting,
Bryan Jones: or even writing in the first place.
Russell Craxford: Yeah, well, luckily, in most of these cases it’s exploratory, so I haven’t gone to that level of effort. But when you start taking note of them, this is one of the skills of testers: it’s about filtering, it’s about finding out what matters. Because if you write software, it’s got billions of bugs, I don’t know of much software that doesn’t, but it’s about what really matters, because a lot of it is insignificant, immaterial. Yeah.
Bryan Jones: And I think that definition of quality is now becoming more and more important. Especially again, going back to the AI, the way AI is coming in, what is good and what is not good.
Russell Craxford: Yeah.
Bryan Jones: How you define that, and that is the difficult question when it comes to quality. Any of the systems that we deal with, but especially when you start dealing with AI ones.
Russell Craxford: Yeah, I must admit, from my experience of testers and other things, I think there’s a chunk of the testing community, or family, that will struggle with the less black-and-white, true-or-false nature of AI things.
Bryan Jones: I’m inclined to agree.
Russell Craxford: But it’ll be interesting because every group of community obviously has got different skill sets and different things.
But I’ve certainly experienced it a lot: we were originally brought up on this concept of true or false. So I think there are a few testers that are going to struggle with that mindset shift. Yes. And some that will love it and embrace it, you know, some that love the grey. I love that.
So agile, adaptive: agree, discuss, collaborate, figure out what’s there. And there are some that are stuck thinking true or false, writing just assertions, so to speak.
Bryan Jones: More of an exploratory approach rather than a deterministic approach.
Russell Craxford: Yeah, I think the ones that embrace that exploratory, that looking, that inquisitiveness, that curiosity, that critical thinking you mentioned, I think they’re the ones that are going to probably excel a lot more in the future with the sort of AI and the way in which it helps us and the way in which we’re all probably going to end up having it in some form in our tools.
Even if it’s just for the buzzword marketing, let’s be realistic, there are a few companies out there doing it for that sake. Absolutely. And, ideally, for the ones that are doing it for actually bringing value, because it can and does. You know, it’s all about making lives better, making things easier.
Software is about solving problems for someone. You know, you don’t build software that doesn’t solve someone’s problem; it’s pretty much never going to work, never going to have value.
Bryan Jones: And that’s what it comes back to, the people again. It’s people’s problems, you’re doing things for people. People are using this, the users are people.
It’s all about people when it comes down to it.
Russell Craxford: Yeah, people build it, people work together, even the AI systems are built by people. And so on, and clever people at that at times. But yeah, it does always come down to being people based process, people based solutions.
Bryan Jones: Which is quite ironic, because I went into doing IT because I didn’t want to have to deal with people. Yeah. And now I spend all my time...
Russell Craxford: Dealing with people.
It’s interesting, teamwork, collaboration, that is actually software engineering these days. It is. Certainly in the Agile world, and even outside the Agile world, the idea of being a lonesome person in a room with a computer, typing on the keyboard until midnight, is not really the modern world in which we work.
Bryan Jones: And if you are, then the chances are you’ve got Slack running at the side, or Teams, or both.
Russell Craxford: Yeah, well quite, yeah, you’re part of a community, even if it’s just a product community. So yeah, you succeed much better when you’re working as a team to do this sort of stuff.
Bryan Jones: Indeed.
Russell Craxford: But yeah, okay, I’ll probably wrap up on that now. But thank you very much for joining us, Bryan. It’s been interesting talking about what’s going on, things like that, and AI in the future as well.
Bryan Jones: Thank you for inviting me.
Russell Craxford: Bye bye.
About Me!
Bryan Jones has been a passionate QA professional for 35 years and is a people-centred Quality Leader working for 2i as a QAT Architect. He is a podcast host and a popular conference presenter.