In today’s episode, we go deep into the career of veteran industry innovator and leader John Thompson, Global Head of Artificial Intelligence & Rapid Data Lab at global biotechnology leader CSL Behring. John and Jesse discuss the art of the analytics team, how a leader can sell their data-vision to their C-Suite, and the potentially bright future ahead for data governance and monetization. John also shares more about his book Building Analytics Teams: Harnessing Analytics And Artificial Intelligence For Business Improvement, as well as his upcoming release, The Future of Data: What Happens to Your Data.
Welcome to the Soda Podcast. We're talking about the Data Dream Team with Jesse Anderson. There's a new approach needed to align how the organization, the team, the people are structured and organized around data. New roles, shifted accountability, breaking silos, and forging new channels of collaboration. The lineup of guests is fantastic. We're excited for everyone to listen, learn, and like. Without further ado, here's your host, Jesse Anderson.
Hello, and welcome to the Data Dream Team Podcast. My guest today is John Thompson. I've known John Thompson for quite a while. He is the Global Head of Artificial Intelligence & Rapid Data Lab at CSL Behring. Before that he was at Dell and Gartner. He's done quite a few interesting things. Welcome to the show, John. Would you mind introducing yourself a little bit more?
Thanks, Jesse. Happy to be here. Thanks for the invitation. Yep. Been involved in the advanced analytics and data field for 37 years. Started out my career as a programmer and an analyst, and then moved over quickly into data and analytics, building data warehouses, business intelligence. Moved over to the technology side, helped build out the Predictive Model Markup Language, and then have been doing advanced analytics for quite some time. Been part of 20 different industries and built at least 60 different predictive analytic applications that have been put into production at this point.
So it's fair to say you have just a little bit of experience on this. And one of the things I really enjoy about you - in fact, I think that was how we first met - was around your books. So you've been kind enough to share your experience through books. There's quite a few analytics books out there, but you took a different bent, and that was what I appreciated about your book. You talked about - how do we do teams about this? How do we actually create teams? So your book is called Building Analytics Teams: Harnessing Analytics And Artificial Intelligence For Business Improvement. So, tell me more about the thesis of this book.
Yeah, that book came out about two years ago, just over two years ago now. And I was really interested in the topic because there were a lot of people talking about analytics from data systems, mathematics, algorithms, those kind of things. But nobody was really talking about the difference in analytics teams and why analytics teams are different. You know, when I talk to C-level executives, they always thought, oh, an analytic team is an IT team. They're developing forms and databases and things like that. And they're not. They're completely different. So, an analytics team is more like an artistic team. So I wanted to call out all the different perspectives of why an analytics team was and is different.
So tell me more. Why do you think that they're different?
Well, I've managed quite a few of them over the years, and one of the things that's dramatically different from just a regular development project - putting in a CRM system or a Salesforce.com implementation of a platform or something of that nature - is that those systems are pretty straightforward. You start at A and you end at Z, and the progression is very linear.
In an analytics system, it’s not linear. It's very iterative, it's circular, it's recursive. You may be trying to do things with data that no one's ever done before. You may be trying to integrate 4 or 5, 12, 20 different data sources, and no one knows how to do it. So you're doing that for the first time. You may be trying to apply all sorts of mathematical techniques and algorithmic techniques to try to understand and learn about the data, and no one's done that before.
So you may be doing many, many different things that are new, different, novel, and innovative. And it's difficult to impossible to tell someone that this project is gonna run for 13 weeks or 12 weeks or 26 weeks or whatever it is. I can tell people generally what we're going to do, how we're going to do it, and what we think we're going to achieve, but it's very difficult to give them a date, and many people do want a date - it’s gonna be done on June 10th, or it's gonna be done on August 15th. And I can say, well, we'll have it done sometime within this three-month period. And there's a good chance somewhere along the line, we'll come back and say, it's impossible to do, and we'll never achieve it. That's very hard for some people to accept.
And do you have a suggestion about how you should attack that? For example, you can't do an analytic process and say, well, three years from now, five years from now, we'll have something. Do you time box? What do you do there?
Yeah, we do everything that everybody does. You know, we work in nimble cycles. We work quickly. We're very communicative. We always integrate subject matter experts into the process. We're mathematics and data experts. We're not subject matter experts. We don't know supply chain. We don't know clinical data. We don't know those things.
So our subject matter expert teams and our data teams and analytic teams are integrated tightly together. So yeah, we communicate as often and as freely and as fully as we possibly can. And we tell people, yeah, we think this is a six month project and probably it could be done in four, it could be done in eight. So we don't give any hard and fast dates as an end point, but we do give people what we think is a reasonable range. Now, generally what happens is that when we say that, there's some unease or consternation, but generally these projects return incredible ROI. We just did a project, and I explained to the executive sponsor that it could be six months, it could be eight months, it might be nine. But in the end we believe that the return will be somewhere between $24 and $39 million. And they're like, oh yeah, well, that doesn't matter if it's short. Great. And if it's a couple extra months, I don't care.
Yeah. With that kind of ROI, any business person should be looking at that. And I wanna point out to everybody that's listening, this is how you sell analytics internally. You don't say, hey, we're going to do some kind of ML, we're going to do this. You say, we are going to generate 24 million. That's what perks up these business people's ears. So I think one of the things that's important is for people to hear how you position this so that you can continue your work, so that you can do your work. Really important to understand that.
You also have another book coming out from Manning in the October/November timeframe. It’s called The Future of Data: What Happens to Your Data. Could you give us a glimpse of what's going to be in this book?
Yeah, absolutely, Jesse, and thanks for the opportunity to talk about it. I'm always, like you, always thinking about the next book. So when I was writing Teams, I was thinking about, what am I gonna write next? And it came very clear to me talking to my sister and my brother and everybody that I talked to at the gas station, in the grocery store. I'm a very gregarious person. I talk to everybody that I can come in contact with. And whenever I’d bring up data, it was pretty clear that not many people understood what happens with and to your data in the current environment that we live in.
Now, I've been focused on data for nearly 40 years. So I thought, well, I have a perspective on this. I have something to say. So the book is really for every person in the world, every person that's connected to the internet anyway, and every person that uses Amazon or browses Spotify, or heaven forbid, uses Facebook. You know, it really talks about - this is what happens to your data. And that's the first third of the book. The middle third of the book talks about all the different rules and regulations and laws that are coming to be about data ownership and data privacy and data monetization. And then the last third of the book is what you can do as an individual to get ready to own and proactively manage and monetize your data. There's very few people that understand that in three to five years, you will be able to set the price of every piece of data you've ever generated.
So you've talked about those different parts of the book. What's your favorite part of that?
I'm intrigued by all of it. The first third of the book was really just explaining the history of data. Why do we live in the world that we do? Why do we have the data ecosystem that we have? The middle third is very exciting. What's going on right now? And then I think if I had to say, you know, what is my favorite part, it's the last part. It's enlightening people as to hey, this is what's gonna happen. And some people I've talked to who are in the know say, oh, you know, everybody just wants free email or everybody wants free search or whatever.
And I said, well, if you give them the choice of doing whatever they do now and paying for email and paying for search, and in the end, having another thousand dollars in their pocket as part of their data dividend, I think that's a decision a lot of people will make that they want the money, they want to pay for those services and they want that additional money to use it however they see fit. Some people will say, sure, I don't care, you can use my data in any way you want. But I think many people will want that freedom to make the choice themselves.
And do you come to any conclusions in the book that we should be worried, that we should somehow lock down our data? Or should we read the book and find out?
Well, of course, I'd love you all to read the book. That would be great. You know, I'm sure some people will pirate it. Some people will buy it. Buy it, please, if you can. You know, I'll tell you, the bottom line is, yeah, you can stick your head in the sand. And then people will, companies will do whatever they want to do with and to your data. And to you, ultimately. But if you're aware, awake, engaged, and interested, the future's gonna be pretty bright.
And one of the things I think you touch on probably in the book is that topic of GDPR, which you say is one of the high points and one of the things that we should be looking forward to and has helped out quite a bit. Do you talk about that more in the book? And what's your view on GDPR?
I do. I talk about GDPR quite a bit, and it's six years on. It's been a huge success. Anybody that says it's not been a success is not paying attention. That's all there is to it. You know, when GDPR first came out, I was apprehensive like most people in the data business. I thought, boy, this is really gonna be problematic and troublesome for us. It's not. It really gives people the ability to access their data, manage their data, delete their data, to have the right to be forgotten. There's no shortage of data.
We as data professionals don't have to worry about that. There's plenty of data in the world for us to use. So GDPR six years on has spawned a whole raft of GDPR-like regulations in the United States. California, Massachusetts, and five other states have the same laws on the books. The Data Act that was just passed in the EU and the Data Governance Act that also just passed in the EU are the future of what's gonna happen with data. And that's what I talk about in the book: how to access, understand, control, protect, and monetize your data. That's where we're going.
Now, some people will say, oh, I don't like that, and I don't want to be part of it. And you don't have to be. That's fine. You can opt out, and you can live in the same world that you've lived in your entire life. But if you wake up to it, you get involved, you'll be able to make different choices, and you'll probably make more money.
Did you make any choices using GDPR?
Yeah, well, I made those choices before GDPR, Jesse. You know, I'm a data professional. I know what's going on with data. So I've never clicked on an ad. I've never bought anything off an ad. I turn off all location services. I block every ad. The only data I ever provide to anybody is the bare minimum I need to execute a transaction. So those were my choices well before GDPR.
Yeah. It's interesting as I kind of echo you in that, when people ask me what I do, and then they start asking me about advertisements and GDPR, I say, well, in some ways, the cat's out of the bag, but at the same time, EU residents get to have this blanket privacy that allows you to keep some amount of your privacy intact, as it were. So we talked a little bit about how you're currently at CSL Behring. Could you tell me more about that company and what it does?
Yeah. CSL Behring's a very interesting company. I've been there for four years. As you mentioned, I'm the Global Head of Advanced Analytics and Artificial Intelligence. And the company is 100 plus years old, maybe 106 or 107 years old. It's an Australian company, and everything about CSL is biopharma. So the raw materials come from humans, human plasma. We have over 300 plasma donation centers across the United States. That human plasma is then processed or fractionated into therapies for rare diseases like hemophilia A, hemophilia B, primary immunodeficiency disease, and other diseases like that. So it's a very intriguing and interesting company.
So you get to walk around saying it's made from people. It’s people.
It's people, it's Soylent Green. Yes, exactly.
Well, you mentioned the company being 100 years old, and I find that there's a huge difference between, let's say, a three or four year old startup, even a 20 year old company. And then there's the 100 year old ones. How have you found that to be different?
It is a well established company. Very profitable and fast growing, you know, for as old as it is. It is a manufacturing company at heart. It is a healthcare company at the core. So there's a lot of slow moving parts in governance, and for good reason. You know, we're dealing with humans on the front end, on the input side, on the raw material side, and we're helping patients on the back end side as well.
So, you know, every regulatory body in the world - and that's not hyperbole - every regulatory body in the world has a say about what CSL does. Every healthcare regulatory body, the FDA, the German GHA, the Australian equivalent, the Japanese equivalent, they all have regulations that apply to CSL. So CSL is a very thoughtful, considerate company. So in that construct, you need to pick and choose where you can do things rapidly and quickly and in an innovative nature. Some places they go a little bit more slowly and for good reason.
Yeah. You actually brought something up that I hadn't either thought of or realized, in that you're going to have to deal with the equivalent of worldwide HIPAA regulations. I know I had to deal with that in the US, and there's probably those sorts of regulations in other countries. Could you talk about how you deal with that?
Yes, you're absolutely right, Jesse. We deal with all the health regulations in the United States, in Europe, Western Europe, all across Asia, Australasia, AsiaPac, however you wanna categorize that part of the world. And what we do is, our view is that we go to the most restrictive position possible. So in our analytics, we strip away all the identifying information. We don't even bring that information into our analytical systems. So we never have anything that identifies a patient, a donor, a person. None of that ever comes into our analytics.
And then we de-identify and de-dupe and de-aggregate data in the most fundamental way that we possibly can. So we work really hard to make sure that no one can question that our analytics are anonymous and based on factors and features and data that is aligned with all the laws and regulations.
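As a rough illustration of the "we don't even bring that information into our analytical systems" approach John describes, here is a minimal Python sketch. It is not CSL's actual process: the field names and the sample record are hypothetical, since the interview does not name the real fields or systems.

```python
# Hypothetical list of identifying fields; the interview does not name the real ones.
IDENTIFYING_FIELDS = {"name", "email", "date_of_birth", "street_address", "donor_id"}


def strip_identifiers(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of the record with every identifying field removed."""
    return {
        key: value
        for key, value in record.items()
        if key not in identifying_fields
    }


# Hypothetical source record, as it might look before it reaches analytics.
raw = {
    "name": "Jane Doe",
    "donor_id": "D-001",
    "age_band": "30-39",
    "region": "Northeast",
    "visits": 4,
}

# Only the cleaned copy would be loaded into the analytics environment.
clean = strip_identifiers(raw)
print(clean)
```

The point of filtering at load time, rather than masking later, is that the identifying values never exist in the analytics environment at all.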
That reminds me of something interesting that happened in the state of Nevada where I used to live. They started releasing health data. And what was interesting is there was only one 60 year old person in the government at that time, and that was the governor. So they released all this anonymized data, but it was grouped by age, but there was one in that bracket. How do you go about dealing with and making sure that there isn't that one person here in this outlying bracket that you can identify?
Well, that's a great question. It's something that we deal with all the time, because we're dealing with rare diseases. So, you know, the populations we work with are 10,000, 5,000, 2,000, you know, and if you go down far enough, yes, of course there's one female 32 to 50 in this zip code. So you can't go down to that level. You can't go that low, because it becomes identifiable, as you said. So, you have to be careful of the numbers that you have, and you keep them, you know, it would be great to go down as low as possible, but at some point you look at it and go, uh oh, we're in the danger zone. We can't go that far down. So, we work with that all the time, and we make sure that the information is aggregated and summarized at a point where it is anonymous.
And is that done technically? Is that done through review of the data? What is that business process or technical process?
We do it through the review of the data, and we do it in conjunction with our subject matter experts, with overview and oversight from legal and compliance.
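John describes this as a review with subject matter experts, legal, and compliance, not an automated step. Purely to illustrate the "one female 32 to 50 in this zip code" problem, a minimum-cell-size check might be sketched in Python as below. The threshold of 10, the grouping fields, and the sample records are all assumptions made for the example, not anything stated in the interview.

```python
from collections import Counter


def suppress_small_cells(records, group_keys, min_count=10):
    """Count records per group and withhold any group smaller than min_count.

    min_count=10 is an illustrative assumption, not a threshold from the interview.
    Returns (released, suppressed): released maps group -> count, and
    suppressed lists the groups that were too small to publish.
    """
    counts = Counter(tuple(record[key] for key in group_keys) for record in records)
    released = {}
    suppressed = []
    for group, n in counts.items():
        if n >= min_count:
            released[group] = n
        else:
            # The count itself is not returned, so it cannot leak downstream.
            suppressed.append(group)
    return released, suppressed


# Hypothetical records: one woman and twelve men in the same age band and zip prefix.
records = [{"sex": "F", "age_band": "32-50", "zip3": "194"}]
records += [{"sex": "M", "age_band": "32-50", "zip3": "194"} for _ in range(12)]

released, suppressed = suppress_small_cells(records, ["sex", "age_band", "zip3"])
print("released:", released)
print("needs review or coarser grouping:", suppressed)
```

A check like this can only flag the danger zone. Deciding whether to coarsen the grouping or drop the cell is still the kind of judgment call John says his team makes by review.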
Okay. And there's another part of your title that was the Rapid Data Lab. What is the Rapid Data Lab?
That is a construct that we use to move quickly. So people come to us from supply chain or clinical or plasma operations or pricing or wherever, and they say, hey, I've got an idea for a project. And if that project is interesting and feasible and can return a decent return, as we talked about earlier in the discussion, to the organization, we fast track it. We put it into the Rapid Data Lab. We have a group of data scientists dedicated to that. We have contractors there that help us do the integration and more mundane data engineering work. And we move as rapidly as we can to move those from concept to pilot to production. So, we can move in a matter of days and weeks rather than months and years.
So it sounds like you've set it up more in a consultative way, that there's this kind of internal consulting that happens.
Yeah, that's us. Basically, the way people talk about us is that we're an internal consulting shop. So people show up, they say things like, hey, we'd love to do this. And we assess it, and sometimes we come back and say, well, that's not feasible. We don't have that data. Someone came to us a couple years ago wanting us to anticipate what a certain class of people would do. Sometimes we can, because we understand their behavior and we have enough data about the way they act and the things they do. This was around things that they might do. And we didn't have enough data about them as people or their behavior, so since we didn't have any data, that's more like magic, and we can't do that. So we didn't do it.
Now, if you were to rewind back to pre-Rapid Data Lab, what was the impetus to create this Rapid Data Lab?
Well, we were doing basically the same thing. We just didn't have as many resources. So when we explained to the organization what we were doing, but we were gated by funding and resources, they said, oh, well, we'll give you more funding and resources and you call it the Rapid Data Lab. And I was like, okay, fine. So it was a way to accelerate something that we were already doing.
And was that at the C-level, board level, or how did you get that extra funding? The reason I dive deep into this is this is a very common question and you actually have done it. So I'm loving your experience on this.
Yeah. You know, I went through the VP level and it got to the C-level, and the C-level executives approved it. And it's not that much money, to tell you the truth. It's a rounding error in the scheme of CSL's budgets.
It's a rounding error, as you mentioned, for the people, but the amount of value being created, once you free that up, it's non-trivial. It's significant.
Yeah. I mean, most of the projects that we talked about that we undertake are on the same scale that we talked about earlier. These projects take six weeks, eight weeks, three months, something like that. And when they're put in place, they return tens of millions of dollars a year.
So let's say a person is listening to this and they're saying, oh, I want to do that. What advice would you give them to be able to do this successfully?
You already gave them some of the advice early on: everything that I do is denominated in dollars, pounds, yen, euro. I never talk about speeds and feeds and data. I do that with my team, of course. We’re data scientists. We talk about those things in our project meetings. But when I'm talking to business people, it's all return on investment. It's you're gonna give me this and we're gonna give you that. It's a give to get situation. So you already said that. That's every conversation I've ever had with a VP and C-level executive. And then after that, it's about being nimble.
I never use the word capital A, Agile. I can't stand that word, but we work in a rapid manner and we return results. And here's something for every data scientist out there: when you're asked to do something that you think might be really hard and challenging, the first thing you should do is bring all the data together and do an EDA, an Exploratory Data Analysis - basic statistics, descriptive statistics - and give that back to the executives and the managers you're working with. That alone will give you immense credibility and give them a better understanding of the business that they're operating today, as it really is today, as opposed to how they might think it operates. Once you get that credibility and are on an equal footing with an executive or a manager or a subject matter expert, you’ll have all sorts of leeway to do really interesting work thereafter.
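For listeners who have not done one, the "basic statistics, descriptive statistics" pass John recommends can start very small. The following is a minimal sketch using only Python's standard library. The column name and the numbers are made up for illustration, and a real EDA would typically cover many columns and use a data analysis library.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of at least two numbers."""
    ordered = sorted(values)
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


# Hypothetical column: order lead times in days.
lead_time_days = [10, 12, 14, 16, 18]

summary = describe(lead_time_days)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Handing back even a table like this for each key field is the sort of "this is the business as it really is today" summary that John says earns credibility before any modeling starts.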
All very, very good tips from John. So whether you're a manager, individual contributor, these are the things to do. This isn't just John saying this. This is my experience as well. So really do listen up on that and become practiced at it. Tell me about your team. How big is your team and where are they distributed?
You know, the team flexes and flows. I think we started out, we were three. I think we're up to like, 50 at this point. Most of them are in the United States. Now, of course, they're augmented with consultants that are scattered all over the world, really. We have a number of people in the AsiaPac region. We have some people in Australia, we are an Australian company, but the US center of operations is outside of Philadelphia, in King of Prussia.
Okay. And how do you deal with people being all over the place? Sometimes I talk to people and they say, it's impossible to run an analytics team that's worldwide. What do you say?
I say it's a joy. I think it's fantastic. You know, the pandemic has been horrible, and it's been game changing in a not good way for many people. But for analytics teams, I think it's been a gift. If you read the book, I talk about many of the characteristics that we as analytics professionals have or are prevalent to us, I guess is a better way to say it. And a lot of us don't want to go to an office. A lot of us don't want to sit in meetings. Many of us are introverts.
You know, I don't have any trouble with my people never leaving their house. We connect with executives, we connect with subject matter experts, we connect with each other, we talk all the time. I have a biweekly one on one with everyone on my team, and I have a weekly meeting with everybody, and I look at every project.
So we're talking all the time. We know if something's failing, we know if something's going really well. So do all of our executives, too. And we've generated some really interesting and engaging applications for a wide range of functional units inside CSL. And I've never had anybody come back and say, you know, I really need to sit in the room with the analytics team. I've never heard that feedback, 'cause everybody's very good at using the platforms. I'm not sure where you are, Jesse, and I think the last time we talked to you, you were in the UK. So it doesn't really matter. I think this environment works as well or better than it did pre-pandemic.
Yes. Last time we talked, I was in the UK. I'm in Lisbon now, back home in Lisbon, Portugal. So yes, I would agree with you. I think that it comes down to communication and how well you can communicate. So if you couldn't communicate well before the pandemic or before remote work, guess what? It becomes worse, but you can actually fix this. You can change it. There was one thing I heard you say in your introduction of yourself and you were talking about the number of industries that you've been in. I think you said 20 different industries. And I always find it very interesting to talk to people who've worked in that many industries and have gone so deep into that. Which is the most interesting industry that you've worked in?
I think there's a couple. That's a great question. And I didn't know you were in Lisbon, so that's new for me. That's cool. We'll have to talk about that someday. I grew up working in the auto manufacturing industry. My dad had an auto repair shop, and then I worked in machine shops. I've always been drawn to manufacturing. I love manufacturing. I think it's really fun. So manufacturing, as an industry, is first and foremost where I'm always gravitating to, and then consumer packaged goods. I did 20 years in the consumer packaged goods industry, touching on retail as well. And then third, I'm always intrigued by financial services. We did a lot of work for Visa on fraud and fast moving transactions and things of that nature. So those are probably my top three.
Okay. And what was your biggest insight? Have you ever been able to take an insight, you said your love of manufacturing, here you are at a different type of manufacturing, a human manufacturing company. What insights have you been able to take over that?
You know, many of these industries, automotive manufacturing, cereal, you know, Kellogg's and General Mills and folks like that, and CSL, they're all working through efficiencies and effectiveness. All those measures translate across all those different industries. Now, as I said before, CSL is unique for all the compliance and health aspects that have to be taken into account, but there's parts of CSL that are just like Kellogg's, that are just like GM. So there's commonalities across all of them. And it's usually around efficiency and effectiveness.
Interesting. You mentioned before you had some very poignant views on what data science and what analytics was. Do you view data science and analytics as more of a creative or a technical role?
Oh, it's definitely a creative role. There's no doubt about it. It's surrounded and bounded by technology and it uses and leverages technology, but data science is a creative endeavor. There's people that talk about how AutoML is gonna take over the predictive applications and analytics and things of that nature. Yeah, sure, it’ll take care of some of the rote, routine, mechanical things that we do, which is great. Take all that stuff away, we don't care.
But I tell you, the number of people who are experts at feature engineering I can count on two hands. It is a creative endeavor and I'll go a bit afield here, Jesse, if you'll indulge me. There's some work by a gentleman named Judea Pearl, who's creating new algebra around causality, real causality. We as data scientists don't do very good with causality. We do really well with correlations. We're very good at that. But this new algebra allows us or the promise is to allow us to search a feature space 2,000, 3,000, 5,000 features and come up with the features that are truly causal. Now that is game changing. That will change the entire world.
So to get back to where we started, this is a creative endeavor. And you need to work with and treat your data scientists as if they're somewhat artistic in their approach. And I think if you do that, you will have much more success. One of the things that I can say is that I've only lost one data scientist in the past four years. I don't think there's too many teams that can say that.
Yeah. I would agree with you. Usually there's a pretty high turnover in data scientists. They get poached. And you attribute that to you're allowing them to work creatively and that's what keeps them?
And I pay 'em well.
That always helps, too. What do you think this manifests as? Let's say, somebody's listening to this and they're saying, "They're on a technical team. I treat them very technical, and I expect outcomes." What advice do you give them to say, no, you really need to do this creatively?
Well, it's really up to what you want. If you're all about the speeds and feeds and making sure that your AWS instance is running efficiently and your database is full of current data, then you're doing a great job. You won't hold onto your people and you won't deliver much value to the organization, but you'll go to bed feeling satisfied. There's a great question that people ask all the time: "Do you want to be happy or do you want to be right?" I tend to want to be happy.
So, I want to treat my data scientists with the utmost respect, care, and give them the freedom to be who they are. Not every data scientist wants to work on neural networks, not every data scientist wants to work on text problems or NLP. I listen carefully to what our data scientists want to do and the skills they want to learn, and I'm a partner in their development. So if they want to just focus on advanced statistics, I'll give 'em every problem I can that's focused on advanced statistics. If they want to go from learning NLP to neural networks to clustering and classification, I'll make sure that they're on a string of projects that develop them as technical professionals and as possibly managerial candidates as well. So, if you're not managing these people with the utmost care and are interested in them as technologists, as creative professionals, as family members, as the people who they are as the whole person, you're not gonna hold onto 'em.
And I didn't ask you before: what is the management structure? Do you have line managers or are there direct reports to you?
Everybody directly reports to me.
Okay. 50 direct reports. That's a decent amount of direct reporting. That's a lot of meetings right there.
It's insane. Yeah. It's crazy. And you know, some of those 50 people are contractors, so they don't report directly to me; there's a manager that manages that offshore team. But everybody who's a full-time employee reports to me.
Okay. Yeah, my hat's still off. That's a lot of meetings.
Yeah, it is. And I love meeting with people.
I could tell. It's coming across in this meeting. I mean, the interview is basically a meeting of back and forth, learning about somebody. So one of the things you really recommend in your book, the book that's out right now, is having a Center of Excellence. Could you talk more about what a center of excellence is and how you would put one in place?
Sure. Center of excellence is a great construct. I'll take it a little bit further afield there. The center of excellence is the data science group that we've been talking about. The Rapid Data Lab and all those things are sub-constructs of the center of excellence. Then there's a community of practice around CSL that is a few hundred people that are interested in analytics. Those people could be process scientists or bioinformatics people or whatever, but they're interested in analytics. So there's the center of excellence, there’s the community of practice.
We break the community of practice into 15 different special interest groups. Those 15 special interest groups are run by someone in the center of excellence. So those people in the center of excellence are connected to the community of practice by the SIG that they own. So those special interest groups meet every quarter - I think most of them meet every quarter, some meet monthly, I think the stats SIG meets every quarter, and they talk about things that they're interested in, problems they're facing.
So the SIGs talk about individual subject matter problems. That gets brought into the center of excellence by the data scientist. We talk about the business problems, the technical problems, the algorithmic challenges, and then we build services and approaches and talk to the broader community about how we might be able to work together to address those challenges. So I think it's a three-part ecosystem that works exceptionally well.
You've been at CSL Behring for over three years. How long did it take you to set up that center of excellence?
About six weeks.
Was it a trivial thing, more of an organizational change, or was it hiring people? What did you have to do?
Yeah, no, it was when I came in. When we were talking in the interview process, they asked me what I was gonna do, and I had written the first book, and I was pretty clear about what I wanted to do. The first book, Analytics: How to Win with Intelligence, is basically a primer for non-technical executives to understand how to set up a center of excellence, a community of practice, and SIGs. So I had written the book and that's what I was gonna do, and that's what I wanted to do. So when they hired me, I just did it.
Yeah. I've wondered what it's going to be like if I ever do an interview and somebody says, what are you going to do? Well, I wrote a book about it. You can read the book or I could do it. Is that kind of how it went?
Yeah, that's kind of how it went. I mean, I showed up at every interview with the book and I handed it to 'em and I said, this is what I'm gonna do. And they read it and they're like, okay, go.
Yeah, that's a pretty good resume right there. So, congrats on that, John.
I didn't think of it that way, Jesse, but yeah, I guess so.
I've always told people a book is your best resume or best business card. In fact, that was some advice given to me by one of my first editors at Pragmatic Programmer, and he was right.
Yeah.
You've mentioned a little bit before about this visceral hate you have for Agile. Talk to me about that. Let's do some therapy about this.
Well, Agile is the antithesis of analytics and how analytics works. There's a lot of people out there that love Agile and want to do Agile, and that's great and good on you and fine. My personal opinion is, Agile is a babysitting construct. Data scientists don't need to have babysitting, they don't need micromanagement, they don't need to have someone asking them every day, hey, did your algorithm learn better? Did your gradient descent go any faster? It is just the wrong way to go about doing analytics. It is the opposite of a creative team management structure. So at the heart of it, that's my view. And I've debated this in multiple forums around the world, and almost every time I've done this in a public forum where there've been physical people present, at least five people come up to me and go, “yeah, that's true, that's exactly our experience as well.”
I would agree with you on this for Agile, for data scientists. And that's been my experience as well. I wouldn't say it's antithetical, I think it just slows them down too much. And it's a construct for something that data scientists don't fit in. You're trying to fit a square peg in a round hole. And the question to you is, so what does work? If we can't use Agile, what methodology do we use?
In the book, I talk about a personal project portfolio. And I think that works pretty well, where you give each data scientist one or two major projects, a major project being defined as something that's somewhere between six months and two years in duration. Then you have a set of three or four minor projects that are maybe weeks or a couple of months, three or four months maybe, in duration.
And then you have service requests. We all get service requests. You know, executives come down all the time and say, “Hey, I've got a board meeting in three weeks and I need to understand the elasticity of demand of this certain population.” Okay, fine. We're gonna do that for the executives. That's the way it works. We all have fire drills.
So I give each - if I can, if I have that many projects - I give each data scientist their own personal project portfolio. They own it. It's autonomous, they're responsible for it, and they have the authority to do whatever they need to do to get it done. Now, we'll have meetings. As I talked about, we have lots of meetings, lots of discussions. And sometimes I don't hear about a project for two or three weeks. Now, the reason I don't hear about it is that either they've got other priorities or they've run into a snag - the model doesn't work, the data isn't working, doesn't fit, isn't being integrated. And generally, if I haven't heard about it in three weeks, I'll ask.
If not, usually what will happen is a data scientist will come back and say, you know what? I ran into what I considered an unmovable obstacle, I let it lie for a couple weeks, and when I was out for a walk with my dog, I came to a solution. So that's the way I do it. I give them a body of work, I make them accountable for it, and I let them execute it however it works for them.
I like that approach. That's a good approach. So continuing on, you obviously said that Agile doesn't work for analytics teams. What else doesn't work for analytics teams?
I think micromanagement doesn't work, I think distrust doesn't work, I think trying to be the smartest person in the room doesn't work. I have a group of people that work for me who are vastly smarter than I am. I'm always working to be the least intelligent person in the room. I'm hesitant to say, I'm the dumbest person in the room - that sounds like self-loathing, but that's not true. I really love to have a conclave of very intelligent people going at problems from a wide range of different perspectives.
I like that. Let's talk about artificial intelligence. Do you think that there's too much hype around AI?
Well, yeah, that's an understatement, Jesse. Absolutely. As you know, yeah, everybody’s talking about AI. I was on the phone the other day talking with two people that couldn't be more different from each other. One's a good friend that I've known for 40 years who's in staff augmentation, and he's very good at what he does. And he doesn't even know what AI stands for. And he asked me, in the middle of the conversation, he goes, “what do you think of AI?” And I just started laughing. I said, “do you really want to have that conversation?” And he's like, “I do, 'cause I'm absolutely befuddled. I have no idea what it is.” So we had a good 20 minute conversation and he walked away from it and he goes, “oh, now I really feel much better about it. And I get it.” The next conversation I had was with a guy who was just an absolutely brilliant practitioner in all sorts of techniques around AI. So, in the general population, there is that gamut, people who know what they're doing and doing it every day and doing really cool things, and then people who are running businesses who are very successful executives who have no idea what this stuff is. And there's a lot of confusion. So all we can do as practitioners is, it's kinda like hand to hand combat. Either write a book like you and I do, or go out there and each conversation you have, try to leave that person a little bit smarter about AI.
What definition of AI did you give your non-technical friend?
I told him it was software and models and approaches and algorithms that bring data in and learn from the patterns in that data and then are able to look at new data and predict and prescribe what will happen in the future. He's like, “oh, okay. I got it.”
That's a good succinct definition. I appreciate that. One thing I know you've talked about before is you think it's time to move beyond neural networks for AI. Could you talk more about that and why you think that?
Yeah, well, everything that we've done in the last 15, 16, 20 years has been neural network based. All the big breakthroughs you hear from Yann LeCun and Geoffrey Hinton and others, it's all neural networks all day long. And they've done really great things. I don't mean to diminish what they've done. They're giants in the field, and they have brought us far in our understanding of data and what neural networks can do in the innovations they brought forward. So, kudos to you, gentlemen, but we need explainable AI.
We need that to work against neural networks and all other techniques. We need the work in causal algebra to take us past that. And I think one of the things that we'll see beyond any of these individual algorithmic approaches is ensemble modeling. We've had some real success in doing ensemble modeling, and that's bringing together many different datasets and many different models and stringing them together in a logical progression. So I think those things are what we need to see and what we will see in the near future.
Tell me, why do you think explainable AI is so important in this?
I think it's very important because neural networks are hugely valuable to us, and they're ubiquitous, they’re pretty much everywhere. But in the most regulated fields like healthcare and pharmacy, where I'm at now, and financial services, we can't really use our most powerful techniques. And neural networks by far are our most powerful techniques right now. We can't use them because we can't explain to the regulators that it's fair and that it's logical and it's ethical. So if we wanna use our most powerful techniques in all the different industries where they're appropriate, we need explainable AI.
And I was having this conversation with somebody in a conference that I keynoted at. And I think that we're going to have, as probably part of some new GDPR-like law, a law requiring explainable AI. We have similar sorts of things in the US for financial services. Why were you declined for this credit? You have to explain that AI, that model. So it's going to be required. And I think we're going to have to go down that path. You've mentioned Judea Pearl's causal analytics. You just touched on it. Why do you think that that is going to be such a game changer?
Well, if it works - we don't know if it works. Dr. Pearl is doing some really intriguing work, and there's a couple companies in the UK right now that are trying to productize that algebra into software that can be used either in the cloud or as a SaaS offering or as an on-premise tool or however they package it. They'll come up with however it works. It is the next evolution of where we need to go. And if it works, it will be a tool that data scientists use to actually get to true causality.
Like I said before, we can't do that right now. We do correlations and we do that very well, and some people do think it's magic. They're like, oh my gosh, this really works. And it is, and it's great, but mathematically, we can do more, and we should do more. And if causal algebra is that next step, the people like you and I who understand it will be working until we're... until the days are gone.
I can only hope so. Sometimes people ask me if the days are numbered for data science or data engineering, or what have you. And I say, no, there's always something over the next hill. We may not know exactly what that is yet, but there's something that's going to change, and we're going to need these smart people in rooms working on this. And they're just going to, as much as people may sit there thinking, oh, $30 million of ROI, okay, John must be done, they're probably gonna show him the door next year, nope. There's going to be similar amounts of ROI out there. This never goes away. No, this is going to be a constant optimization.
Yeah, you're absolutely right, Jesse, and I get that question, too. So we must be talking to some of the same people. People ask me, like, when are you gonna be done? When is it gonna be done? And I say, well, it's math and it's data. So it's never done. And the reaction is either elation or depression.
I think the answer is it's done when you retire.
Yeah. It's done for you when you're done. Is it ever done? No.
No. You’re talking about some of the people. Yes. I would talk to people and they'd think, okay, well, I can just do a contract with these data engineers and these data scientists, because I just need them long enough to spit out this model and show them the door. And hey, you're not going to see the highest ROI on that sort of thinking.
Yeah. You're absolutely right. And one of the conversations I have with people is, is this a project for you? And that's what you've described, 'cause they want a number. It's seven. Okay, fine, I'm done, that's a project, it's over, done, done, done. But then I asked them, I said, do you really want your business to constantly improve and get better based on the data and the traction you have and what customers and patients are doing? And they're like, oh yeah, that sounds really exciting. And I said, well, that's not a project, that’s a program. That's something that we'll be doing from now forever. And most of the time an executive will walk away and say, ooh, I just want the project. Just gimme the number. And then they'll come back somewhere between three and six months later and they'll say, you know that program you talked about? Can we talk about that more? That sounds really intriguing.
That brings me to a thought. So let's say I'm a board level person, I'm a C-level person. And it sounds like John Thompson is very happy at his position. How do I find your clone to fill a position at my company?
That's a good question. You know, LinkedIn is where I hang out all the time, as you know. I'm always posting and talking about it. And probably the easiest thing to do is to reach out to me and say, who do you know is like you? I get that all the time. You know, people come to me and say, hey, I've got this role. And I'm like, that's probably not me, but it might be Jesse. Or it could be Bill or maybe Jack or Tim or Judy or Fred or Vasia, you know? And I spin off these opportunities to people and I say, hey, take a look at this. So you can always come to me and ask.
Yeah, yeah. You've realized that it's not a zero sum game. You giving out something, it's not any skin off your back.
The success of Procter & Gamble doesn't hurt my successes, doesn't hurt Bank One or H&M or Cadbury Schweppes. They can all win. We can all win.
I like that. I find that after you write your book, and you've had some time and some distance from it, you can look back. What's changed in your thinking since you wrote your last book?
I have been humbled and surprised and pleased with the global reception that Teams has received. There's been so many people that have come to me and said, “oh my God, I wish I had had this book 30 years ago,” or “I'm just starting out now and this book has become my Bible.” So, the whole concept of talking about analytics teams the way that I have done, it was really something that was just needed for a long time. And I just happened to trip across it, so it's been a real honor to be able to contribute to the community in that way.
But no, the general thesis or even some of the recommendations you get, you would still give those same exact recommendations today?
I would.
Okay. Then you've written a book that stands the test of time. I know that's what I tried to do in my book. I tried to say, these are the things I think that even 10 years from now, we'll still build analytics teams this way. In my case, I think we'll still build data teams this way. And I tried to write a book that stood the test of time.
Yeah. And I think you have. I think your book is really great. And I think it does that. And as you and I talked, as we were writing the books, I think we were writing 'em at the same time, that was the concept. That was the guiding principle. I don't think you and I discussed that, but that was, what we both came away with was that I did write and you did write a timeless book, something that isn't gonna change with the fashion.
Now the new book that I've written, I've intentionally written it as a book at this moment of time. So, in five years from now, no one will probably read that book, because the time will have moved on and maybe I'll write version two or something like that. But as an author, as you know, you have a certain perspective you take when you write the book, and either the book is current and topical and for the moment or it's timeless. So I think our two books fit into that timeless category.
We're in agreement, we're both timeless. And where would you like your statue put up? Right in front of the Chicago Bulls Stadium?
Yeah, no doubt. It's funny, I was having that conversation the other day. Someone asked me about that, where would you want a statue put up? And I'm like, I don't care. You know, I have two children. They're gonna carry on and do good things, and that's my legacy. I don't need a statue anywhere.
Okay. Well, I'll start the petition to get your statue right next to Michael Jordan right there at the Chicago Bulls.
That would be funny and puzzling for most people.
What were some of the significant milestones in the past 10 years in analytics and data?
I think we've done a really good job in understanding the value of data, from being completely ignored and just given away and dismissed to the concept of data as something of value. I think if you talk to 10 business professionals, nine of them would say, yes, data is a valuable piece of our business at this point. We've also moved from business intelligence to advanced analytics and AI.
Probably 10 years ago, Yann LeCun and Geoffrey Hinton and their compatriots were all squirreled away in Canadian labs, trying to do interesting things and having some early success, but it wasn't widespread and well known in the world. And it still is gaining traction and breadth and understanding around the world, so the understanding of data, the value of data, has dramatically changed.
Our capability in analytics has changed significantly. And the business understanding of where analytics needs to fit in the organization is evolving and needs to get better, but I do have many conversations with people that say, oh, I report in to the CFO, or I'm under the office of strategy or I'm under the business innovation function. It's not a given any longer that they'll be in the CIO's organization. There's still lots of people that work in the CIO's organization, and that needs to change, and that will change. And I think that's something that we'll see accelerate in the near future.
What's your reporting structure, then?
I do report to someone who is at the title of the CDIO. The Chief Data and Information Officer. So at CSL, I am in the IT organization.
Okay. Now you've talked a little bit about how you've laid out your team, how it's about a 50ish person deep team that's directly reporting to you. Is that something you'd hold out as a model other people should follow? Or do you think that there's nuances there?
That's insane. Nobody should do that. I'm just an unusual person, that’s all. I really do care and love every one of my data scientists. And I enjoy talking to them. It's nuts from a management perspective. No one should do that. So do not do that. You should probably, if you have 50 people, you should probably have two or three direct managers that manage those teams. But you know, I'm such an unusual person. I'm a lapsed developer, I'm a lapsed modeler, I'm a data integration professional. I enjoy talking to my team, and I hope they enjoy talking to me. So, we have all sorts of wide ranging discussions about integrating data and modeling and all sorts of stuff. So don't do what I do. It's crazy.
Well, I think anybody who's listening to this will agree with you that you're a different person. So there we go. We have proof, we have recorded proof. One of the things I found interesting about you and what you’ve tried to do is how you bring in and try to foster talent with your interns. Could you talk about that program and why you're so passionate about that?
Yeah, thanks, Jesse, for the opportunity to talk about that. It’s something that I've lived. When I was a young analytics professional, a guy, a gentleman gave me a problem and I went at it with gusto and solved it in the span of an evening. I drank a lot of Mountain Dew and ate a lot of pizza and sat there and developed it all night long. And it turned out to be a groundbreaking application for predicting where target markets should be and what target markets you should use. I didn't know it was impossible. I didn't know it was improbable to do, but I was told that this guy really wanted it done. So I just went and did it.
And I think that's the way youth is, and I've seen it over and over again. And I really enjoy bringing people into the field of data science and having them understand all the precepts we've been talking about, the foundational concepts we've been talking about, and giving them an idea that this can be an exciting, fun, enjoyable career. And most people do pick it up and go in that direction.
We just had a young man come to our team. I had a challenge, a problem. I gave it to a data scientist three years ago and they came back and said, this is impossible. I gave it to another data scientist a year later, who came back and said, yes, this is impossible. I didn't tell him I had given it to someone else. And I gave this to the young man who's now on our team. And he solved it in the span of three months with a really elegant solution. So I love working with younger people. They don't know what they don't know, and they're very good at getting it done.
Oh, that's a good story. I really appreciate that. I could see how that would be useful. It's not, you're not saying, oh, I like those interns because they do the crap jobs that I don't want to do and that nobody else wants to do. That's usually what happens with an intern. And there you are actually giving them useful things to do, so kudos to you.
Well, thank you. Yeah, we had an experience while I was at Dell. We were gonna bring in some interns from the University of Texas at Austin. And everybody was saying, “okay, well, what kind of work are we gonna make for them to do?” And I raised my hand and I said, “why do we need to make work for them?” I said, “we have lots of problems. Why don't we just give them a real problem?” And they said, “well, they're students.” I'm like, “yeah, I know they're students. Give 'em real problems. They're supposed to be smart. You know, if they solve it, then we should hire 'em. And if they don't, well, then they'll have a good experience.” And everyone's like, “wow, that's a really unusual way to look at it.” And I said, “yeah, let's try it.”
I find one of the common threads of the people that I've interviewed is a real focus on diversity. I think one part of that is a diversity in age. There’s diversity on all levels. And that's been a key part of what you've wanted to do. Tell me about how you fostered that diversity and how that's useful for you there at CSL Behring.
Yeah. You know, I'm in the United States, and I think you are an American, Jesse, by birth, but we in America are a little focused on racial diversity, which is an American problem, I guess, or American-centric situation, given our history and those kind of things.
But that's not everything in diversity. As you said, gender diversity, age diversity. I really like geographic diversity. You know, Canadians think different than Americans, different than Mexicans, different than people from Britain and Switzerland and Germany and Australia and Japan. I try to bring in as many people from around the world as I possibly can to look at a problem, because those perspectives and the way people have grown up and their experiences really bring a different view to it.
Now, your teams, no matter how many people you have on your team, you won't be able to have every perspective of diversity, every dimension of diversity. It’s just not possible unless every person you have is really a unicorn in their background experience and where they come from. So what we try to do in addition to having good diversity and a wide range of diversity on our team is bringing in subject matter experts who bring in dimensions of diversity as well. So, we work with internal people, we work with consultants, we try to bring in as many diverse perspectives as we can.
Is it fair to say that Canadians think differently because they've been hit in the head during hockey games?
I think that's funny. I'm not gonna answer that, but, yeah, maybe.
I'm gonna take that as a yes. I think the casualties are high there.
I do think that Judea's work would show good causality.
Okay, good. So I don't get hate mail from Canadians, I love you all. You're all great people and so nice, but you say sorry too much. But I still love you all. What's your favorite role on a data team?
Mine.
Yours? Why is that?
I like being the leader. I like having my fingers in all the pies. As I said earlier, I love talking about data and integration and feature engineering and modeling and the whole thing from raw data all the way to optimization. I get to be part of all of it. I love it.
I'm glad. I'm glad you're in the position that you love the most. Sometimes as managers rise up, there's this point where, down several rungs on that ladder, that's where I was happy and I'm less happy up here. So I'm happy for that.
Yeah. I think that that does happen to people. You know, you in your life at some point had a role that you identified with. I'm very happy in my role, but you know, what do I identify with? I grew up as a product manager. So, you know, I think of myself as a product manager. My job is to bring together lots of different resources and data and algorithms and people, and to do the best I can for the corporation. So what do I identify with? I identify as a product manager.
Could you describe a challenge that you've frequently hit when you're building a data team?
Yeah. I think that one of the challenges that we all face is getting the organization to understand the value and the long term value of data and analytics. You talked about it earlier. Many of the people that we work with want an answer today, this minute, this second, this moment, which is fine. I understand the sense of urgency. I get that, but getting people to understand the programmatic nature and the contribution that data and analytics gives to an organization over time is truly a challenge.
You know, I just remembered there was one part that I didn't ask you, and that was, we focused on your analytics team, your data science, but where does data engineering fit in this?
Data engineering's a role in the Center of Excellence. And many of the people that we hire as interns, we bring into data engineering. And they do a lot of the automation work and the data integration, the pipeline building, and then we give them data science work as they're more and more comfortable. It works out really well that way, because as a data engineer, you need to understand the data that you're working with.
So once you understand the data that you're working with, that's a good stepping stone to becoming a data scientist. So a data engineer, in my opinion, is a role on the team that is very valuable and highly prized and well compensated. And it can be a stepping stone into data science if you want it to be.
Yeah, I would agree. Although I would say that one nuance there is, I think that they would go into machine learning engineering more than data science, because I think that if we do machine learning engineers like I describe in my book, they'd be able to sit in both areas and be very comfortable in both areas.
Agreed, absolutely. A hundred percent in agreement there.
Could you share an important lesson that you've learned as a leader?
You know, early on in my career, I probably wasn't as humble as I needed to be. I was probably a little bit more brash and a little bit off-putting to people. So a little bit of humility goes a long way.
Does that come with being a Chicagoan?
I think it comes with age. It didn't come to me naturally. Let's say that.
Okay. What part of your belief system do you bring to work?
I try to be an honest, ethical, consistent person every day.
What do you never compromise on?
Honesty.
Can you give me an honest answer of why you don't compromise on that?
We've seen too many people that are dishonest and duplicitous in their approach. And I find that odious, and I don't agree with it. And I think it's something that is despicable. I think you should bring as much honesty and your ethics to work every day.
I would agree with you. I think those are good words to live by. Sometimes people ask about this. You and I have public personas and we have private personas. To the people who know me privately, I'm pretty much the same way. We don't have two faces. The John that you're hearing on this interview is the John I first met years ago when we were talking. No difference.
Yep. I mean, the only time I change is, I think Benjamin Disraeli said, or maybe it was Milton Keynes, I’m getting confused on the source, is that when the facts change, then I change my mind. That's all there is to it. If the ground truth has changed, if I had a misunderstanding in the past or saw something differently, then sure, I'll change my view. But if it's the same, then I'm not changing.
John, thank you so much for being here. I really appreciate it. You've shared some great insights with us. Thank you again.
Jesse, it's been great. I really enjoyed the conversation. Wide-ranging, fun, and you've allowed me to express my sincere feelings. And for that I'm grateful. Thank you.
Another great story, another perspective shared on data, and the tools, technologies, methodologies, and people that use it every day. I loved it. It was informative, refreshing, and just the right dose of inspiration. Remember to check dreamteam.soda.io for additional resources and more great episodes. We’ll meet you back here soon at the Soda Podcast.
