Zhamak Dehghani is the founder of data mesh and author of the book, Data Mesh: Delivering Data-Driven Value at Scale. Zhamak returns to the Soda Podcast. In part one, Zhamak and Jesse will delve into data mesh and how that is redefining how we manage data. They talk through the reality, the dream, and the execution, and bring us up to speed on what’s been happening since they met in season one. Part two will deep-dive into Zhamak’s book.
Welcome to the Soda Podcast. We're talking about the Data Dream Team with Jesse Anderson. There's a new approach needed to align how the organization, the team, the people are structured and organized around data. New roles, shifted accountability, breaking silos, and forging new channels of collaboration. The lineup of guests is fantastic. We're excited for everyone to listen, learn, and like. Without further ado, here's your host, Jesse Anderson.
Hello and welcome to the Data Dream Team podcast. My guest today is Zhamak Dehghani. She is the director of emerging technologies at ThoughtWorks. She is also the data mesh founder. She's also a person who needs no introduction, but Zhamak, go ahead and introduce yourself a little bit more.
Sure. Hi, Jesse. My name is Zhamak. I work at ThoughtWorks, as you mentioned. I've been there for 10 years, and I'm actually on my sabbatical, long service leave, after 10 years of working with ThoughtWorks. Yes, and I have been in the industry for 22, 23 years. Worked in different areas of technology, started as a programmer. I wish I could say that I'm still a programmer. Probably I don't qualify anymore as I don't push into production anymore, but I work with a lot of organizations around their data architecture and big data solutions these days.
It's interesting you're on sabbatical, because we'll talk more about what this episode is, but a sabbatical is the time to sit back and reflect. And so I'm curious, now that you've had the time to sit back and reflect, what we'll talk about. But before we do that, let's kind of rewind. You were on this show last year. In 2021, we published your episode, and everybody enjoyed it. I loved it. There was critical praise all around, and I've put that episode out to a lot of people when they say, “Hey, this is what data mesh is.” And I say, “Well, no, I interviewed her. We did talk about this.” So, now we're gonna continue that discussion.
Yes, yes, yes. I'm excited. I really enjoyed the conversation last time, and I actually enjoyed listening to the conversation last time, because when you listen to both people talking at the same time, it's a very different experience and I enjoyed it. So I'm looking forward to it. Thank you.
So a few of my takeaways from that conversation were, I loved that you weren't just focused on the technical side. I love that you were talking about the people side, that socio side, that it isn't just the technical side. And I loved the passion that you had for the subject. Now, whether that, just like we were talking about, that passion will wane over 10 years, but I love the passion that you had for getting it out, getting the book out there, getting the idea out there, and having these conversations. But more importantly, I love that you're running from bears. That's really important.
Yes, that was a fun part of the conversation.
One thing I will say, for people who didn't catch that first episode, this is what you should do. Go listen to that episode, go read Zhamak's book, because this episode will actually be slightly different than other episodes in this series, because it's gonna be much more of a discussion rather than an interview. So both Zhamak and I have very strong opinions, and I think a lot of our opinions coincide, and we'll talk about times where we both agree as well as disagree. And it's a good thing Zhamak's on a different continent, in case she gets mad at me and wants to beat me up.
Critical conversations are necessary to move the needle forward. So I'm looking forward to this.
I would agree. In my book, I said, for the case studies, I quoted NOFX, the punk band, which I'm sure you listen to every night, and it said, "I want dissent. I want the scene to represent." That's the quote from the lyrics. And that's what I wanted. I wanted dissent. I wanted people to say, Hey, Jesse, the opposite of what I just said. And that'd be okay, because there's probably something that ranges in there. So we definitely want to have that. We definitely want to have these talks.
And that's the other thing I really appreciate about you. You're not just saying, come at me, bro. You're saying it with, we want to beat up this idea, because we want the idea to be the best possible and have the best possible outcomes. Not that we're trying to, you know, I'm afraid of hurting Zhamak's feelings by saying something.
Exactly. I think for anyone who puts an idea out there, they go beyond just sharing it with their small tribe. You must disassociate yourself from the idea. The idea will have a life of its own, will get adopted, will get used and misused and evolve. And that shouldn't be a personal concern. I mean, early days of data mesh, I went on Twitter and got engaged in Twitter-like conversations. And I honestly didn't have the capacity or didn't know the techniques of how to carry conversation on Twitter, which is really not a constructive way of moving any topic forward.
But you have to. And then I wasn't able to sleep for a whole weekend, because Twitter becomes very personal, with personal attacks, very quickly. So, yes. I agree. We're talking about data mesh, and that's its own thing, and I planted the seeds, but the forest and the trees will be grown by other people and other people's work.
That's kind of what I've been thinking about lately. If you plant a seed and the rest of the forest doesn't grow, then that means that your idea really didn't carry forward. So I'm really excited that as I see more and more people in the data engineering space and the data team space, they're not taking from me. They're growing this ecosystem. And if we don't grow an ecosystem, it will die. So this growth I'm seeing, you and I doing this for years and now seeing this expansion has been great.
Let's talk about some of the things that are happening, which is good, kind of like what we had before, with marketing, with vendor marketing. So if you look around, everybody's data mesh now. And what I think has happened, and what I've seen from people's comments, is that they take the fact that people are marketing this as a reason why it's all hype. And I don't think that's quite true. Marketers gonna market, unfortunately. So let's look at this on Gartner's hype cycle. Where do you think we are on Gartner's hype cycle?
I think we are still early days in that rising up to the top of the hype cycle. In reality, what has happened is that there was a pain and a problem that we hadn't spoken of. Data mesh surfaced that problem and that challenge, and also proposed an alternative, which was inspired by approaches that had solved, I guess, complexity at the heart of software before, now applying those to data. So it came up, it surfaced a pain point, it proposed a solution, it resonated with a lot of organizations, and those organizations had to go to their vendors and say, Well, we want to do data mesh. What can you offer us?
So I feel a lot of vendors kind of weren't expecting this. They had a strategy. Usually a lot of data vendors have a nucleus of a product, so a lot of the open source ones, they have a nucleus of a product. They have one specific problem that they solve, and now they have to fit into a new paradigm in a very short span of time.
And that's what we see. We see a quick, I guess, first level response to the demand of the market built on top of the technology that has been available for the past paradigms. And now we're trying to repurpose, or reconfigure, or sprinkle a little bit of a data mesh fairy dust on the existing products to say, Okay, how can they solve the data mesh problem with not much of a pivot, right, from the product strategy.
So, I think data mesh is not a fad. I hope it's not a fad. I think it is a real trend. And we are trying to figure out how to be part of this trend, whether we are consultants or vendors or clients adopting it, customers adopting it.
What do you think is going to happen during the trough of disillusionment for data mesh?
The way I see it is a bit of a hockey stick, as in, there is a lot of enthusiasm around, I want to get to this destination as an organization. As a data oriented, data-first digital organization, the operating model of data mesh, this paradigm, resonates with me. I want to be there. And I have to get there with the tools that I have right now in my hands. I have to work with the vendors that are available to me today. And we're gonna go down this, you know, the low part of this hockey stick in terms of optimized solutions or solutions that really increase the productivity of the users of this paradigm, right? The data providers, data consumers.
Until we get to the point that we realize, until we get to the evolution of, I suppose, technology and solution, that the solutions that come to the market and become available to us are built natively for data mesh. These are native solutions that weren't just superficially changed to fit into the data mesh. Until we get to those native solutions, I feel like we're gonna come down this hockey stick, realize that this is hard, this is not an overnight buy or build solution, this is a transformation. And now we have the right tools to really travel up that hockey stick and get the productivity and effectiveness that we need.
And I think we've seen this before. I was very active building, kind of large scale, microservices distributed system solutions, back in the early 2010s, early part of that decade. And it was the same. The promise of microservices coming from more advanced, digitally advanced organizations really captured the minds and hearts of big organizations, legacy organizations, all kinds of organizations, but the tools that we had weren't adequate, right?
We were building microservices on application servers. We didn't have containerization, we didn't have the Kubernetes of the world. And the solutions that we were building were duct taping technologies that were available to us at the moment, at that point in time. And they were high cost, high friction, long lead time to really getting to that vision of microservices. And yes, half a decade passed. And then now we have the tools that are native to that paradigm, right? It makes it so easy for developers to spin up services that deliver value directly to their customers with not a lot of duct-taping. So I think that would happen to data mesh if the data mesh trend survives the hype cycle.
I think I agree with you on where we're gonna hit that hype cycle issue. And I think it's going to be around everybody thinking that the positioning that people have said, My product is data mesh, and they say, oh, okay, if I just put that in, then I get this data mesh for free. And we're definitely not at that buy, as you were just mentioning. So there's that trough of disillusionment of, I bought X, technology X. That should have got me this hockey stick, and where is my hockey stick? Where is this massive increase in data value that we're getting? But I also think it's always easy to buy, it's always easy to put a technology in place, but the people side, the socio side of data mesh, I think that's really going to be the biggest trough of disillusionment is, Oh, I have to make organizational changes to make this work? Oh, well, I'm sorry. Let's move on to the next silver bullet.
I completely agree with you. And I think there is a tight coupling or dependency between the tools and people as well. Because the people side of data mesh, I mean, the whole aspiration behind data mesh was removing these silos, the walls of data and non-data, the walls of software and data. If we are truly data-driven, data-oriented, embedding data, analytics, intelligent decision making, machine learning into every function of the business, into every team, every application is a must, right? Otherwise, we're only gonna talk about it, right? Because we are excluding a large portion of our organizations from being able to use data. That solution's not gonna scale.
So to remove those silos and walls, we've got to empower application developers, empower business people, these multidisciplinary cross-functional teams that build the technology that supports the business and empowers the business. We've got to enable them to become first class data users or data providers.
And yes, there is a big organizational, educational, cultural aspect, reward aspect. All of those are organizational design aspects that need to be touched and changed. But also, the tools and technologies need to feel native and well integrated into how those teams work. And I feel that's the gap that we still haven't filled, even with the vendors that are claiming, you know, data mesh solutions or data product solutions. They're still selling to those old parts of the organizations. They're still selling to the data teams that are separated from other application teams.
So I think you are right that the organizational part requires a transformation, the change of hearts and minds, but also the technology needs to bridge the gap for people to shift and change.
You touched on the technology needing to change. When I started talking about Hadoop and early on with Hadoop, and then the change to Spark, and then the change to these other things, I talked about them in generations. I talked about Hadoop being a first generation. It was good for what it was at that point in time. The problem was Hadoop was a first generation. And then as Spark came along, we had our second generation. And now what is our third generation? And what we get is in each one of those generational changes, we get something that is even more fine grained, even better at attacking certain use cases, still a general purpose sort of thing, but they've made this easier and they've made that easier. And they've been able to do that because there's maturity in the technology, in our use cases, how we're doing what we're doing.
But there's also, they started this other thing from scratch, and they took the ideas from that one, made them even better, streamlined it. And I think we're going to see a very similar thing with data mesh. We are going to see, maybe what we're talking about right now is a first generation. It was, we're co-opting, we're taking some technology, we're co-opting and we're saying it's data mesh. Well, second generation, it's going to be somebody maybe doing a startup right now. They're going to have that first one that really aims at a data mesh.
Yes, I'm really glad that you used the word use cases. I think for people that are in a position of evaluating or buying technology, you've got to really think, what was the motivation, incentive, and the use cases the technology was trying to solve before deciding that it's gonna fit into the new use cases. Because in the past paradigm, we had made an assumption that the value stream of getting data to value, data to insights, data to machine learning models is this pipeline model, right? Pipeline of moving data through the pipelines, and then the stage of the pipeline of putting in, modeling it into the lake or warehouse, and then the next stage of the pipeline, overlaying governance and metadata on top of it. And then the next stage of the pipeline, designing access controls, and then finally feeding that into insights and reports and machine learning. And then that is the value.
So this long value stream that we are still playing with is what formulated and constructed the use cases that the past technology has been trying to solve. Data mesh reconfigures this pipeline to really go from data to that machine learning model in a much shorter, closer, tighter collaborative style of architecture. Hence, it creates a new set of use cases, and hence requires kind of reimagining the technology, or reusing or reconfiguring existing technology, for a completely new set of use cases that we hadn't imagined before. And I think that's the interesting white space for innovation to come.
So you talked about a possibility of data mesh falling off the hype cycle because of some problem. So let's imagine you and I are talking about this 10 years from now, and we're saying, What killed data mesh? What killed data mesh 10 years ago?
Yeah. We went too fast and burned ourselves really bad. And then we blamed the paradigm, right? I feel that if people are looking for short, quick solutions today, and we don't go through this thoughtful and perhaps a longer term process of evolving and making the shift, the paradigm shift, and pick up a solution off the shelf, retrofit it, jam a technology into the organization, and then say, well, that didn't work. I bought a data mesh solution - it didn't solve my problem, so data mesh is the problem, right?
And we blame the paradigm and we move on to the next buzzword, to the next buzzword. Buzzwords in the data space are very short-lived, right? Data hub and data vaults and data lake and now lakehouse. And so it becomes just another buzzword. And I really hope that in this journey, we come across people that are thoughtful around the transformation, thoughtful around the technology to really, really bring what I depicted in the prologue to life. And don't look for a quick solution to make it happen overnight.
I think that I kind of equate a lot of things to Hadoop. And what you just talked about reminds me a lot of Hadoop. It was, Well, I brought in a technology, we failed at it - Hadoop's fault. Then we move on to cloud, we fail at cloud - cloud's fault. And I talked about it in my book. It's far more difficult to look inward and say you know what the pattern is? If you have a Venn diagram, it's you right there in the middle. And there it is. It's far easier to point outward, but it's you right there in the middle each time. So make sure that you're looking inward and actually having that honest thing.
And I agree with you that I think that's what would kill data mesh. It would say, oh, this data mesh didn't work day one, didn't work day 20. Or perhaps a worse case would be a whole cottage industry forming up around this, of people who really don't know what the hell they're talking about and pointing people in the exact opposite directions. And that's always a worry, as I see in this - Step one: Glom onto buzzword, Step two: Question mark, question mark, question mark, and Step three: profit. And maybe that's what we say in 10 years. And we have to wonder how much we made in our profit.
Yes. And I am really worried, because the data technology space is a very hot, hot technology space. The amount of investment going into all kinds of data startups or data vendors, it's just mind-boggling. I see a lot of demos of different technology solutions these days. Some of the vendors have the, I guess, interest to see what I think, and also explore how they can fit into a data mesh ecosystem.
And some of those technologies scare the hell out of me, because I think, what have I unleashed on the world? Because I see, if this technology were applied to a problem at scale, in a scaled organization, how catastrophic the result of that could be. You know, I see technologies that somewhat fit into the class of no-code, low-code, drag-and-drop wizards, and just sprinkling a bit of SQL queries on top of databases. And, you know, virtualization.
That category of technology, I just can't see how that enables data mesh, if that becomes the layer of your technology that controls the experience of developers and users. Data mesh at heart is trying to bring software engineering practices, engineering practices that have proved over and over to be fundamental to building responsible, sustainable, and scalable solutions. And some of these solutions, I just can't see them to be that layer that creates the scalability and resiliency.
Maybe they can be a piece of the puzzle deeper in the layer, or maybe end of the mile. Not the full integration fabric. As an integration fabric, they just break some very fundamental kind of architectural practices that lead to resilient, scalable solutions. So yes, I think we have a bifurcation: kind of irresponsible technologies that are fast and quick but not sustainable, and responsible technologies that are maybe slower to build and integrate today but give a more sustainable, scaled solution.
So you mentioned before that you've been reading a lot of people's writing about data mesh. I know I have too, both people's posts and comments. And so what do you think people are getting wrong about data mesh consistently?
Yeah. I think there are different, probably, classes if I want to categorize them. There is a class of writing more critical of data mesh. I think a recent title that I saw was "Data Mesh Is The Fool's Gold," or something along those lines. And I see that the writers', the authors' understanding of the data mesh is the marketing or the advertisements that they're getting on their LinkedIn feed, right? Or their Twitter feed.
So, and very rightly so, when I read their writing, I go, yes, you are right. If somebody thought that I can buy a data mesh technology and it solves all my data mesh problems, I have data mesh overnight, this is a terrible idea. And if the beginning is not your business strategy, it's your data solutioning, yes, if you assume that data mesh is that, data mesh seems like a terrible idea.
But the fact is that the authors haven't done the research, they haven't probably read part four or five of the book. So it's a criticism that comes from lack of depth of understanding. Maybe the socio part of it is missing. And they think that, oh, it's a technical solution that I've been sold, yet another technology solution that I've been sold. And I empathize with their point of view, but their point of view has blind spots, and I encourage them to do a bit deeper research.
The other part, I think, when I go a little bit deeper and scratch the surface on some of the technical writings, the understanding of what data product is also seems to be fairly limited. There is a set of writings that is just an extension of what we have done in the data space. You know, data ETLs, data pipelines, sprinkle governance metadata over it. And now they're just being slightly extended and expanded to be presented as a data mesh solution. And I think those are also somewhat misleading, because it's just an extension of what already exists.
Data products are presented as SQL queries that you run and materialized views with additional metadata, let's say. Which is not what I had imagined for a data product to be. So I think that's another category, the category of extending existing technical solutions, architectures we have with a few, you know, with a few bits and pieces to fit into data mesh. Like you have an event streaming, for example, architecture in place, and now you're calling it data mesh, which is, again, incorrect. Data mesh is not one mode of access. In fact, if you can't run distributed machine learning model training across your nodes on the mesh, it is not data mesh, right? So event streaming doesn't fit into that model as it is defined right now. So I think these are the categories of different misleading, I suppose, or constrained, limited writeup that I see around data mesh.
You touched on the classes I've seen. There is one other class that you're going to see more and more, unfortunately, and that's the SEO class of writing. It is, Gotta get me some boost on Google, need to write that data mesh post so that I can be at the top of it. So, it's kind of hypey, and what I always find interesting in those posts, I don't know if you've ever read them, is, you can clearly see the person who wrote it has no understanding, not just of data mesh but of technology either. And I find that so, so interesting that somebody would put the effort into doing that, just to get up on Google. I assume it works, maybe it doesn't, but the sheer low information density of it is pretty crazy.
Yes, absolutely. And again, this is a moment in time. For anybody else, I guess, lessons learned, that dares to put their ideas out to the public and that idea goes viral, hopefully for good reason, be prepared for the imposters and people that are taking advantage. And as you said, they're just optimizing for Google search load. And it is painful. It is really painful.
I have stopped reading articles or looking at them, because it's just, it drags you back in, right? That's what's dragging me back in during my sabbatical. I was hoping to just play tennis and go run away from the bears in the bush, and, you know, travel. And that's been impossible, because you feel, Okay, I'm not done here. Like, there is an opportunity for a very great future for innovation, and that opportunity is being missed slowly, like the sand running through your fingers, by ad-fueled publications. And yeah.
In my time, as a software engineer, my interaction with data was, oh, it's in the database. You just throw it in the database and you pull it out. That's all I really care about. And as I've interacted with kids these days, shaking my fist at them, there's even less understanding of even databases. It's frankly been surprising.
So I still think that there's a whole other problem to this that we'll hit: Oh, you went to school and you don't even know how to use a database. Here we are teaching you how to do this, this, and this, let's do distributed, and they're still back at step zero of, What's a database? How do I interact with this? We have some industry-level problems to deal with. It also reminds me of a client that we had. Man, it was like pulling teeth to get them to do those metrics that you're talking about, where we were saying, Hey, if you expose these metrics, if you do this right, not only is this your job, you're at a startup, your stock is going to be worth more. Please do this. Still couldn't get 'em to do it.
So, yes, so software engineers today, you are right. They don't care. They can't cross that chasm to become data product engineers and providers, but the moment that we really push that need and demand into the software – and that has to come both from the top and the bottom – unless we make that change, data mesh is just a forced responsibility that no data software engineer would care about.
You know, software engineers are fantastic problem solvers. You've got to throw the right problem at them. And what data mesh is trying to do, with domain-oriented data teams and domain-oriented, cross-functional teams, is throw the right problem.
The problem is not, I want to get metrics out of this database or I wanna run an ETL against your database so that I can do analytics somewhere else. That problem does not excite anybody, because it is one step in a series of steps, and the value materializes only at the end of that pipeline. So I think the problem we need to throw at those teams to really embed data-driven kind of decision-making and applications is that I want your application to be much better or uplift the number of listeners that are onboarding, continuing with that example, I want this much more traction using the data and applying data-driven solutions.
And I think if we articulate that as the impact that the data can have on their business domain, maybe we have better luck. And also giving them the tools. They don't have to scratch their head and say what tool? They don't have to duct tape a technology that doesn't really fit that model because they're not really familiar with the technology. So give them the right tools and then throw the right problem.
Well, now let's talk about the inverse of that. So some data engineers, top end, very, very good software engineers, very good data engineers. And that's my definition, that's a software engineer who specialized their skills. So now we have this issue of, we have these top end super smart people that now they're gonna jump in this with both feet, and what happens when the buzz dies down? And I also wonder what happens if they get bored or complacent, where they realize, I only do the really fun, challenging part of the job 10% of the time, and then 90% of the time I'm dealing with the boring, mundane, whatever part of the job. What happens then?
Yeah. So I think those great data specialists, data people, would get a lot of positive recognition and feedback when they see their hard work using, providing, managing the life cycle of these data products pay off by having a business impact. I can't imagine anything more rewarding than actualization of your services, your efforts in making an impact on a customer, a real user, the business, right? So I think when you think about the maturity of engineering, again, that rapid feedback is very core to how we get excited about the work we do.
So if you're a data engineer and you're focusing on the infrastructure, your end users are those domain teams. So the rapid feedback is that they're actually using the infrastructure that you provided, and it gave them, I don't know, a 10x lift. They don't have to do all of this, you know, infrastructure provisioning anymore, and you're providing the tool that they actually need to do their job, you know, optimizing onboarding of the listeners.
That positive feedback of seeing the end user using your products, like you and I, when we talk about the end user reading your book and reflecting on it, right? That's a very positive and satisfying feedback. And hopefully that actually is quite a challenging job. It's no longer, you know, clean this data that you don't really know where it comes from, and then shape it in a form that you don't know how it's gonna be used, with that data engineer being stuck as a middleman. I mean, I talk to a lot of data engineers and I've had teams of data engineers at clients. That job is not very satisfying, because you're just a cog in a very long stream of cogs to see the value, to materialize the value.
So I hope that whether you are taking a data infrastructure role, or you're taking a data product/developer/user/engineer role within the domains, that feedback from data to value is much shorter and much more direct for you. And hopefully that brings a lot of satisfaction.
I would agree with you. So I think software engineers or data engineers get into this to see their stuff in use. And so the further you are from seeing that, where I create this ETL that takes data from this database and puts it here, not as interesting as seeing, oh, my direct work went into this lift.
And I think this is an important part, I talk about it in Data Teams: When kudos are given, the kudos should go around to the people who did that. So it shouldn't just go to the analyst. It shouldn't just go to the data scientist. That data scientist, that data analyst was maybe facilitated in doing that by the data product owner or by the data engineer or by - let's give kudos to the people in the background whose stuff isn't in front of the face of the CEO. I think these are key things that we need to start doing now.
Yeah. And to give you a real-world example, data mesh is a peer-to-peer kind of data as a product value exchange, right? Data as a product is a valuable thing that you peer-to-peer share. So we had a team that, you know, COVID hit, they were building a data platform for one of our clients, and they were building data products, 300 plus data products. COVID hit, and then, you know, in a span of a few days or weeks, maybe a couple of weeks, we had to bring up a new data product. The source of it was the chatbots on the website that were interacting with patients and providers around COVID, around the symptoms. They were doing kind of analysis of the text and turning it into kind of insights around the conversations.
So the transformation and the computation of the data product included voice-to-text, NLP, and then surfacing that as trends and analysis around the COVID symptoms in the COVID population, and so on. There were a bunch of COVID-oriented data products. So there was a massive amount of data engineering in building those data products, there were some pipelines. But the team as a whole was responsible for providing COVID-related data as a product directly back to the providers, back to the payers. And you can see that's not just, oh, I'm just gonna get the chatbot data and dump it to a file, and hopefully somebody else will use it. I get meaning out of it.
It wasn't that. It was providing that COVID data as a product that is directly used to have an impact. And the team worked, you know, day and night. Of course it was, you know, overnight, we had a fair bit of infrastructure in place and some of it not in place. And I think that was super satisfying.
Well, let's go deeper into that. In this case, they still had stuff, but they were able to get value from data mesh relatively quickly. Let's say somebody is at the very beginning of that journey. How long does this take?
Yeah, I think it's an incremental and evolutionary journey. You get value at a smaller scale, and then you get value at a larger scale. But I think for the very first data products that we were creating, you know, we had six months of investment of a medium-size team, you know, 10, 20 people working on the platform. By the time this use case materialized, being able to create such a sophisticated data product, a set of data products, in a matter of days, well, they had probably already had that infrastructure team in place, and a larger one, for over a year, if I get the timing right.
So yes, so it's not weeks, it's really months and years to get the platform and infrastructure in place to materialize and exploit, get value from that investment at scale. And I really hope that the future technology shortens that lead time and investment.
And I hope so too, and I think if we've seen everything from that generational example I gave, yes, we'll get there. One of the things I see, though, and we talked about complexity before, is that in my experience really few companies hit the point where friction was their main problem. For most, the main problem was more foundational.
And I think that's the real key that data mesh fixes, is that friction, but if we have so many companies stuck in that implementation, will we ever get to the point where we see this usage of data mesh be all over the place?
Yes, I hope so. I hope that the accessibility of data mesh, you know, increases, and more companies can, you know, have access to solutions for data mesh. Again, for me, it's a bit of deja vu with microservices. It is a similar problem that we're trying to solve. Technology in the data space, unfortunately, has taken a slightly different trajectory than the technology for microservices had. So I think there's a slightly bigger discord between the data technologies and data mesh than perhaps there was between microservices and the technology back then.
But I'm hopeful that data mesh becomes more and more accessible for organizations. It's still a fire burning inside me, and I'm hopeful. And I think with the kind of massive uptake of data mesh with organizations, I'm at the epicenter of this, I see a lot of interest in scale-ups in larger organizations for adoption of data mesh. So I hope that we will have more and more success stories and lessons learned to share for the next wave of adopters.
Now, related to that, you talked about the company that was doing this, the data mesh with healthcare data, how many companies have actually completed their data mesh rollout?
I don't think that a completed rollout actually makes sense, because as long as your business is in the business of getting value from data, every day you find new use cases, new teams. So if the question is, have I, I guess, saturated the number of domains within the organization that have the potential of using data and sharing data? I don't think I can name a single company. A lot of our clients, or the clients that I have worked with, are really large organizations, so their businesses are so multifaceted, and often they start with one part of the business.
I can publicly talk about Roche, for example, because they, you know, they publicly talk about their data mesh and their work with us. Again, Roche is such a massive organization, with so many different functions, but we are rolling out data mesh in, let's say, Roche diagnostics, Roche manufacturing sites. So there are still so many different areas of the business for this to expand to.
Yes, so, I can't really call out a company that has fully saturated all of the domains, and I think we’ll be at it for another few years for that to ever happen. And the moment that we think that we've saturated, I think a shift will happen, and the shift will be either in the use cases - the business discovers new data-driven use cases that they hadn't imagined before - or the business expands its function to new points of collection of data, and then the S curve recurs. So if you think about the S curve of adoption within an organization, exploring, expanding, exploiting, I think the moment that we feel like, okay, I've rolled this out and I've hit exploit, a shift will happen, whether it's an environmental shift, hopefully not another pandemic, but an environmental shift that triggers the explosion of new use cases. Or maybe it's an infrastructure shift that triggers restructuring and innovating in the infrastructure space, or it’s a shift in the expansion of the business that leads into yet another S curve.
So let's say I work at a startup, I'm sitting here listening to this podcast. Does a startup need data mesh?
If you think that you need data mesh, I think you would be spending very precious resources on meta work, on building infrastructure. So probably my answer is no. My answer would be, as a startup, you have to be hyper-focused on your market fit and the actual end customer problem that you are solving with the minimal and simplest technology solution. So market fit is your goal.
So I think data mesh at this point in time will take a lot of resources away from that outcome and put the focus on building infrastructure and doing meta work, which is very exciting. As developers, as technologists, we love to wire things together and operate infrastructure. Those are fun problems to solve. So I would say no.
Having said that, if your startup is about integrating data from a lot of sources, so let's say you are a very niche kind of startup in healthcare and your offering is surfacing and providing data products, diverse set of data products integrated from an ecosystem of, I don't know, pharmacy and healthcare providers and so on, for that particular startup, maybe data mesh is the right approach, because that's just the model of your business, right? That's a business problem that you're trying to solve, is, creating a data mesh is the business problem you're trying to solve. But I would imagine that's just a very tiny fraction of the startups out there.
I think that's worth pointing out in this podcast. So I think this is really a key insight that I think people really need to understand before they embark on this.
Yes. And I think if you are a scale-up, so if you think about the S curve of scale for an organization, you have solved your market fit problem, you have a solution that works for your market, you are very much focused on data, let's say you are the Spotify, Etsy, you know, those kinds of companies.
And now you've hit that scale and acceleration, and you are feeling the pain points of the architecture bottlenecks, like your warehouse or lake or your data team is, you know, under a lot of pressure and they cannot deliver to your data-driven solutions, I think if you hit that bottleneck and you are growing and you're considered a scale-up, it's worthwhile looking at data mesh to resolve some of those bottlenecks.
You've said that you've done this with some big companies. How do you get executive support for data mesh?
It's really interesting. In fact, based on my experience, at many of the companies, the execs are behind it. I mean, unlike perhaps microservices, which were more of a grassroots start from the engineers kind of sneaking it in the back door, with data mesh, a lot of the people that I talk to are the executives, because they have visibility into the strategy of the company long term and how the company wants to become data driven.
They have the view of the expenses and the cost that it takes them to get there. So they are in a perfect position to be able to influence the strategy around the data. And they come, you know, with the question, whether data mesh can be the right thing for them. Of course, usually the architects and technologists are also behind supporting that.
But if there was, I'm trying to think if there was a case where we had to convince... I think early days where data mesh wasn't as known, because we started doing this before data mesh had a name, I called it “beyond the lake,” I didn't even have a name for it. So it wasn't so well known within the organization. The conversation with the executives was the conversation I've had with the public, which is pointing out the inherent limitations of the approach they've taken and really building on the pain and experience that they had already had.
I think that's a good place to influence, to get the executives to reflect on what they have built and the mismatch and discord between what they built, what they spent, and the results they get, and then offer an alternative. And I think that those conversations in the early days of data mesh, when data mesh didn't even have a name, were very helpful.
When did you come up with the name and who came up with the name, maybe?
Yeah, I came up with the name, when was it? I think before writing the article. So I gave a few talks. We were kind of implementing it or talking about it. I was trying to, internally, talk to our clients. Didn't have a name for it. I gave talks and called it “beyond the lake,” but I think in 2019, before I wrote the article for Martin Fowler's site, I think I wrote the article in May 2019, if I remember correctly, I had to give it a name.
And it's a bloody hard thing to do, and I've been accused of, oh, data mesh existed before and somebody had used it, but it wasn't widely known. Data fabric was already taken, so I had to come up with a name, and it wasn't a very creative process either. Back then I was really into microservices, and service mesh was a thing that I really liked as an innovation in that space. So I just called it data mesh.
So probably one of the most validating things for somebody is to have your idea not just run with, but also co-opted. What's it been like seeing vendors go on a mad dash and say, “We're data mesh now”? Is that validating? Is that scary? What's happening there?
Mixed emotions, really mixed emotions. In the moment, of course, you get a little buzz of being acknowledged and being recognized. And I think vendors have been quite respectful of the source of the idea and mentioning that. But then, you get scared that the consequence of something that you unleashed could be negative, and that becomes a motivation to move again and do something else to prevent negative consequences, right?
It becomes a source of energy to move. So mixed, mixed emotions, aspirations, inspirations, scared, sad, excited, feeling proud a little bit for a moment, for a second. All of those emotions. But I think at the end of the day, they all become a source of energy to keep moving.
We talked about data mesh, the worries about data mesh in 10 years, but we didn't talk about where would you absolutely love to see data mesh in 10 years?
I really hope that it can bring the picture that I paint in the prologue to life. I really hope that we can, you know, step into every organization and see that data experimental culture is really embedded everywhere, and I really hope that nobody talks about data mesh anymore. I hope data mesh becomes irrelevant, as in, it becomes, you know, the outcome of that has come to fruition and we have embedded data and intelligence in every team and every business, and in a peer-to-peer fashion with open protocols. We're sharing data, we're sharing computational data. And we don't talk about data mesh anymore, because it's done its job and it's now hidden in infrastructure, and it doesn't matter anymore.
It fades into the background.
It fades in the background.
Okay. And what about you personally? What is your personal relationship optimally in 10 years?
I hope that I have made more contributions to that reality, beyond what I've done so far. I hope that within those 10 years, I've gone to the trenches and contributed to solving some of the hard technology problems, product platform problems, and then I can rest peacefully knowing that what I unleashed into the world hasn't caused damage, or that there are enough tools to prevent the damage that can come about through its misuse.
I was hoping for something like benevolent dictator for life.
Oh, God. No, no, no.
No benevolent dictator?
You know what, I hope that my next sabbatical is less stressful. I can go and play tennis and run in the woods and swim and not think about, I don't know, technology, in my next sabbatical. And I actually can have a Sabbath, and have a rest.
That's the metric, we'll ask you that in 10 years. How's your sabbatical?
Yes, exactly. I'm just running and I'm swimming and playing tennis, and that's all I do.
Just like Rome wasn’t built in a day, data mesh can’t be explored in a day. How about we stop here, and come back next week?
Same bat time, same bat channel.
Another great story, another perspective shared on data, and the tools, technologies, methodologies, and people that use it every day. I loved it. It was informative, refreshing, and just the right dose of inspiration. Remember to check dreamteam.soda.io for additional resources and more great episodes. We’ll meet you back here soon at the Soda Podcast.