S4 - Episode 1
Data Masters Podcast
Released September 23, 2024
Runtime: 27m00s

How AI Enhances Human Collaboration and Drives Innovation with Nishit Sahay of Marvell Technology

Nishit Sahay
Chief Information Officer of Marvell Technology

The Data Masters podcast explores the intricacies of modern data management, featuring insights from industry leaders who share their innovative strategies for driving business success through data. In this episode, Nishit Sahay from Marvell Technology discusses the company’s diverse semiconductor solutions, the evolution of their data strategy, and how they’re leveraging AI to enhance decision-making and streamline operations. The conversation dives into the challenges of managing complex supply chains, ensuring data quality, and scaling AI-driven initiatives in a fast-paced, global tech environment.


"You're tuned in to the Data Masters podcast. In each episode, we dissect the complexities of data management and discuss the data strategies that fuel innovation, growth, and efficiency. We speak with industry leaders who share how their modern approaches to data management help their organizations succeed. Let's dive straight into today's episode with Anthony Deighton.

So, Nishit Sahay, welcome to Data Masters. I really appreciate you making the time.

I thought we might start a little bit with Marvell Technology. I'm not sure everybody is super familiar with Marvell and may not have complete context and understanding of the business. So maybe just start by sharing a little bit about the company, and sort of introduce people to Marvell Technology.

Sure. Absolutely. And it's perfect timing because, you know, a couple of years ago when I would say "semiconductor," people would give me a blank stare, saying, "What is semiconductor?" And I used to have a joke that most people think we just make bad conductors—it's halfway. So, what we do is, we are one of the top semiconductor companies in the world. In fact, if you look at anything AI lately, some way or the other it is powered by our technology. But beyond that, we actually make chips or semiconductor devices for carriers. You know, a lot of 5G networks have our components in them.

Cars, you know, we are big in the automotive world, but also in what you call the enterprise market. So, you know, as a CIO myself, I buy a lot of stuff—switches and network connectors and whatnot—and a lot of that has Marvell components in it. But the place where we are really differentiating lately is the data center. So we have the data center business, and within that, we are a key player in the AI-accelerated computing space. So that's what Marvell does. On a day-to-day basis, you might not see our brand when you use things, but be assured most of what you use has, somewhere or the other, Marvell technology powering it.

We have over ten thousand IPs, so it's a very IP-centric, very engineering-focused company. In the semiconductor space, one of the most respected companies that's out there.

So again, I think it's fair to say you make a very wide variety of products, and as a result, almost certainly have a fairly wide variety of customers. And, you know, in that sense, it's not as though listeners to the podcast would go out and buy Marvell products, but they almost certainly buy products that have your parts inside them, making those products better, more advanced versions of what they could otherwise be. But how do you think about customer relationships in that context? In that sense, how you think about the customer could be a bit different. I imagine that you have a lot of customers, and there's quite a lot of variety in those customers and in the relationships you have with them.

Yeah. So, when it comes to our customers, it's a very close-knit relationship that you'd see with a company like Marvell. I mean, there are different types of products that we build, as you rightly said. In some cases, we actually do custom products for the customer. That is, you know, we're collaborating with them to define how the product should be, and then we are building it all the way through and, again, in deep collaboration with the customer. So it's a very, very high-touch relationship in those areas. And in some areas, we have mass market. So we do it through distributors and all that because Marvell has great chips and everybody wants them. But if you really look at it, most of our customers are big customers, and you're talking about heavy infrastructure investment that they're doing, whether it's in all the markets that I talked about.

So, when it comes to the relationship, our customers tend to be very knowledgeable about the product because it's a B2B relationship. And that's what I see in the technology world—Marvell is a very big brand name. But you see in a regular, commercial world, people might not be as aware unless you're in the Bay Area.

Sure. And I love that sort of idea that there's this really intimate relationship. You have on your side smart engineers that are really thinking about the underlying chipset, and they're working in close collaboration with that customer to think about how that integrates into their product.

So let's tie this back to your data strategy. Maybe start with just a little bit of high-level context to tell us a bit about the data strategy and how it's changed and evolved over time? Because I know it's very different today than it was a year ago or five years ago.

Yeah. No. It's a great question. So the problem statements that we are dealing with are somewhat determined by the industry we are in, also determined by where we are geopolitically because, you know, the technology industry is very widespread. And then finally, as you mentioned, the layering of customers and the products that we build in. Right? So it's a very complex supply chain that we have to deal with too.

So, I would say five, six years ago when we started to mature in our data strategy, it was very metrics-driven. You know? I would say very finance-driven that, you know, we need to get our numbers right, and we need to have the right data feeding into our ecosystem so that we can get that more efficiently in front of business.

Now, if you look at the history of the company, we scaled up pretty fast. We did some very strategic large-size acquisitions. We brought in a lot of different flavors of technologies into our portfolio to make it more complete, and then we expanded fairly decently. So in those cases, a lot of the data we needed was decision support data, right? We need financial data. We need data around customers. We need data around our products.

So we started with a pure BI-type strategy: you have a request on the other end, we'll figure out where the data is, and then we'll pipe it through. And, of course, we used that opportunity to also clean the sources. You know? Because, again, if you look at the journey we have been through, Marvell has done a lot of work from pretty much every angle within the company. I mean internally, not externally, where we're already doing great stuff. But when you talk about technology foundations, we have been doing quite a bit of work on cleaning things up to scale. Right? So we did a lot of work on the source side as well.

And those things are good incremental improvements that we made. And, you know, like I said, the decision support systems that we built were pretty useful when we did the scaling up through acquisitions. Now, a couple of years ago, we started to drive what people call data-driven transformations. So now we're looking at larger datasets across the company that we had to bring together to make key decisions. So Marvell is a very data-driven company, always has been. And, you know, in most cases, there's some pull request from the other side saying, "Okay, I have to make a decision on a regular basis," and the Marvell ecosystem expects you to have the right data in front of people. Now, that kind of slows me down in my decision-making because I can't present that data to the stakeholders on time. So we used to go chase that kind of data, chase the kind of decision-driving factors that people have, and then go all the way backwards.

But then we started to look at the whole ecosystem we have and how, you know, individually, all of them are working versus how they're working together—something like integrated business planning. So we started to stage data not just based on requirement but based on the business process, the end-to-end business process. And there, you build the ecosystem where you can ask any question. And it's not like a six-month project every time you have a requirement to get the data through. So that is, I would say, stage two.

Stage three, which we are in, is a little bit more complicated. Now we are getting outside the realm of your structured data, the things that are normally clean in a company if you do your upstream processes well. Right? Now we're getting into a wide variety of data because we want to bring engineering knowledge into our systems. So with that, we're looking into data governance, more advanced data pipelines, all the latest and greatest that you can think about—data mesh or data lakehouses. So now this is a journey that we are on, and a lot of that is going to feed your basic decision-making processes, but a lot of it is actually going to feed into the AI pipeline that we're building very aggressively in the company.

Yeah. So not to go too far backward in that story, but where you started, I think, is probably where a lot of companies are today, and they're thinking about what are the key metrics that matter and then building this—I think you called it a request-based idea that people make the request. I call it sort of hunt and peck. You know, people make a request. Somebody has to form a dataset to respond to that. It gets placed in front of them, hopefully reasonably quickly, but probably measured in days and weeks, in a dashboard. And I think that's a very common place where listeners are today.

I do love this idea that you focused on remediating sources and getting sources into better shape. How did you—how did that go? My general experience has been that trying to get source data to be perfect can feel like a futile exercise. As soon as you get it a little bit better, then, you know, somebody makes a change or a new source gets added. How did you guys think about that?

No. That's an excellent point. In fact, one of our learnings was sometimes it's okay to leave the source the way it is because the return on cleaning it up is probably not very good. Now, in terms of maturity, the initial set of source cleaning that we were doing was also essential for the company. Like, you have to have your ERP processes done well, your supply chain processes done well, your CRM processes done well. I mean, that's mandatory for a company to be efficient. Right? So there, it made sense for us to go to the source and clean it up.

This requires a big cultural, I would say, support from the company. So if your company has a culture of collaboration, if your company understands the value of data, the value of clean processes, then it's much easier to drive. So for us, a lot of our job was much easier because Marvell does have that culture.

And there, we had an alignment saying, "Okay, sometimes you have a process that is not the cleanest process in the world, but it does the job." So in those cases, just have a good control process around it so you know that if this is happening, at least your dataset that's coming out is the right dataset because, ultimately, you're making decisions based on that data. In some cases, we automated the system to make it fully foolproof and reverse.

Now the good thing there was we had enough time early on. Because if you look at six, seven years ago, things were not so dynamic. Right? I mean, they were moving at a certain pace. Our path was pretty much defined by us. So if Marvell wanted to move at a particular pace, we moved at that pace, and Marvell moved very fast. So that's why we moved fast on the data cleaning side too. Then comes the 2019-ish time frame, where it looked like everything was thrown up in the air, and anything is a variable that you can think of. Right? Geopolitical situations got in, the pandemic got in, the technology world was upside down with the supply chain crisis, and then there was a huge unprecedented demand that you saw in the silicon space, then you had a huge unprecedented drop of demand in the silicon space. And then AI came in. So now the question here is, with that kind of dynamic surrounding you, where do you invest and what do you prioritize?

So again, we are still very outcome-driven. So while I talked about looking at the entire end-to-end business process, that is for an outcome we are trying to drive. For example, we want to have a very efficient supply chain. We want to provide very good customer service and, you know, drive our top line and bottom line to the perfect point. So that's an outcome. And then you're looking at the end-to-end business process, and you're looking at the end-to-end data models and trying to put that in place. So that is a journey that will change as the company's strategy changes, which at a macro level, doesn't change that often, versus the earlier one I talked about, where you had more time in hand because things didn’t change that frequently.

What I hear you saying is that one of the big drivers of your data strategy is your business strategy. So, as you say, things sped up. We had a global pandemic, and all of a sudden, being agile with data became really important because it was driven by your business strategy and the outside reality that affects the business.

I also know that you are, and you said this, fundamentally an engineering-driven company, a company that builds engineering products for engineering people by engineers. I also imagine that internal dynamic of who your customer, your user, your internal data users are almost certainly affected your data strategy. Fair?

That is true. So I think the initial part of our work was more on the non-engineering side of the company. Like I said, the supply chains, the SG&A functions. Because apart from being an engineering-focused company, we're also a growth company. Right? So it's a company that is going to scale, scale into new markets, scale into new sets of customers, and really go up in volume. So a lot of the work was in that area. So even though we are an engineering-focused company, we are a business that we have to run very efficiently.

And if you look at the problem statements that Marvell has been facing, while we are doing great work in terms of the technology we are in, the nodes that we are advancing ourselves in, we rely a lot on our ecosystem, our supplier ecosystem. So we're a fabless company. It is very critical for us to have a very strong partnership around us, and it's a global partnership. Most of the manufacturing is not located in one particular zone; it's very spread out. We also do specialized parts. So it's not like, even though we have the usual list of suppliers that you would hear most connected companies have, we also have some specialized ones, especially in our modules area.

Now you're talking about a very complex supply chain. And you talked about all these factors that touch the supply chain—global factors, India-US-China relationships, or you name it. For us, we have to make quick decisions around that and have good visibility. Before the decision side of it, you have to understand what's going on so that you can decide certain things. Right? And going back to your initial point, yeah, I mean, initially, a lot of companies right now, what they do is somebody says, "I want this report. I want this data. I want it to be presented in this format." One of the common problems that people face is when the information is in front of them, they change their requirement. Right?

A typical IT grief.

You had a business requirements document, and you completely changed it as soon as they deployed the report. But there's a legitimate reason: you don’t know what you want till you see the data. So the approach we're now taking is: let's get the data in front of people in the right business process to align with their business strategy, and then we'll figure out how they want to visualize it, how they see it, how they want to slice and dice it.

Yeah. So sort of getting agreement on those core datasets that matter once connected to the business challenges and then thinking about that analytically. Now you mentioned stage three where you're investing today. You mentioned how you're thinking about things like unstructured data, providing structure for data that might not have traditionally entered into your analysis historically. Now you're bringing that kind of data in. I imagine you're also thinking about tagging or quality, classification of data. But share a little bit about where you find yourself today and sort of how it goes from here.

Yeah. Actually, that is a mountain of a problem statement that we have realized we had to solve. Marvell sits on a lot of data. Even though we are not a B2C company, B2C companies tend to have a lot more data than we do. But we have a lot of technology data sitting in front of us. Now for companies like us, again, if we have to really take advantage of the AI revolution, the real meat is when we bring our context, our knowledge, into it. Now the question is, how do we bring it? And that's where the unstructured data part of it came into being. So it's not so much about reporting, but it's more to feed into the AI pipelines that we are developing.

Now, in terms of where we are with it, we are in a "What should we do?" mode. Because when you start to drill down into this problem statement, you're looking at it from various different angles. Right? Well, companies like us are very good at external security. This is an example. We do excellent source system security. But as soon as you move the data across, how would you track if you're doing things right? So that's one example on the security angle. But now if you compound that with the compliance need in most of the geographies that we actually operate in, whether it's the US, Europe, and lately in India and Vietnam and China—China, of course, has been very popular in that area in terms of their strict requirements and compliance. When you bring this content in and feed an AI pipeline, your data protection, your data compliance requirements just become multifold. Right?

Now what we are doing is we are trying to understand what we have. So there are two things. One is, we are building these AI pipelines, we are building these AI systems, and as we feed them, we actually look at the data and make sure that everything is clean. But now we realize if we had to scale it up—right now, most companies are in a build state. You're piloting things for a hundred users, two hundred users. You're curating data for those engines. But if you want to scale up, if you want to feed in a large amount of data for this AI system to really move the needle, you have to look at data foundations.

Think about it. Most companies have data all over the company. Most companies don't think about data quality when they're feeding data in, because it didn't matter to them. Right? If you're writing a technical spec, let's take that as an example. Everybody knows this is the final version of the technical spec that the developer has to use, and the product is built. Then who cares about the technical spec until a revision is required?

Now, you might have a hundred versions of the technical spec, and one is the one that you really want to use to feed your pipeline. Which one? So you have the redundancy issue that I just brought in. In the data itself or in the spec itself, you might start putting in people's information, like a requirement coming from X, Y, Z. And suddenly, you have personal information in the spec that you never thought of. Again, in this example, it's not HR data. You're talking about technical data. And then you might have some customer information embedded in it. Right? Because you have a customer requirement coming, suddenly you have customer data in it. All unstructured, all in this format. Looking at the tag or the metadata of the document, you figure out that this problem actually exists.

Then you're starting to feed it into these knowledge graphs where, you know, everything just comes together in a vector format, and now you try to figure out controls on top of it. That's very difficult. So for us, understanding our data, understanding the ownership, understanding the redundancy of it, and trying to figure out, at a domain level, how we control the quality, how we control the compliance, how we tag the data, how we classify the data—that is the problem statement that we have started to work on. We are building teams around it. The first thing you have to do is you have to hire a data officer, which is what we did. So now we're building teams around it. We are working with the security team, working with the compliance team, and more importantly, working with the domains, right? I mean, the business units and the functional leads to have a program structure in place. And then we'll prioritize certain things too. So those things are coming. But this is where we are. We have, I would say, recognized the problem statement.

Yeah. And I think there's a couple of important lessons there. The first is this idea of managing data by domains, as you call it. We often call them entities, but we call them entity domains. This is a really important idea. Rather than trying to think about governing and managing source systems—and as you point out, it's important to look at source quality and manage that—you're thinking about managing the outcome, the resultant data. And then the second thing, if I can pull it out of what you said, is it's not considered a data task or an IT task, but it's considered a whole-company task. And you've really brought everybody together to say, for this type of data, who cares about it? Who's going to take control and ownership over it? Who's going to manage its quality and tagging, to use those examples or whatever the issues are? What I find particularly interesting in what you talk about is that for you, AI is a driver for that.

This goes to something that we've been talking a lot about. The thing that's missing in enterprise data today is nouns. What I mean by that—you used the example that you write up a product spec and in there is a company's name or a person's name. Having a system that can know that that thing is a company—that company is a customer—or that that name is a person who works at Marvell, and has that definition of that noun or that entity, domain, whatever word we want to use—that's pretty unusual. That's not something you see in consumer AI, right? Because we have common understandings of nouns in the consumer world. But in enterprise, it's really important because these things have real meaning. Is that fair?

No, that is true. And I would say to a certain extent for enterprise-type companies, the awareness has not been there. In B2C companies, in these matters, they were a little bit more mature. They realized they were actually intentionally having the nouns in their content. For us, again, it's an old company. Right? We've been here for almost thirty years. This kind of sensitivity or maturity has grown over a period of time. So we're talking about a lot of content, a lot of nouns in it, using your term. It's a good term. I'm actually going to steal it. So, and then understanding the impact of it. That's huge.

Yeah. So it's clear AI is a big driver in Marvell's business overall and their strategy and roadmap. How are you seeing it affect—or maybe say, do you see it affecting—the way people have expectations about the data team and how data is delivered inside the enterprise? Are we going to replace everybody with a chat interface? How are you guys thinking about that?

Well, I don't see that even replacing a single person, quite frankly. And some of it is going to be an opinion, right? So—

Sure. Sure.

No technology like this has ever replaced people. It changed the way they function, for sure, and it's just going to scale up. There are certain things we just never did, or didn't do at the right pace, right? And that's what's going to change. Because these technologies, at least where they are, they're not autonomous in nature. And if any company is trying to make them autonomous, I feel they're on a very wrong path, for a couple of reasons.

First is, if your technology is going to replace your employees, then you won’t have those employees to help you build the technology. Until they are in the loop, this is not going to work. That's the first thing. The feasibility of the deployment itself, if their strategy is replacing people. The second thing is, where we are now, we talk about high accuracy percentages of ninety percent, right? I mean, that's a huge accuracy in the AI world. Well, you still have ten percent wrong. So you need to have a human in the loop. And the third thing is, it's not the technology itself, but it's a mix of how your business process, your people, and this data support ecosystem come together with the technology to make business transformation happen.

So we are looking at transformation here. When you talk about replacing headcounts, when you talk about organizational efficiency, these are incremental processes, right? In the IT world, we've been automating things to do that ever since I started working in IT. But this is something different. This is about doing more, doing it better. And that's the approach we've taken—not to talk about employee headcount reduction but to look at how we can accelerate certain things. How can we accelerate product development? How can we make our decision-making process faster?

So that's the approach we've taken, and we took it very consciously, given where this technology is and what we need to make it more successful.

Yeah. So I think the thing that I think a lot about here is the shift from the impossible to the possible. One of the things you see in disruptive technologies in general is this—it's not about making something incrementally better; it's about something that was previously impossible now becoming something that we can do, something that's possible. And I think this is particularly true in the data context.

And to say this in a funny way, if you roll back the clock in any data organization, why haven't we cleaned up all of our enterprise data? Why is it still a mess? And it's because, as you point out, it's fundamentally a scale problem. You cannot hire enough people to clean all the data, to have it all be perfect, and then make sure no one ever changes anything or adds a new system, or God forbid something changes. And that’s obviously not realistic.

Right.

So I think very much to your point, AI is about scaling things that previously were not possible to scale and doing them in a cost-effective way.

So, shifting gears slightly, you've obviously been extremely successful in your career. You've moved up both at Marvell and at Analog before that. I'll just give you the opportunity: what lessons have you learned along the way that could be useful to listeners? Secret recipes.

Yeah. It's not so secret. Actually, most of it is very obvious. So, the first thing I would say, what helped me the most in my career: I always ask for help. If I'm stuck somewhere, normally, I would just look around for people who can come in and help. And it's actually amazing how much people love doing that. They actually prefer doing that. I've worked in companies with various cultures. Right? Marvell is known for its amazing culture and collaborative environment, and Maxim before that was a good company too. And when I was working for SAP, I worked with a lot of customers, and, you know, not everybody had the best culture. But still, when you ask for help, when you ask people to participate in problem-solving, it's a human tendency to do that. And that has always helped me.

So, normally, what I do is I pick up a function that is hard to deliver, and then I look for all those people who can come around and help me do it and build a team that can work together in solving that problem. And that normally results in good success. And when you have these successes lined up and everybody is rooting for your success, then normally you are successful. Right? So that has helped me. And every time I tried to be a lone wolf and do things on my own because I thought I was really fast, I fell on my face right away, corrected course, and just got back to the thing that always worked.

So, taking the right help at the right time, collaborating with the ecosystem, with the stakeholders, with my technology partners, with vendors, and in some cases with the customers to make a goal successful—that has always helped me.

Yeah. There’s a certain sense of humility that’s important there because to ask for help is also to acknowledge that you don’t know everything. That’s certainly not a common attribute in executives, for what it’s worth. I often see the opposite behavior in executives—the “I know best, I know all” type behavior. So I do think that’s actually a really important lesson. Sorry, you were saying?

No, no, that's—and by the way, you do have to fight your human nature to ask for help. I'm one of those people who, before GPS came in, would get lost while driving, having no idea where I was, but still never stop to ask somebody for directions. So it was also work on my end to realize, "Okay, this is the right thing to do," because you're working towards a larger goal than yourself.

So let’s say, you know, the company is driving a big business transformation or system transformation or a big acquisition. It’s not about you. Right? It’s about a larger goal than you. So that’s the perfect time to go ask for help, and people like to work on those things. And that goes into the collaboration side of it. Right? And, of course, when you’re asking for help and people are helping, the reverse has to be true too. When they ask for help, you should be ready to do that. You can’t be saying, "I’m too busy, come another day." Right? So you have to give back some too.

The second bit I would say is collaboration. But collaboration is, again, a very overused term. I would say communication along with collaboration, right? You can talk horizontally. People normally talk vertically, whether to their team or to their management—and when I say normally, I mean at least the ones at the executive level. But I feel the horizontal communication is not always perfect. And that is something I work towards, and that requires us to sit with the person, understand what they do, empathize with their work, and, in that process, also gain some knowledge in the area so we can talk in common terms. And that has been fairly helpful in my career growth.

And I think that’s always a challenging one because, as you say, vertically there are strong incentive structures in place for those communications to be effective. When you’re working across an organization, it’s about convincing, cajoling, bringing people along, explaining why it’s in their interest to contribute to that sort of thing, which is naturally harder. And there may not be the natural incentives that are in place in a boss-subordinate type hierarchy, if that makes sense.

No. I think those are both excellent pieces of advice. I think we are at time. So, Nishit, I really appreciate you joining us on Data Masters and sharing your thoughts on Marvell, your data journey, what you're working on, and your successes. I wish you only the best.

Awesome. Great talking to you.

Thanks for joining us for the latest episode of the Data Masters podcast. You’ll find links in the show notes to any resources mentioned on today’s show. And if you’re enjoying our podcast, please subscribe so you never miss an episode.

Experience Tamr’s proven data-centric AI engineered to speed the discovery, enrichment, and maintenance of the golden records businesses need to accelerate growth. Tamr’s expertise in quickly and accurately unifying large amounts of data across disparate sources gets results faster at a lower cost compared to traditional master data management or DIY solutions. Stop wasting time on bad data. Visit www.tamr.com. That’s tamr.com to see results now."

Subscribe to the Data Masters podcast series

Apple Podcasts
Spotify
Amazon