Episode 392 - Todd Westra / Moshe Tanach


00:28 Hey, welcome back to another episode. We are so glad to have you here today because we are talking to Moshe who's gonna tell us all about what he's doing with his AI implementation. Moshe, tell us what's going on. Who are you and what do you do with your business?

00:44 Hi Todd, thanks for having me. So my name is Moshe Tanach. I'm husband to Edith, my soulmate wife, and father to three amazing grown-up kids that I consider my friends and mentors. I'm the co-founder and CEO of NeuReality, where we're transforming how companies deploy AI inferencing into their products and services. I've been in the semiconductor business for the last 25 years. I was fortunate enough to spend roughly half of that in very successful startups like DesignArt Networks, where we built a 4G base station on a chip and got acquired by Qualcomm, and the other half in big corporates like Intel and Marvell, where I headed the Wi-Fi division at Intel and, right before founding NeuReality, was head of product definition and architecture in the network division of Marvell.

01:44 Wow, that's a big deal.

01:45 I think startups and corporates are very different in their pace, focus, persistency, and resilience. And I'm grateful to have had the opportunity to be admitted to both of these schools, the startup world and the corporate world.

02:03 Right. That's not normal. Most people that go corporate stay corporate forever. And most people in startups either end up having a massive success and never go back to corporate, or they just keep developing startups. So tell us how you fit in the mix here. You've had some amazing exits, it sounds like, and been part of some really cool startups. This is a pretty massive thing that you're trying to do here. Tell us what you're doing. How are you incorporating AI into the semiconductor experience that you've got and building a business out of this? Who's your client?

02:42 So our clients are any customer that owns or uses a data center. You can look at the big hyperscalers that lead the world in terms of AI deployment, or you can look at a software-as-a-service company, a SaaS company using AI as a service on the cloud, or a big bank or insurance company that owns its own data center. All of these that are deploying AI are natural customers. We're focusing on AI inference. It's the process that leverages trained models, the ones trained to classify images or translate language or, like ChatGPT, generate text, and deploys them in a real application. So inference is the more complex side in terms of compute, and the cost pressure is much higher, because this is your cost of sale. This is where your margins come from. It's not like training, where it's R&D spending and you're developing something, so it has its own budgets. And when you look into...

03:58 So a lot of people are going to be confused here, because outside of those who know how servers operate and how the resources of a server are actually used, most people are only familiar with something like an AWS deployment, right? Like, I just built a tool based on AWS, and all they really understand is that as my business grows, I need more servers. And what you're saying is that servers aren't the only thing you need. You actually have to have an intelligent way of utilizing the resources of those servers before you're able to fully execute something like an AI component or an AI application. Is that what you're saying?

04:44 Exactly. You're absolutely right. I think when you look at all the AI users in the world today, you have the early development cycle, the exploration phase. And this is where you use AWS or Azure or any other platform that allows you a very fast turnaround: train your model, embed it in your application, and start using it. When you get to scale, this is where you start to care. This is where your COO is knocking on your door and telling you, hey, dude, you just consumed $10 million of resources in the last quarter. What are we doing about it? And this is where R&D groups, data scientists, deep learning engineers, and IT get into the details of where the compute consumption is. Why am I paying so much for networking, or for storage? And you start to optimize that. So at NeuReality, we looked at the inference problem, which starts at the clients that want to leverage the service and ends in the server itself and how it is built. We looked at where we're losing efficiency and how we can give those customers an easy way to deploy. But it's tough. It's tough to optimize your use case. You need all sorts of engineers. And if we can give them easy-to-use APIs that can optimize it and move them to a different piece of hardware in a way that is transparent to them, then we can bring a lot of value. And the value comes from two angles. One, you can allow the hyperscalers to scale better, lower their power consumption, and invest more in sustainability. But you can also open the door to companies that don't use AI today because it's too complex and too costly. So this is where we are focused.

07:08 Well, half of it, I mean, half of it comes from over focusing on having to train something new, right? And what you're doing is you're coming in with a pre-trained protocol of how to analyze what resources are being used and then automatically engage in moving different server structures based off the actual demand. Is that kind of what you're up to?

07:30 Close.

07:31 Close? Help me out. Yeah.

07:32 It's more about looking at the server and how it is built today. When you build an AI inference server, you leverage three technologies. You have the main CPU from Intel or AMD, which is the host of the server. And then you have a couple of GPUs or ASICs that were designed to do the neural net processing, the deep learning processing. The CPU is the host of those DLAs, those deep learning accelerators. Let's assume this is a backend server for Alexa, and you talk to your device at home. Your voice is sent to the cloud, and a backend server there does the voice recognition or natural language processing. The CPU is the one getting the request from your home, and then it hands the deep learning processing to the GPU. And when you serve thousands or hundreds of thousands of requests on that server every hour, the CPU becomes a bottleneck. So not only do you spend a lot of money and power on the CPU that hosts those GPUs, the CPU is actually bottlenecking the system and you're underutilizing your GPU. So you lose on capital expense, because you just wasted money on silicon that is not being consumed, and you also spend a lot on power consumption and cost overhead. Or you can introduce a new system architecture, just like the one we invented, and practically make the GPU a network-attached device: connect the GPU directly to the network instead of hosting it with the CPU. And that's what we do. We have a new processor that we call a NAPU, a Network Addressable Processing Unit, and it's a low-cost, very high-efficiency host for those deep learning accelerators. In a sense, if I give you a layman's analogy: assume you take your Toyota Camry and you want to race with it. So you take a Formula One engine, put it in your Camry, you go out to the street, you hit the gas, and the car falls apart.
So we just reinvented the system around the deep learning processors so you can utilize them to 100%, and you can lower the overheads that surround those deep learning processors so you can leverage them better.
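The bottleneck argument Moshe is making can be sketched as a back-of-the-envelope pipeline model. All of the numbers below are hypothetical illustrations, not NeuReality measurements; the point is only that accelerator utilization is capped by whatever stage feeds it.

```python
# Illustrative model of the CPU-hosting bottleneck described above.
# The request rates are made-up example figures, not vendor data.

def effective_throughput(host_cap_rps: float, accel_cap_rps: float) -> float:
    """A serving pipeline runs no faster than its slowest stage."""
    return min(host_cap_rps, accel_cap_rps)

def accelerator_utilization(host_cap_rps: float, accel_cap_rps: float) -> float:
    """Fraction of the accelerator's capacity actually consumed."""
    return effective_throughput(host_cap_rps, accel_cap_rps) / accel_cap_rps

# CPU-hosted: suppose the CPU can feed 4,000 requests/s to GPUs
# capable of serving 10,000 requests/s.
cpu_hosted = accelerator_utilization(host_cap_rps=4_000, accel_cap_rps=10_000)

# Network-attached: a lightweight host (the NAPU role) keeps pace
# with the accelerators it fronts.
net_attached = accelerator_utilization(host_cap_rps=10_000, accel_cap_rps=10_000)

print(f"CPU-hosted GPU utilization:       {cpu_hosted:.0%}")
print(f"Network-attached GPU utilization: {net_attached:.0%}")
```

Under these assumed rates, the CPU-hosted GPUs sit at 40% utilization, which is the "wasted silicon" capital expense he describes, while removing the host bottleneck lets the same GPUs run at full capacity.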

10:18 I gotcha, I gotcha. Okay, this makes a lot of sense. So are your real client avatars hosting companies, people that are developing at large scale on Google Cloud, AWS, any of these places, or are your clients the actual data centers themselves, hosting these machines that are processing all the information?

10:41 So what you see today in the market is that most of the servers are being deployed in cloud data centers or in enterprise data centers. Take Bank of America or JP Morgan, Visa, Mastercard, or a healthcare company or a government agency. They have data centers. Data privacy regulations don't allow them to take some of it to the cloud. So any customer that invests in AI and embeds it in its digital products or services can buy servers from us, or can use our servers that are deployed in the cloud.

11:26 Gotcha.

12:52 Smart, smart. This is a huge problem you're solving. I mean, this is a massive, massive thing. And as more people develop technology use cases with AI, it sounds like they're gonna need it even more. So now that we have an idea of what your business is and who you're serving, tell us a little bit about the journey to this point. I can only imagine that this is a massive, capital-heavy start to building this system, because you're producing hardware, and you're producing basically firmware around that hardware to implement a more efficient use case. How is this going? Like, how do you start something this big and find a path to revenue?

13:42 So as always, it starts from understanding that there is a problem. Back in the day, 2018, 2019, when the deployment of AI started to scale, I was still at Marvell, and we were analyzing how to embed deep learning capabilities into our products. And I saw how the evolution of all these AI services and servers was being handled. Obviously, the GPU from Nvidia won most of the deployments, because it was the only solution ready and much better than a CPU. But when you look at how the system is built, it was obvious that there's a lot of waste. And a lot of those things we saw already in storage 15 years ago: we started with direct-attached storage, where every server had its own hard drives, and then we moved to network-attached storage to improve the efficiency, and we added network protocols that are specialized for it. So I saw that AI was walking the same path. And it was obvious where it would go and what the barriers would be, at least to me and my partners. So this is how we started. Back in 2020, when we started the company and raised the first money, we immediately worked on a prototype based on FPGAs that could prove to all these hyperscalers and enterprise companies and ecosystem partners, like the OEMs Lenovo, Dell, and HPE, that this is a viable solution. So we built it in the first nine months. We only raised about seven million at the beginning. We recruited the team, and now, three years later, we're about to launch our first chip, which we taped out a few months ago.

15:53 It's amazing.

15:54 So the road is tough for a semiconductor company. It's harder for investors to understand. It's not like a Waze application that navigates the streets. It's deep tech, it's fundamental infrastructure. But when you build the prototype, when you show it to real customers, when you get the feedback, then the investors start to understand this is a real thing, you've been validated in the market. And then we raised the second round, more than $35 million, last year. It was a tough year, but when you identify the right problem and you get it validated in the market, it becomes a bit easier to raise the second round and the third round. But it's definitely a tough journey. And on the technology side, it's also tough, because when we built the first prototype, we understood all the things that we didn't think of at the beginning. So the architecture kept evolving over the first 18 months based on the prototype feedback.

17:08 That's cool.

17:10 So it takes good, talented engineers and funding.

17:13 Well, it does, but the returns are going to be enormous, assuming you get even one data center to rebuild their systems with your chip. It's going to make all the difference in the world, right?

17:28 It is. It is. And you know, it's not only running the compute and the server, it's not only the cost and the power consumption. Some of the customers tell us that they have a real estate problem. Today, if you look at a rack of servers, most of the racks in the world can support up to 12 or 13 kilowatts. With standard servers, you can deploy six or seven servers in one rack. But when you buy these very expensive, high-power-consumption NVIDIA-based servers, you can only deploy three of them before you're capped on power consumption. So they're finding themselves with half-empty racks, just adding more and more racks, and they're running out of room. So they have to buy another room, another data center, not even to talk about the amount of carbon emissions wasted just because you're not using an advanced technology, an advanced architecture.
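The rack arithmetic behind the "empty racks" problem is simple division against the power budget. The roughly 13 kW rack budget and the three-versus-six server counts come from the conversation; the per-server wattages below are illustrative assumptions chosen to reproduce those counts.

```python
# Hypothetical rack-power arithmetic. Per-server kW figures are
# assumptions for illustration, not measured numbers.

RACK_BUDGET_KW = 13.0  # typical rack power budget mentioned above

def servers_per_rack(server_kw: float, budget_kw: float = RACK_BUDGET_KW) -> int:
    """How many servers fit before the rack hits its power cap."""
    return int(budget_kw // server_kw)

standard = servers_per_rack(server_kw=2.0)    # e.g. a ~2 kW inference server
high_power = servers_per_rack(server_kw=4.0)  # e.g. a ~4 kW GPU server

print(f"~2 kW servers per rack: {standard}")    # 6
print(f"~4 kW servers per rack: {high_power}")  # 3
```

Under these assumed draws, the rack caps out at six standard servers but only three high-power GPU servers, leaving most of the rack's physical space stranded.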

18:42 Well, and longevity of the hardware is a big factor as well. I mean, you're able to stabilize the use of the resources so that it lasts longer, I would imagine. You've got an amazing product here. So let's go back to the business side of this, though, because I'm really curious. A lot of people listening would never dare to go this deep into a program or a product before seeing revenue like you have. This is a lot of risk. And I think that when people think of growth and scaling, to a lot of them that means: I've got product, I've got client acquisition, I've got stuff rolling within the first few months after I launched my business. And you're saying that it's three years in, almost four, before you actually see the fruition of all of your work and your energy and your research and your product-market fit and your analysis. That is a lot of work to put in before an actual product is developed and shipped. What do you have to do to justify it in your head: we're raising 7 million, 35 million, and likely, as soon as production starts, we're going to need another 100 million to sustain it? That's awesome.

20:04 Yeah. Well, you just spooked another set of investors from investing in NeuReality. No, laughs aside, I think you're on the spot here. Semiconductor entrepreneurship is tougher because the time to revenue is longer. And that's why it's so critical to build a prototype. That's why a lot of the groups that are trying to build a semiconductor company are not able to raise money. I think you have to come with the right people, with the experience, and propose to do something that is disruptive, but where you've thought about all the angles. In our case, if you come to Google and propose a new type of hardware that mandates changing all the software they have, you have zero chance to sell to them. But when you come with a business case and with a technology roadmap that covers that part, showing you're going to be a seamless integration, you're going to support exactly the frameworks they use, you're building a Kubernetes-native device, then the investors bring in their tech guys to check you,

to do the due diligence and validate it. And I think this was the story around NeuReality, with Zvika as the VP of backend at Mellanox, Yossi who led the SmartNIC DPU at Mellanox, and Lior Khermosh, our CTO, who was a PMC-Sierra fellow and before that was in a couple of successful startups. So it's not a bunch of 25-year-old kids with an idea for an application. It's people with 20, 25 years of experience in building chips and complex systems who are coming and putting their track record on the table, giving up the big salaries we got in the big corporates.

22:18 I can only imagine. Yeah, I can only imagine. Everyone's investing heavily in this. Assuming the technology works, what does the return look like? Because this is not the normal type of business profile we tend to have on the show, but this is enormous. I mean, what does a typical revenue cycle look like for you, and how does it get implemented?

22:44 So the return can be huge, really, really huge. This is why you see all these valuations around AI semiconductor companies. If you look it up, you'll find the numbers. But you have to understand: when you look at ChatGPT, for instance, about two or three months ago it was reported to consume about $2 million of compute every day. So we're talking about $700 million a year of spending on compute. This is before adding margins on top of it to sell it as software as a service. So there's a lot of money around AI, and new markets are being invented as we speak around generative AI, LLMs, computer vision, recommendation engines. For us, a product line based on NAPUs for AI can get to $200, $400, $600 million in annual recurring revenue in a matter of three to four years. And if this becomes the hardware of choice for the ultra-scale applications, such as Siri, Alexa, or the recommendation engine at Meta, then it can surpass even that by far and become multi-billion-dollar annual revenue.
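The jump from the daily figure to the annual figure Moshe quotes is straightforward to sanity-check; the $2 million per day is his reported number, and the annualization below simply multiplies it out.

```python
# Annualizing the reported ChatGPT compute spend quoted above.
# $2M/day is the figure from the conversation; the rest is arithmetic.

daily_compute_usd = 2_000_000
annual_compute_usd = daily_compute_usd * 365

print(f"${annual_compute_usd:,} per year")  # $730,000,000 per year
```

That lands at roughly $730 million a year, consistent with the "$700 million of spending on compute" he cites as a round number.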

24:16 Wow, well, I hope you get there. This is fascinating. I honestly love what you've shared with us, because it is definitely outside the normal box that we hear about: a problem needs to be solved, we come up with a solution, we get people paying for it in a matter of months. I love this long-tail startup venture. You don't hear about too many semiconductor startups anymore, because so much of that industry is owned by such massive organizations that do the R&D themselves, trying to improve themselves. But how hard has it been? What's been the biggest challenge for you in trying to gain that trust from these people, to say, hey, you know what, this is actually really gonna help us?

25:04 I think the biggest challenge was, you know, to pitch it to enough people until you find the ones brave enough to go the long marathon instead of the faster turnaround. Most of the seed investors don't look that far; they're trying to generate, you know, small ARR revenue and then, based on that, do secondaries. It just took us time to find the right partners to join us. And thankfully we have a list of investors such as Ezra Gardner from Varana Capital, who supports us on the financial roadmap, Gonzalo from Cardumen Capital, and Cleveland Avenue. And we have some corporate VCs, such as Samsung and SK Hynix, that are backing us. I think the start was very tough. Once you have a couple of good investors in the boat with you, the second and the third become easier, as long as you execute and deliver on your promise. And this is exactly what we've been doing.

26:21 And how do you build roadmaps that they want to see? I mean, it feels like such a daunting task, building the roadmap for your investors to look at and say, okay, yeah, that makes sense. Was that hard for you to develop?

26:34 Not that hard, because while this is a complex program, it's not something we haven't done before in previous companies. You build the technology roadmap and you build the business roadmap. And we've been working hard on the business side. When you look at the value chain of a semiconductor company like ours, we usually don't sell directly to the hyperscalers or to the enterprises. There are the OEMs in the middle, like HPE, Dell, Lenovo, Cisco, and others, and you have system integrators or value-added resellers like Wesco and WWT. So you need to educate them. You do the demand creation working directly with the end customers, but eventually they will buy through the ecosystem. So if you look back, you'll see that we've been active in the last two years building this ecosystem, so that when the chip is ready, everything is ready: all the relationships are established, the business models, the rev-share agreements, and so on. And if you think forward and show investors a sound plan for technology and for go-to-market, then they trust you, they follow you, and they help you fine-tune it.

28:00 Love it, love it. Well, this has been a fascinating conversation, Moshe. I think there's so much we could talk about. We could get really nerdy on this, but I'm going to let that go today, because what we've been able to discuss is very valuable to a lot of people in the same boat of growth and scaling. Sometimes it doesn't happen overnight. Sometimes it takes some time. You've got to develop the strategy and the execution plan so that your clients, your investors, and your family understand: hey, we're putting in all this energy because here is the bigger goal, here's where we're really trying to get to. So I appreciate you sharing all this with us. Is there somebody in your network, in your corner, who's been able to either mentor or inspire you to keep on moving through this long game that you're playing?

28:50 This is a tough question. There have been a couple of people along the way that gave me strength. But I think it goes back to my early days, when I was a kid. I lost my father when I was 12. And I think it put me into a situation where you're on your own. It's not that I didn't have my mother and sister, but you're on your own; you don't have your father's guidance anymore. And you understand at an early age that things can go south. And it created a mentality for me that everything is possible if you just sit down, draw the problem statement, and insist on finding the solution. And the journey at NeuReality is exactly like that. There are days that you feel you're stuck, whether you're fundraising or you have tough issues on the technology side that are pushing out your schedule. And it's a downer; it makes it very, very tough. But persistency and resilience, insisting on finding the solution, talking to your people, and looking reality in the white of the eye, as we say in Hebrew, are what bring me, bring us, to solve the problem and overcome the next barrier.

30:43 Well, Moshe, I appreciate you sharing that with us. Honestly, this is a fun conversation because it's different than the normal one we usually have. And I wish you the best of success. Given the backing and the support you've received from your community, I can't wait to see this thing grow and see where you're at in 2024 with the growth and development of this company. So thanks so much for taking the time to be here with us today.

31:04 Thank you very much for having me and good luck to all the entrepreneurs out there.

31:10 All right, thank you so much. And we'll catch the rest of you on the next episode.

2024 The Growth and Scaling Podcast, Inc. All Rights Reserved.