November 28, 2024
October 11, 2024
At autocon, we hosted a panel talk on transforming operations with AI & Autonomous Systems. The panelists shared insights based on their diverse experiences, emphasizing the necessity for a strategic approach to integrating autonomous systems within existing infrastructures. Here are the top themes:
The panel comprised some of the brightest minds in IT infrastructure, including:
Tim (Sierra Ventures): How should anybody who is just coming into the notion of autonomous be thinking about the readiness equation of their infrastructure?
Shibu (Geodis): A transformation doesn't come from a vacuum. It has to be thought out end to end. When we started our journey of automation, AI processes were there, but our first question was “Are we ready to take it and reap the full benefit of it?”
We also had another equation in the bundle: Where will our money be invested? Our core business is supply chain optimization. In the supply chain, automation is a first-class citizen, as you have robots picking and packing. Since automation was in our DNA, we knew we had to do automation in this area because only then does the customer get the full value.
We quickly realized that we were not mature enough to adopt and get the benefit. We could invest in it because it's a newer and cooler technology but that dollar spent would be a waste.
Talking with people at Sedai helped us uncover some of the things we were not ready for. We quickly realized that we had to invest here, either by partnering with people who were already into it, or by building it ourselves. That was a quick realization for us to make sure we were ready.
Rachit (Paylocity): As humans, it's easier for us to think in terms of a framework because it helps us think about what the journey is and where we want to be.
The car industry came up with this beautiful framework: L0 to L5. It tells you where you are in terms of maturity. L0 is where you have no driver assistance. L1 is where you start with assistance; then you get partial assistance, conditional assistance, and full assistance, and finally you go autonomous, which will be your L5.
So what do you do? You:
Mo (GSK): GSK adopted cloud. Three things were very important to us: cost, performance, and security. Taking those together and getting the sizing right is the biggest challenge we see today.
People said they can solve the problem for us and that's why we are here.
Jigar (Sisu): I have a perspective I want to share, and this is my journey across three different companies.
When PayPal was going through the transformation, we had people staring at screens, worrying about every machine. We had to take some time because it was not just the technology change; it was also the cultural change where we had to get people along with us, and not just abandon them. That was my PayPal journey on how to become ready, not just from a technology perspective, but also from a people and culture perspective.
The mindset at Facebook was completely different. We were doubling in size in terms of machines, and I'm talking about millions of machines every year. So, readiness was not a word in our dictionary. You'd better be ready because the machines are coming.
Then, from the very first day when we started building the system at Sisu, we had autonomy as a principle because we didn't have enough people to build a system that needs to be watched over by folks staring at screens. So the autonomous system was built from the ground up with things like Sedai that you can start using on day one even as a small startup.
Subha (Wipro): The assessment of readiness is absolutely critical. As an example, one of our clients is a medical device manufacturer based out of Japan, for whom we manage data centers and infrastructure.
Introducing Sedai would not even be an option for us because they are all bare metal. They are sitting in their own data centers and are not even virtualized in most cases.
Tim (Sierra Ventures): Where was KnowBe4 on that maturation journey? I was quite impressed with how quickly you made the decision to deploy and start getting end-to-end value with Sedai.
Matt (KnowBe4): It was something that we had to be very deliberate about in order to achieve. I started at KnowBe4 in 2018. At that time, most of our software was running on EC2 instances, including databases and compute. In many cases, a single server processed and provided a lot of what we deliver to our customers. Our job as the SRE team was to clean that up while the Amazon bill was still four or five figures a month.
If we hadn't gone through that journey, it would be significantly harder now because our Amazon commit next year is millions and millions of dollars.
It was easy for us to implement Sedai because we were strong users of IaC and it only took us months to sign up with Sedai and get from 0% to 100% with them.
Tim (Sierra Ventures): There's no agreed-upon framework for assessing whether a company is mature enough to start adopting automation. Does the industry need a framework that quickly tests the readiness to adopt automation? If yes, whose job is it to define this framework?
Subha (Wipro): It is hard to standardize a framework because of how fragmented the stack, the usage, the implications and the applications are. A standard framework would have to be generic, but then it would not solve the problem. It has to come from the customer, in consultation with somebody like Sedai, who has an understanding of how the system works.
Rachit (Paylocity): In technology, it's less about the “what” and more about the “how”. The hows are pretty standard as we don't have a lot of options.
When you walk into a data center and you're naming your hosts with IPs or specific names, you know where your maturity stands. The next step usually is to automate this part. Once you automate that part, you graduate to the next level. That is how the framework would be agnostic to what industry you're from or the outcome you're looking for.
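As a rough illustration of the level-by-level idea Rachit describes, the L0 to L5 scale can be written down as an ordered type. This is a minimal Python sketch: the names `AutonomyLevel` and `next_level` are hypothetical and were not part of the panel discussion, and the level labels simply follow the wording used earlier in the conversation.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """Maturity levels as named in the panel, modeled on the car industry's scale."""

    NO_ASSISTANCE = 0
    ASSISTANCE = 1
    PARTIAL_ASSISTANCE = 2
    CONDITIONAL_ASSISTANCE = 3
    FULL_ASSISTANCE = 4
    AUTONOMOUS = 5


def next_level(current: AutonomyLevel) -> AutonomyLevel:
    """Return the next level to aim for; L5 is the top of the scale."""
    if current is AutonomyLevel.AUTONOMOUS:
        return current
    return AutonomyLevel(current + 1)
```

Because the levels are ordered integers, "graduating" is just moving one step up, regardless of the industry or the outcome being pursued.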
Matt (KnowBe4): I think the barrier even goes back to the introduction of centralized logging and collection of data from these decentralized systems. I like the point that Sedai is an accelerant because you could get from L0 or L1 to L5 using some carefully tailored bash scripting. I almost want to introduce this idea of L6 where you have an AI-driven system that discovers things engineers or humans may have never even thought of.
I don't think that KnowBe4 is at L6 yet. In some cases, we're not even at L5. The places where we're using Sedai are much more advanced than the places where we're not. It feels like almost a new tier. That's been a cool journey for us this year, and we're looking forward to how much stuff we can get to L5 and beyond.
Shibu (Geodis): We talk about institutions that are software and technology oriented. But for example, there is no place for Ansible in a PLC or a conveyor system. I cannot bring up a conveyor system by running a script. So it depends upon the industry as well.
In every industry, there is a story for tools like Sedai. So that's where the perspective of maturity comes into play. We cannot just define that maturity or the framework by looking at a technology powerhouse like Google or eBay.
Tim (Sierra Ventures): What is the right approach to implement autonomous systems? We know the benefits but how did you mitigate risk?
Matt (KnowBe4): If you're completely risk averse, you will be stuck on a lower level of autonomy. It is just a matter of taking a low risk instead of a high risk.
In our case, a lot of the building blocks were already in place. Our infrastructure was well defined in Terraform, with modules that were already centralized. We knew 90% of our compute was being delivered by a handful of Terraform modules. That made it really easy for us to plug into that. Consumers of those modules were also pulling the latest version, so we didn't have to go through hundreds or thousands of repos and update to a new pinned version of that module. We were already taking risks by trying to be closer to the edge.
If you are looking to implement more automation, find places where you can approach the edge and implement Sedai or other tools like it. These should be places where, if there were problems, you could roll back quickly or tolerate a bit of an issue or downtime.
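The "roll back quickly" idea can be sketched in a few lines. The following Python is an illustrative assumption, not how KnowBe4 or Sedai implement it: `apply_with_rollback`, the `config` dictionary, and the 384 MB health threshold are all hypothetical.

```python
from typing import Callable


def apply_with_rollback(
    apply: Callable[[], None],
    is_healthy: Callable[[], bool],
    rollback: Callable[[], None],
) -> bool:
    """Apply a change and undo it if the health check fails.

    Returns True when the change is kept, False when it was rolled back.
    """
    apply()
    if is_healthy():
        return True
    rollback()
    return False


# Hypothetical example: an automated tool lowers a service's memory setting.
config = {"memory_mb": 512}
previous = dict(config)  # snapshot so the change can be undone

kept = apply_with_rollback(
    apply=lambda: config.update(memory_mb=256),
    # Made-up health rule: the service needs at least 384 MB.
    is_healthy=lambda: config["memory_mb"] >= 384,
    rollback=lambda: config.update(previous),
)
```

Here the change fails the health rule, so the snapshot is restored. The cheaper the rollback, the lower the risk of letting a tool act on its own in that part of the stack.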
Tim (Sierra Ventures): One way to mitigate risk is by starting with 20% autonomous, and scaling your way up. I don't exactly recall, but KnowBe4 went 100% auto very quickly.
Matt (KnowBe4): We did, once we were ready. We didn't start at 0 and go to 100 overnight.
We hand-picked a service that we knew would get some good utilization in production. Even as a beta feature, we had hundreds of customers testing and using this feature while we had Sedai enabled on it. Sedai was enabled throughout the entire process of building this new feature.
Even the engineers working on that service didn't know that it was happening. We moved on from there and turned it on for all our development environments after we had seen a production service go through an entire release cycle for weeks with only cost savings and no issues.
When nobody asked what happened to the service, we felt pretty confident to open the floodgates.
Tim (Sierra Ventures): How is Pharmaland thinking about the journey and the implementation?
Mo (GSK): For us, the most important thing was realizing we couldn't achieve our goals while in a data center. So, cloud adoption became our highest priority. We adopted Azure and GCP.
We began by adopting API and IaC - everything that’s stack driven - whether it was faster drug discovery or implementing supply chain solutions. The third part was how to sell faster with market data.
If you put appropriate guardrails in place and have the right people who can manage and operate the technology really well and understand the business, that's how you mitigate the risk. We have built guardrails. We start off with the dev environment, then move to non-prod, and then to production.
We still haven't gone fully auto but we hope to get there in the next couple of years.
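The dev, then non-prod, then production progression with guardrails can be sketched as a simple promotion loop. This is a minimal Python sketch under stated assumptions: the function `promote`, the guardrail checks, and the change names are hypothetical and do not describe GSK's actual tooling.

```python
from typing import Callable, Dict, List

# Environment order taken from the discussion: dev first, then non-prod, then production.
STAGES: List[str] = ["dev", "non-prod", "production"]


def promote(change: str, guardrails: Dict[str, Callable[[str], bool]]) -> List[str]:
    """Apply `change` stage by stage and stop at the first failed guardrail.

    Returns the stages where the change was accepted.
    """
    accepted: List[str] = []
    for stage in STAGES:
        check = guardrails.get(stage)
        # A stage with no guardrail configured counts as a failure (fail closed).
        if check is None or not check(change):
            break
        accepted.append(stage)
    return accepted


# Hypothetical guardrails: non-prod rejects anything labeled as untested.
guardrails = {
    "dev": lambda change: True,
    "non-prod": lambda change: "untested" not in change,
    "production": lambda change: True,
}

safe_result = promote("resize-service-a", guardrails)
risky_result = promote("untested-network-change", guardrails)
```

In this sketch the first change reaches all three stages, while the second stops after dev. Failing closed when a guardrail is missing is a design choice made here for safety, not something stated in the panel.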
Jigar (Sisu): At Facebook, our systems were autonomous. That meant somebody could push a change to the entire network and we would be disconnected from the internet for several hours. So the blast radius with this level of automation is pretty high. That’s why you need to have enough guardrails and treat infrastructure code as “code”. If you are developing an application, you would not push your code to production without testing.
Tim (Sierra Ventures): What are some non-financial gains that you were able to tap?
Rachit (Paylocity): When it comes to innovation, especially in the autonomous space, we are at a precipice where we have the right tools and environment; we just need the right actors now.
We saw a similar story at Netflix around 2013, 2014, and 2015 when the culture in the industry was divided into development and operations. Development handled the build, while operations took care of deployment and infrastructure maintenance. Netflix came along and said, “This does not work for me. I want to move faster.”
So it built systems that helped people deploy more and more artifacts to production. The outcome was a 6000% increase in experimentation. They went from doing two, three, or four experiments a month to over 1,000 experiments a day. As a result, people became hooked on Netflix. They loved Netflix not because someone really smart was sitting behind the screens figuring out what buttons to push or what movies to display, but because an autonomous system was making decisions about what could move forward and what could not.
Similarly, developers felt more comfortable rolling out pull requests (PRs). Every single PR was ready for production. If it was not ready, the system would block it and say, “Nope, you're not ready.” That was an autonomous system making decisions.
If you implement an autonomous system that helps you determine the right things to do, your customers will be happier. Your people will also be happier because they won't have to focus on mundane tasks; they can focus on more intellectually demanding and context-driven work.
On top of that, it frees up time for dependent teams. Companies that start to embrace autonomy now will see more innovation and disruption. They will be able to move faster because of how R&D allocations work. There are companies out there with over $100 million in R&D, where 80% to 90% goes toward running the business. They are spending almost nothing on innovation. Embracing autonomy helps unlock those dollars and redirect them to actual growth, not just keeping the business alive.
Subha (Wipro): We have 250,000 employees globally, and a substantial portion of our costs goes into this employee base. In addition to the $12 million we generate from services, we have $450 million in annual recurring revenue from platforms.
To address these challenges, we had to create constraints or "starve the RTB" (Run the Business). This has led to ruthless prioritization of our RTB efforts. The savings we achieve are then reinvested into our internal core, which we refer to as our “core AI platform business”. Essentially, this is a generative AI platform we are orchestrating across multiple models, including some that are being developed by our R&D team. These models are tailored for specific tasks; for instance, some are more effective at text-to-voice conversion, while others excel in image processing.
We recently conducted a beta release with over 5,000 employees, and alongside the RTB reduction, we are also driving additional gains in other use cases, starting with HR. While this discussion may not focus on autonomy and infrastructure, it is related in principle.
For example, a significant portion of our costs was tied to background checks and hiring processes. Previously, it would take us 7 to 10 days to conduct background checks. Now, thanks to these improvements, it only takes a couple of hours. This increase in productivity not only reduces the time required to onboard new employees but also lowers the overall cost of onboarding. This is just one example of the additional savings we are achieving.
Jigar (Sisu): Automation can also serve as a business enabler. Complex business tasks can be solved using automation. At my current startup, a company in Europe said, “We need deployment in London, or we are not onboarding onto your platform.” Because we had automation, we could spin up a new instance just for them and serve them there.
There are many examples where your investment in automation can help grow business and not just save cost.
Tim (Sierra Ventures): Give ideas on where you think technology like Sedai can go. What is that L6 you talked about?
Matt (KnowBe4): As a customer, I would love some solutions for my ever-increasing CloudFront spend, which just keeps going up every time we gain more customers. The same goes for Aurora, RDS, and S3 spend, where these unbounded or provisioned environments continue to grow as you acquire more customers.
I've defined a specific backend data store to be a specific size, and changing that means downtime for my customers. Once you push the limits of the compute resources in the cloud, it becomes very difficult to manage. This is going to require more creative solutions, not all of which are immediately obvious. If you sit down and think about how you would address this with RDS, it presents a challenge.
Shibu (Geodis): The L6 would be to take tools like Sedai to the edge, i.e., a scaled-down Sedai that looks at only a few automation signals. It may require restarting some services before it happens.
For example, if a conveyor goes down, it takes two or three hours to bring it back. How can we reduce that at the edge? We just need L2, so that someone can spark the battery again and get things going. That would give us a lot of benefit, because if a conveyor goes down, the entire employee base stays put. That's a big cost.
Subha (Wipro): In use cases like Sedai and infrastructure, you need higher precision. Sedai can grow significantly by creating LLM-like or transformer-like models for the infrastructure space, depending on the kinds of data you see.
Tim (Sierra Ventures): In an LLM and GenAI world, infra stacks are going to be rethought. Workloads are CPU-GPU hybrids. What are the autonomous opportunities in this realm?
Jigar (Sisu): There is a lot to be done. If you look at a typical GenAI lifecycle, there are three phases. One phase is data cleaning and data preparation, and I was so happy to see that Sedai is going to handle data platforms because it's a significant part of the cost, and there are a lot of optimization opportunities in just data prep space.
The second part of this is how you train the models. Whether you're using a generative AI model, such as an LLM that is available to you or open source, or you're developing your own model, training is super expensive. You are working with thousands of GPUs, or using thousands of GPUs in the cloud, which is also super expensive. The way we utilize resources for training models is probably a decade behind how we use production resources. Techniques and optimizations have not been applied to optimize GPU usage.
Then, the last phase is how you serve it. Inference is a massive cost. There is a cost difference between GPT-3.5 and GPT-4. There is a significant opportunity to trim down the models so that you don’t have to serve these giant models for inference. This presents an opportunity that I generally refer to as the MLOps space, which includes everything from data preparation to training the model to serving the model. Sedai has the potential to become a billion-dollar business by addressing this new wave of developments.