The Race Begins For Much Bigger Abstractions In Data Centers
Massive compute capabilities enable a whole new way of manipulating and using data, and a potential bonanza for AI data centers.
Data center build-out is enabling much larger and more complex abstractions.
Competition is building for digital/virtual twins across multiple industry segments, including automotive, aerospace, and chip manufacturing.
AI, and particularly AI agents, will play a significant role in sorting through data to find potential trouble spots.
The frenzy of new data center construction has opened the door to much larger and higher-level abstractions, including massive multi-physics simulations and real-time monitoring of everything from entire cities to potential interactions between software updates in complex systems of systems.
Abstractions have a long history in the chip industry. Languages such as Python and Rust, customizable AI agents, and integrated EDA platforms that can support other vendors’ tools are well established in system design. What’s changing is the ability to extend all of those across different data sources, sometimes even outside their core markets, and to make changes to them as needed. That is largely a function of compute resources, and the data center build-out has removed any barriers that previously existed.
The automotive market serves as a good model for this. As carmakers migrate from discrete electronic control units (ECUs) to software-defined zonal architectures and components, it becomes more difficult for OEMs and their suppliers to differentiate themselves in a crowded market using traditional mechanical-electrical technology. It’s no longer just about the time it takes to accelerate from 0 to 60 mph. Increasingly, the key differentiators are range per charge for BEVs and PHEVs, which OEM has the most up-to-date ADAS features or automated driving precursors, and an endless stream of electronic cabin accoutrements.
The challenge now is to make sure all these systems work as expected, individually and in concert with each other, and over time under extreme conditions. That requires a higher level of abstraction than a vehicle alone can provide at a reasonable cost. This effectively is the new demarcation line, where real-time, vehicle-specific decisions are made inside the car or truck, while multi-vehicle interactions and the integration of systems and new software are handled by data centers, increasingly leveraging some type of AI.
“Currently, we have anywhere from 60 to 120 ECUs, all from different companies, all running different software, and all expecting to plug together and work the first time,” said David Fritz, vice president of hybrid-physical and virtual systems at Siemens EDA. “We’re creating a reference design abstract where you can take it apart, learn it, and wrap teams around it. You can adjust it to be your own and go from there, replace our stack with your stack, whatever you want to do. This isn’t a new idea. It just turns out to be new for automotive.”
All of the big EDA vendors and their largest customers are pointed in the same direction — map all the possible connections in hardware and software, make it flexible enough to support new features and tools, and add enough openness so it can be used by different vendors and different combinations of components.
“Not every vehicle is exactly the same,” said Fritz. “You may use second-source parts for some, and those may have minor differences. Or maybe one person buys the exact same vehicle as somebody else, but they have a problem with some equipment and it has to be replaced two years after purchasing the car. That equipment behaves differently than the other car that didn’t need a replacement. It could be as simple as a different brake-pad depth or as sophisticated as a printed circuit board. And any of those could have an impact on the performance of a vehicle. With a digital twin, I can cycle through all the different possible combinations of what is out there on the road. All of that flows directly into a digital twin, so you can literally automate the testing for every vehicle combination in the field. And you know these things well in advance, as opposed to, ‘Let’s do an over-the-air update and keep our fingers crossed.'”
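Fritz’s point about cycling through every fielded configuration amounts to a combinatorial sweep over the sources of variation. The following is a minimal Python sketch of that idea, assuming hypothetical variation sources and a stand-in simulate_update() function rather than any vendor’s actual twin API.

```python
# Minimal sketch: sweeping vehicle configuration combinations through a twin
# before an over-the-air update. All names here are illustrative, not any
# vendor's actual API.
from itertools import product

# Hypothetical variation sources: second-source parts, replaced equipment,
# and wear items such as brake pads.
BRAKE_PAD_DEPTH_MM = [12.0, 8.0, 4.0]          # new, worn, near replacement
ECU_BOARD_REVISION = ["rev_a", "rev_b"]        # original vs. second-source PCB
SENSOR_SUPPLIER    = ["supplier_x", "supplier_y"]

def simulate_update(config: dict) -> bool:
    """Stand-in for running the OTA candidate against a vehicle twin.
    A real flow would drive a full electronics/multi-physics model here."""
    # Toy pass/fail rule purely for illustration.
    return not (config["brake_pad_mm"] < 5.0 and config["board"] == "rev_b")

failures = []
for pad, board, sensor in product(BRAKE_PAD_DEPTH_MM, ECU_BOARD_REVISION, SENSOR_SUPPLIER):
    config = {"brake_pad_mm": pad, "board": board, "sensor": sensor}
    if not simulate_update(config):
        failures.append(config)

print(f"{len(failures)} configuration(s) flagged before the OTA rollout:")
for cfg in failures:
    print(" ", cfg)
```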
This is a natural extension of what EDA vendors have been building toward for more than a decade, gaining a much bigger foothold inside different industry sectors as more data is collected and needs to be processed. And it provides sufficient flexibility to add future features, such as connecting to smart infrastructure whenever and wherever it becomes available. To make that formula work, chips need to be faster and more power-efficient, and more steps need to be automated to speed time to market. But most important, all of this needs to work together as one virtual system, which is why the whole chip industry is migrating toward digital and virtual twins.
The goal here is to extend these abstractions into real time. “A digital twin lets the industry imagine new things,” said Tom De Schutter, senior vice president of product management at Synopsys. “The distinction between simulation and a digital twin is that a simulation is only used until you actually have a device. A digital twin extends beyond the device or system that actually exists. It allows you to do simulations or emulations while it’s functioning. Over-the-air updates are a great example of this, but it’s even more than that. It is continuously looking at and developing the interaction of a car, for example, with its infrastructure and content. In automotive, there’s an electronics digital twin, a multi-physics digital twin, and an environmental digital twin. It’s the interaction between the three of them that becomes interesting because you want to constantly make updates.”
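To make the three-twin interaction concrete, here is a minimal sketch, assuming toy placeholder models for the electronics, multi-physics, and environmental twins that exchange state each simulated hour. None of this reflects Synopsys tooling; the coefficients and thresholds are invented for illustration.

```python
# Minimal sketch of the three-twin interaction described above: an electronics
# twin, a multi-physics twin, and an environmental twin exchanging state each
# step. The models are placeholders, not real solvers.

class ElectronicsTwin:
    def step(self, ambient_c: float) -> float:
        # Pretend compute load rises when the electronics work harder
        # in hot weather (e.g., more cooling control activity).
        return 40.0 + 0.5 * max(ambient_c - 20.0, 0.0)   # watts dissipated

class MultiPhysicsTwin:
    def step(self, watts: float, ambient_c: float) -> float:
        # Crude thermal model: package temperature tracks power and ambient.
        return ambient_c + 0.8 * watts                    # junction temp, C

class EnvironmentTwin:
    def step(self, hour: int) -> float:
        # Toy diurnal ambient-temperature profile.
        return 15.0 + 10.0 * (1 if 10 <= hour <= 16 else 0)

elec, phys, env = ElectronicsTwin(), MultiPhysicsTwin(), EnvironmentTwin()
for hour in range(24):
    ambient = env.step(hour)
    power = elec.step(ambient)
    junction = phys.step(power, ambient)
    if junction > 55.0:
        print(f"hour {hour:02d}: junction {junction:.1f} C exceeds limit, "
              f"schedule a thermal-policy update")
```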
This isn’t confined just to automotive. The digitization of data and the connectivity of just about everything electronic create an almost infinite stream of potential interactions that can affect everything from performance and power to signal integrity.
“In planes, there’s a lot of turbulence on takeoff and landing,” said Anirudh Devgan, CEO of Cadence, in a recent presentation. “Traditional systems can only simulate 20% of the flight envelope. That’s why they do all these wind tunnels, which are very expensive. And in biology, only a few percent of drug discovery works. So there’s a massive need for accurate digital twins, especially in physical AI, and the corresponding silicon that’s going to drive it. Our digital twin for the data center can simulate entire data centers. This is a non-traditional product, but it has become super important these days. This is applying CFD (computational fluid dynamics) and simulation and AI to the data center. The thing to remember about data centers is it’s not only these big cloud companies that have huge data centers. There are also a lot of data centers in the enterprise. We applied it to our own data center and got 10% better power.”
Digital twins vs. virtual twins
As the market for massive real-time combinatorial analytics takes shape, naming conventions are beginning to diverge. For example, Lam Research differentiates between a digital twin, which uses a data model to make a prediction, and a virtual twin, which has a governing function overlying physical constraints or some function that is validated or calibrated with data.
“A real virtual twin often has multiple layers to it,” said David Fried, corporate vice president at Lam Research, in a recent video. “There’s not one virtual twin to the whole thing. If I want to make a virtual twin of a city, I have to think of the applications. What am I trying to solve with that virtual twin? If I want to fly a drone around a city, I really need to understand where the buildings are. So the virtual twin of the city needs to have the topography of the different buildings. However, if I’m trying to drive through the city, that information doesn’t matter. What I need to understand are the roads and the map of the roads around that city. It’s a totally different set of data and a different data structure to solve a different problem. And if I want to take public transportation like subways, it’s a different set of data altogether.”
The key here is understanding which data is needed when. “Each one of these can be a layer of exactly the same city, the same virtual twin, but I need to connect these different layers and very different data structures to one ground truth,” Fried said. “The real city is, in reality, one city, and the roads and the buildings and the subways all need to connect somehow to that reality. You can’t have a road on your street map go through a building on your topography map. There has to be some connection between these layers, but they’re different for set applications.”
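A minimal sketch of that layering idea follows, assuming invented parcel IDs and layer classes: each application-specific layer carries only the data it needs, while a consistency check ties the layers back to one shared ground truth.

```python
# Minimal sketch of a layered virtual twin: application-specific layers
# (building heights for drones, a road graph for driving) referencing one
# shared ground truth so the layers cannot drift apart. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class GroundTruth:
    """The single 'real city' every layer must agree with."""
    parcels: dict = field(default_factory=dict)    # parcel_id -> land use

@dataclass
class TopographyLayer:
    """What a drone-routing application needs: building heights."""
    heights_m: dict = field(default_factory=dict)  # parcel_id -> height

@dataclass
class RoadLayer:
    """What a driving application needs: a road adjacency graph."""
    roads: dict = field(default_factory=dict)      # parcel_id -> connected parcels

def check_consistency(truth: GroundTruth, topo: TopographyLayer, road: RoadLayer):
    """Reject a road segment routed through a parcel the topography says is a building."""
    for parcel in road.roads:
        if truth.parcels.get(parcel) == "building" and topo.heights_m.get(parcel, 0) > 0:
            raise ValueError(f"road layer routes through building parcel {parcel}")

truth = GroundTruth(parcels={"p1": "building", "p2": "road", "p3": "road"})
topo = TopographyLayer(heights_m={"p1": 40.0})
road = RoadLayer(roads={"p2": ["p3"], "p3": ["p2"]})
check_consistency(truth, topo, road)   # passes; adding "p1" to roads would fail
```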
Why now?
EDA vendors have been seriously looking to take on a larger role in adjacent markets for at least a dozen years, leveraging the shift toward increasingly domain-specific and heterogeneous designs. Companies have been investing heavily in data analytics, AI, and scalable multi-physics simulations. Digital and virtual twins are the culmination of much of that, using all of those capabilities to sort through interactions in data collected from a variety of sources and provide real-time, or nearly real-time, adjustments as needed.
“All of that seems possible when you have something that is very data-driven, like large AI models, which allow you to find needles in the haystack,” said Nigel Drego, CTO at Quadric. “And when you have large amounts of data, you really need to be able to find those needles. The corner cases are going to be found more quickly because you have large datasets that have been used to train, so you get the benefit of knowing a lot more about what kinds of corner cases could exist and how they might manifest.”
The ability to combine data from multiple sources using large language models is a potential game-changer here. The LLMs can act as a kind of middleware glue for sharing different types of data. This allows carmakers, for example, to sandbox different applications from different vendors and see how they interact, especially as systems age or as new updates are installed. On a much larger scale, it can allow cities to monitor how different vehicles are behaving, and to automatically route those vehicles for optimum traffic flow. That can even become a self-sustaining financial model, where drivers pay more to get to their destination faster.
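A minimal sketch of that middleware-glue idea, assuming two invented vendor record formats and a placeholder query_model() function standing in for whatever LLM or analytics service is actually used:

```python
# Minimal sketch of the 'middleware glue' idea: records from different vendors
# arrive in different shapes, get normalized into one schema, and the combined
# view is handed to whatever model does the cross-application analysis.
# Field names and the query_model() stub are hypothetical.

def from_vendor_a(rec: dict) -> dict:
    return {"vehicle_id": rec["vin"], "speed_kph": rec["spd"] * 1.609,
            "source": "vendor_a"}

def from_vendor_b(rec: dict) -> dict:
    return {"vehicle_id": rec["vehicle"], "speed_kph": rec["speed_kph"],
            "source": "vendor_b"}

def query_model(prompt: str) -> str:
    """Placeholder for an LLM or analytics call; not a real API."""
    return f"[analysis of {prompt[:40]}...]"

raw = [
    ({"vin": "V1", "spd": 42.0}, from_vendor_a),
    ({"vehicle": "V2", "speed_kph": 61.0}, from_vendor_b),
]
normalized = [adapter(rec) for rec, adapter in raw]
prompt = f"Flag interactions or anomalies across: {normalized}"
print(query_model(prompt))
```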
“The real challenge here is that complex systems beget their own kind of complexities,” Drego said. “And so the challenge will be that, yes, these complex systems might be able to solve some of these corner cases, identify them, and create solutions for them. But what other kinds of problems are these complex systems creating for themselves? There will be a lot of back and forth here, and the question will be whether we can narrow down the space in which these models work so that we understand the complexities to the point that we can control it, if necessary. That’s going to be the challenge. Can we control a complex system that can itself control complex systems?”
New role for AI agents
Much of that will depend on the evolution of AI agents, which will play an increasing role in digital twins because they can continuously search massive amounts of data for anomalies. This is particularly important in manufacturing, where variation can impact long-term reliability.
“AI agents are actually the best way to implement digital twins,” said William Wang, CEO of ChipAgents. “The concept is very abstract, while AI agents are very practical. You’re running the tools and your existing methodology using AI agents. We see that happening in the industry. A year ago, a lot of companies were trying to build a copilot inside some of these tools, but then they realized you couldn’t really accomplish much. But you can use AI agents to do more complicated tasks. They can run regressions, do root cause analysis, and debugging that can run for a long time. So the future is having AI agents running in the background.”
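A minimal sketch of that background-agent pattern, assuming an invented run_regression() stub and simple in-process queues in place of a real regression farm and triage system:

```python
# Minimal sketch of agents running in the background: a loop that keeps
# pulling regression results, flags failures, and queues a root-cause task.
# The queues and names are illustrative only.
import queue
import random

regression_results = queue.Queue()
root_cause_tasks = queue.Queue()

def run_regression(test_id: int) -> dict:
    """Stand-in for launching a real regression job."""
    return {"test_id": test_id, "passed": random.random() > 0.2}

def agent_loop(num_tests: int = 10):
    for test_id in range(num_tests):
        regression_results.put(run_regression(test_id))
    while not regression_results.empty():
        result = regression_results.get()
        if not result["passed"]:
            # A real agent would attach logs, waveforms, and a triage prompt here.
            root_cause_tasks.put(result["test_id"])
    print(f"{root_cause_tasks.qsize()} failure(s) queued for root-cause analysis")

agent_loop()
```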
This is starting to be deployed inside chip manufacturing as a way of improving yield, which means the chips are within spec and more likely to be reliable in the field. Where digital twins fit into the picture is the ability to virtualize front-end and back-end processes to get a much larger view of what’s happening and where variation is beginning to creep in.
“There are another 20 layers or so in the fab after you have the GDSII, and then they actually start the process,” Wang said. “The fab basically buys one machine from each of these major manufacturers, and then they have all the data there, but it’s not connected because the people in the rooms are chemical engineers and physicists. There are no computer scientists in the room. So how you would make sense of that data and be able to connect the data with the design side will be very important. And all of that has to be connected with testing and post-silicon testing. This will fundamentally change the industry, allowing us to look at the default parameters and optimize those with AI agents to get better default parameters for all the factory processes.”
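As a toy illustration of what optimizing default parameters against joined fab and test data might look like, the sketch below groups invented per-lot yields by one hypothetical knob (etch time) and picks the best-performing default. A real flow would involve far more parameters, physics models, and statistical care.

```python
# Toy example: once fab, design, and test records are joined by lot, a simple
# search over one process knob can suggest a better default setting.
# The data values and the single-knob search are purely illustrative.
lots = [
    {"lot": "A1", "etch_time_s": 58, "yield_pct": 91.2},
    {"lot": "A2", "etch_time_s": 60, "yield_pct": 93.5},
    {"lot": "A3", "etch_time_s": 62, "yield_pct": 92.1},
    {"lot": "A4", "etch_time_s": 60, "yield_pct": 94.0},
]

# Group yields by the candidate default setting and pick the best average.
by_setting = {}
for rec in lots:
    by_setting.setdefault(rec["etch_time_s"], []).append(rec["yield_pct"])

best = max(by_setting, key=lambda s: sum(by_setting[s]) / len(by_setting[s]))
print(f"suggested default etch time: {best}s "
      f"(avg yield {sum(by_setting[best]) / len(by_setting[best]):.1f}%)")
```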
Similarly, in automotive, all the pieces need to come together, and AI can help make sense of the data compiled for these mega abstractions. “There are so many people still struggling with the concept of software-defined vehicles, but in reality it should be AI-defined vehicles,” said Siemens’ Fritz. “What’s really interesting is the agentic part of this is yet another abstraction over the top of a service-oriented architecture.”
Conclusion
How this plays out remains to be seen, particularly as edge AI takes root. That will define what gets processed locally, on or near the edge device, and what role or shape a digital/virtual twin ultimately takes.
“It probably will become a kind of software abstraction layer in the end,” said Synopsys’ De Schutter. “It’s how you allow for customization of specific tasks. There needs to be some software abstraction where you say, ‘Okay, this software makes sure that whatever I run on it still optimizes it for the underlying compute.'”
That will require massive amounts of compute power, as well as a fair level of trust in the technology. “To some extent, it’s how willing we are to give control to a bigger orchestration to allow for optimizations,” said De Schutter. “Abstraction at the city level is already happening in pieces. If you had asked people 30 years ago if it would be okay to share all of that information to a database that anyone could access, everybody would have said no. But that’s exactly what social media is. You’re basically sharing your information in a public way in a database that can be accessed by everyone. The abstraction involves how comfortable we feel sharing versus the value we get out of this. As long as the value is safer cities or faster commutes to work, or sleeping while the car drives itself, then that abstraction and our willingness to let go of certain principles that we had before is going to evolve.”