Hardware leading the AI revolution

New: Infrastructure is strategic again

The heady cloud-computing highs of assumed unlimited access are giving way to a resource-constrained era. After years of being treated as a utility, enterprise infrastructure (for example, PCs) is once again strategic. Specifically, specialized hardware will likely be crucial to three significant areas of AI growth: AI-embedded devices and the Internet of Things, data centers, and advanced physical robotics. While the impact on robotics may play out over the next few years, as we discuss in the next section, we anticipate that enterprises will grapple with decisions about the first two areas over the next 18 to 24 months. As long as AI hardware remains scarce and demand persists, the following areas may differentiate leaders from laggards.

Edge footprint

By 2025, more than 50% of data could be generated by edge devices.23 As NPUs proliferate, more and more devices could be equipped to run AI models without relying on the cloud. This is especially true as generative AI model providers opt for creating smaller, more efficient models for specific tasks, as discussed in “What’s next for AI?” With quicker response times, decreased costs, and greater privacy controls, hybrid computing (that is, a mix of cloud and on-device AI workloads) could be a must-have for many enterprises, and hardware manufacturers are betting on it.24
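
The hybrid split described above is ultimately a routing decision made per request. The following is a minimal sketch of how such a router might look; the class names, the 8B-parameter NPU capacity, and the latency threshold are all illustrative assumptions, not a real vendor API.

```python
# Illustrative hybrid-computing router: decide whether an AI request should
# run on-device (NPU) or in the cloud. Thresholds are assumptions for the sketch.
from dataclasses import dataclass

@dataclass
class Request:
    model_params_b: float   # model size in billions of parameters
    sensitive: bool         # does the request carry private data?
    latency_budget_ms: int  # how quickly the caller needs an answer

NPU_MAX_PARAMS_B = 8.0      # assume the device NPU serves models up to ~8B params

def route(req: Request, npu_available: bool) -> str:
    """Return 'device' or 'cloud' for a given request."""
    if req.sensitive and npu_available:
        return "device"                      # privacy: keep data off the network
    if npu_available and req.model_params_b <= NPU_MAX_PARAMS_B:
        if req.latency_budget_ms < 200:
            return "device"                  # latency: skip the network round trip
    return "cloud"                           # large models or no NPU: fall back
```

Under these assumed rules, a small model handling sensitive data stays on the device, while a 70B-parameter model is sent to the cloud regardless of NPU availability.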

According to Dell Technologies’ Mohindra, processing AI at the edge is one of the best ways to handle the vast amounts of data required. “When you consider latency, network resources, and just sheer volume, moving data to a centralized compute location is inefficient, ineffective, and not secure,” he says. “It’s better to bring AI to the data, rather than bring the data to AI.”25

One major bank predicts that AI PCs will account for more than 40% of PC shipments in 2026.26 Similarly, nearly 15% of 2024 smartphone shipments are predicted to be capable of running LLMs or image-generation models.27 Alex Thatcher, senior director of AI PC experiences and cloud clients at HP, believes that the refresh in devices will be akin to the major transition from command-line inputs to graphical user interfaces that changed PCs in the 1990s. “The software has fundamentally changed, replete with different tools and ways of collaborating,” he says. “You need hardware that can accelerate that change and make it easier for enterprises to create and deliver AI solutions.”28 Finally, Apple and Microsoft have also fueled the impending hardware refresh by embedding AI into their devices this year.29

As choices proliferate, good governance will be crucial, and enterprises must ask: How many of our people actually need next-generation devices? Chip manufacturers are in a race to improve AI horsepower,30 but enterprise customers can’t afford to refresh their entire edge footprint with each new advancement. Instead, they should develop a strategy for tiered adoption, directing these devices to where they can have the most impact.

Build versus buy

When deciding whether to buy or rent specialized hardware, organizations typically weigh their cost model over time, the expected duration of use, and the need to stay current as hardware advances. AI, however, adds another layer of competitive pressure to this decision. With hardware like GPUs still scarce and the market clamoring for AI updates from every organization, many companies have been tempted to rent as much computing power as possible.

Organizations may struggle to take advantage of AI if they don’t have their data enablement in order. Rather than scrambling for GPUs, it may be more efficient to first understand where the organization is ready for AI. Some areas involve private or sensitive data, and investing in NPUs can keep those workloads offline; others may be fine for the cloud. Thanks to the lessons of the past decade of cloud, enterprises know that the cost of runaway models running on runaway hardware can balloon quickly.31 Pushing these costs into operating expenditure may not be the best answer.

Some estimates even say that GPUs are underutilized.32 Thatcher believes enterprise GPU utilization is only 15% to 20%, a problem that HP is addressing through new, efficient methods: “We’ve enabled every HP workstation to share its AI resources across our enterprise. Imagine the ability to search for idle GPUs and use them to run your workloads. We’re seeing up to a sevenfold improvement in on-demand computing acceleration, and this could soon be industry standard.”33
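
The idle-GPU search Thatcher describes can be approximated with standard tooling: NVIDIA’s `nvidia-smi` utility reports per-GPU utilization, and any GPU below a chosen threshold can be treated as available for sharing. The sketch below shows the idea; the 20% idle threshold is an assumption, and any fleet-wide scheduling on top of it is left out.

```python
# Hedged sketch: find idle GPUs by polling utilization with nvidia-smi.
# The idle threshold is an illustrative assumption, not an HP or NVIDIA default.
import subprocess

def parse_idle_gpus(csv_text: str, max_util_pct: int = 20) -> list[int]:
    """Parse 'index, utilization' CSV lines and return indices of idle GPUs."""
    idle = []
    for line in csv_text.strip().splitlines():
        index, util = (field.strip() for field in line.split(","))
        if int(util) <= max_util_pct:
            idle.append(int(index))
    return idle

def query_local_gpus() -> str:
    """Ask nvidia-smi for per-GPU utilization (requires an NVIDIA driver)."""
    return subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True,
    )
```

For example, given the output `"0, 5\n1, 87\n2, 12"`, the parser would flag GPUs 0 and 2 as idle and leave the busy GPU 1 alone.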

In addition, the market for AI resources in the cloud is ever-changing. For instance, concerns around AI sovereignty are mounting globally.34 While companies around the world have been comfortable running their e-commerce platforms or websites on American cloud servers, the relevance of AI to national intelligence and data management makes some hesitant to place AI workloads overseas. This opens a market for new national AI cloud providers and private cloud players.35 GPU-as-a-service startups offer a further alternative to hyperscalers.36 The market for rented compute may soon become more fragmented, which could give enterprise customers more options.

Finally, AI may be top of mind for the next two years, but today’s build versus buy decisions could have impacts beyond AI considerations. Enterprises may soon consider using quantum computing for the next generation of cryptography (especially as AI ingests and transmits more sensitive data), optimization, and simulation, as we discuss in “The new math.” 

Data center sustainability

Much has been said about the energy use of data centers running large AI models. Major bank reports have questioned whether we have the infrastructure to meet AI demand.37 The daily power usage of major chatbots has been equated to the daily consumption of nearly 180,000 US households.38 In short, AI requires unprecedented resources from data centers, and aging power grids are likely not up to the task. While many companies may be worried about getting their hands on AI chips like GPUs to run workloads, sustainability may well be a bigger issue.

Multiple advancements that aim to make AI more sustainable are currently underway. Enterprises should watch these areas over the next two years when considering data centers for AI (figure 2):

  • Renewable sources: Pressure is mounting on data center and cloud AI providers to find sustainable energy sources—and the rapidly growing focus on AI may help transition the overall economy to renewables.39 Major tech companies are already exploring partnerships with nuclear energy providers.40 Online translation service DeepL hosts a data center in Iceland that’s cooled by the naturally frigid air and is fully powered by geothermal and hydroelectric power.41 And in El Salvador, companies are even exploring how they could power data centers with volcanoes.42
  • Sustainability applications: While building AI consumes a lot of energy, applying AI can, in many cases, offset some of these carbon costs. AI is already being used to map and track deforestation, melting icebergs, and severe weather patterns. It can also help companies track their emissions and be more efficient in using data centers.43
  • Hardware improvements: New GPUs and NPUs have already saved energy and cost for enterprises, and innovation is not stalling. Intel and GlobalFoundries recently unveiled new chips that can use light, rather than electricity, to transmit data.44 This could revolutionize data centers, enabling reduced latency, more distributed construction, and improved reliability. While this fiber-optic approach is expensive now, costs may come down over the next couple of years, allowing this type of chip to become mainstream.
