Artificial intelligence is now going mainstream, thanks in part to the recent explosive growth of generative AI. Trending public platforms like ChatGPT and Google Bard have opened the AI door to all, automatically generating all kinds of new content, even computer program code. In parallel, enterprises also see opportunities for business advantage by running their own AI applications on their own systems.
AI can indeed help you leverage the data in your organization to improve business outcomes. But your AI workloads must also run with the right performance and at the right price. This article explains why AI applications are challenging for data centers, which types of data centers are suited for which AI workloads, and how you can move smoothly to AI-driven business.
Power-Hungry AI Processors
AI applications must often compute across many data points simultaneously to achieve results. The matrix calculations involved are better handled by GPUs (graphics processing units) than by CPUs (central processing units) with a traditional architecture. However, GPUs are also power-hungry. Where a system based on standard CPUs might not exceed a power draw of 5 kW, a system with GPUs may need a power supply of 30 kW or more. Thus, compared with other business computing, AI training and inferencing require more compute power, more electrical power, and higher-performance cooling solutions.
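As a rough illustration of that gap, the back-of-the-envelope comparison below uses the example figures above (5 kW and 30 kW are illustrative values from this article, not measurements of any specific hardware), and assumes nearly all electrical power ends up as heat the cooling system must remove:

```python
# Back-of-the-envelope rack power comparison.
# The kW figures are the article's illustrative values, not vendor specs.

CPU_RACK_KW = 5.0    # typical CPU-based business system
GPU_RACK_KW = 30.0   # GPU-accelerated AI system

# Essentially all electrical power is dissipated as heat,
# so the cooling plant must remove the difference.
extra_heat_kw = GPU_RACK_KW - CPU_RACK_KW
ratio = GPU_RACK_KW / CPU_RACK_KW

print(f"A GPU rack draws {ratio:.0f}x the power of a CPU rack")
print(f"Cooling must remove roughly {extra_heat_kw:.0f} kW of additional heat per rack")
```

Multiplied across rows of cabinets, that sixfold difference is why AI-ready facilities need both higher per-cabinet power feeds and higher-performance cooling.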
The machine learning and neural networks behind much of AI have been around for some time, at least when measured in Internet years. Until recently, AI applications mostly made predictions based on patterns they identified in datasets. Now, generative AI can use such patterns to create new content: text, images, music, video, even computer code. But such versatility comes at a cost. The neural networks driving generative AI applications can have billions of parameters or more, and data center power requirements rise accordingly.
The AI Development-Production Divide
Different AI workloads have different data center profiles, depending on their position in the AI system lifecycle. In the first part of the cycle, AI applications must be trained on data, often in huge quantities. These AI development workloads need intensive compute power and high-bandwidth access to data sources, whether local or remote. However, they can often tolerate some variation in availability.
Ideally, the data for AI training is colocated with the application to avoid energy-intensive data transmission. In practice, AI development workloads may ingest data from sources on the Internet or from datastores at other sites. Hybrid approaches can help enterprises meet specific data security and confidentiality compliance requirements, for example by running AI applications externally to meet power needs while holding the data on their own premises.
By comparison, AI production workloads, which come later in the cycle and are destined to provide payback for the business, have different requirements. They need not only compute power but also low-latency data access, high availability, and, especially for generative AI, robust throughput for large numbers of users who may be widely distributed geographically.
Core and Edge Computing for AI
The different facets of AI development and production workloads can be managed by organizing them into core and edge computing activities.
AI development workloads may be better hosted in facilities with substantial power, flexibility, and scalability, such as colocation centers. These data center facilities, like those offered by eStruxture, correspond to a core computing model. Per-cabinet power of 30 kW or more is available, while the facilities remain environmentally friendly. High-performance cooling systems include air and liquid cooling, with constant monitoring and adjustment. Data center interconnect (DCI) ensures high-speed, low-latency transmission for accessing different data sources.
On the other hand, AI production systems may deliver better results when located closer to the groups of users they serve or the devices they manage. Minimizing data transmission lets them process data with lower latency. In this edge computing model, AI applications can also filter and pre-process data before sending it to central platforms, making more efficient use of networks.
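The filter-and-forward pattern described above can be sketched in a few lines. This is a minimal illustration only; the threshold, field names, and sensor data are hypothetical, and a real edge application would score readings with a local model rather than receive pre-computed scores:

```python
# Minimal sketch of edge-side filtering: only readings a local model has
# flagged as interesting are forwarded to the central platform, reducing
# network load. All names and values here are hypothetical.

from typing import Iterable

ANOMALY_THRESHOLD = 0.8  # hypothetical model-confidence cutoff

def filter_for_upload(readings: Iterable[dict]) -> list:
    """Keep only readings whose local anomaly score crosses the threshold."""
    return [r for r in readings if r["score"] >= ANOMALY_THRESHOLD]

readings = [
    {"sensor": "cam-1", "score": 0.95},  # anomalous: forward to core
    {"sensor": "cam-2", "score": 0.12},  # routine: drop at the edge
    {"sensor": "cam-3", "score": 0.84},  # anomalous: forward to core
]

to_upload = filter_for_upload(readings)
print(f"Forwarding {len(to_upload)} of {len(readings)} readings")
```

Here two thirds of the traffic never leaves the edge site, which is the efficiency gain the edge computing model aims for.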
Recent advances in AI models enable a device as small as a smartphone to become an edge device running AI applications locally for a user, and communicating with central AI platforms as needed. However, if large numbers of users access a platform simultaneously, for example, for generative AI, a centrally located cluster in a data center with ample power can be a better solution.
Easing into the New AI Era
While AI is increasingly popular, it has not replaced other business computing workloads, such as daily transaction processing, scheduling, inventory management, and accounting. Typically, these more traditional workloads run on systems with CPUs and lower power requirements. Indeed, CPUs may even be preferred to GPUs in such cases. Data center flexibility is therefore important to handle both AI and non-AI workloads, with their separate power and cooling needs.
Data center privacy and security remain key concerns. Software and data protection policies and solutions for non-AI applications can be applied to AI applications as well. Robust physical security in colocation data centers, with private cages and round-the-clock video monitoring, can protect both existing and new systems, backed by colocation provider guarantees for regulatory compliance.
How Will AI Workloads Evolve in the Future?
While generative AI solutions may currently serve people rather than other systems, machines will increasingly access generative AI. Expectations will continue to rise for faster response times and high-speed, low-latency data transmission. In general, AI workloads will increase in number and size, as enterprises think of more ways to use AI.
Server manufacturers will make servers less power-hungry, including through new CPU-based solutions for AI that decrease dependence on GPUs. AI models and applications will also gain in efficiency, though not necessarily at the same rate as workloads grow.
Overall, enterprises and organizations will continue to seek extra data center space, power, and cooling to expand AI usage to meet business needs and opportunities. Colocation centers like those of eStruxture offer these resources for customers to achieve greater AI performance and lower costs.
With 15 facilities across Canada totaling more than 760,000 square feet of combined data center space and an overall IT capacity of 130 megawatts, eStruxture’s data centers offer both central and edge locations that are highly connected, scalable, secure, flexible, and sustainably designed.
Contact eStruxture today and find out more about our solutions to help you run AI and other workloads successfully, now and for the future.