AI models with trillions of parameters are quickly becoming the norm, pushing the boundaries of what’s possible in artificial intelligence. Yet while these powerful models hold promise for numerous industries, running them at the edge of networks, where real-time performance and low latency are critical, remains a significant challenge.
How can massive AI models operate efficiently in distributed environments without overwhelming the network? This is the “shrinking giant” dilemma.
In this report, Techopedia speaks with experts from IBM and NTT to explore solutions for deploying AI at the edge in a way that balances power and efficiency.
Key Points:
- Large AI models are often too resource-intensive for edge devices.
- Smaller, more efficient AI models are key to making AI work at the edge.
- Sectors like manufacturing need practical AI solutions that can function in decentralized environments.
- Successful edge AI deployment relies on breaking down silos and fostering collaboration, a shift that is already underway.
Edge Infrastructure and Smaller AI Models: A Practical Solution?
As computing power shifts from data centers to the edge, edge computing is expected to transform operations across industries. Global spending on edge computing is projected to hit $232 billion in 2024, a 15.4% increase over 2023, largely driven by AI.
Techopedia spoke with Paul Bloudoff, Senior Director of Edge Services at NTT DATA, who emphasized the role of smaller, more efficient AI models in edge environments. These models are designed to be use-case specific, making them easier to deploy and manage.
“When we look at bringing AI to the edge, we need to understand the business use case and what organizations are trying to accomplish. Not every deployment will need monolithic AI models to support real-life use cases,” said Bloudoff.
He explained that lightweight AI models running on edge devices can benefit IT and Operational Technology (OT) teams in industries such as manufacturing. For example, on factory floors, AI can integrate data from IoT devices to track machine vibration, temperature, and wear, helping to enhance predictive maintenance and reduce costly downtime.
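The kind of lightweight, use-case-specific logic Bloudoff describes can be surprisingly simple. The sketch below illustrates an edge-side predictive-maintenance check on vibration, temperature, and wear readings; the sensor fields and alert thresholds are illustrative assumptions, not NTT's actual system.

```python
# Minimal sketch of an edge-side predictive-maintenance check.
# Thresholds and fields are hypothetical; real deployments would
# calibrate them per machine class.

from dataclasses import dataclass

@dataclass
class MachineReading:
    vibration_mm_s: float   # RMS vibration velocity
    temperature_c: float    # bearing temperature
    wear_pct: float         # estimated component wear

VIBRATION_LIMIT = 7.1     # illustrative severity boundary
TEMPERATURE_LIMIT = 85.0
WEAR_LIMIT = 80.0

def maintenance_flags(reading: MachineReading) -> list[str]:
    """Return human-readable flags so OT teams can schedule maintenance early."""
    flags = []
    if reading.vibration_mm_s > VIBRATION_LIMIT:
        flags.append("vibration above limit")
    if reading.temperature_c > TEMPERATURE_LIMIT:
        flags.append("overheating")
    if reading.wear_pct > WEAR_LIMIT:
        flags.append("wear nearing end of life")
    return flags

print(maintenance_flags(MachineReading(9.3, 78.0, 40.0)))
```

A rules-plus-small-model pipeline like this runs comfortably on a factory-floor gateway, which is exactly why a "million-dollar AI platform" is unnecessary for the use case.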
“To accomplish this (use case), organizations will not need a million-dollar AI platform to process this kind of information,” added Bloudoff. “For many organizations, the resource investment to bring monolithic AI solutions to the edge is too complex and costly.”
TinyAI Models: The Future of AI at the Edge
Researchers are advocating for “TinyAI” models as the solution to the challenge of transitioning massive AI models from data centers to edge environments. TinyAI models are purpose-built for specific use cases, making them more cost-effective and energy-efficient—an approach that aligns with sustainability goals in industries such as healthcare, agriculture, and urban development.
Nick Fuller, Vice President of AI and Automation at IBM Research, told Techopedia that for edge workloads requiring low latency, on-device inferencing is not just desirable, but essential.
“To this end, serving ‘small’ foundation models on such devices (robotic arms, mobile devices, cameras, etc.) facilitates on-device inferencing within the budget (memory, compute, accelerator) constraints of such devices,” said Fuller.
While larger models can still be trained in the cloud or on-premises, lightweight models are increasingly appealing for edge deployments where latency and real-time decision-making are critical.
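The memory budget Fuller mentions is often the deciding constraint. A back-of-the-envelope sketch (all figures assumed for illustration) shows why small, quantized models are the ones that fit on-device:

```python
# Rough weight-memory estimate for a model on an edge device.
# Parameter counts, byte widths, and the 4 GB budget are assumptions
# for illustration; activations and caches are ignored.

def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

DEVICE_BUDGET_GB = 4.0  # e.g. a camera or gateway with 4 GB RAM

for name, params, bpp in [
    ("7B fp16", 7.0, 2.0),   # far too large for this device
    ("3B int8", 3.0, 1.0),   # borderline
    ("1B int4", 1.0, 0.5),   # comfortable fit
]:
    need = model_memory_gb(params, bpp)
    verdict = "fits" if need <= DEVICE_BUDGET_GB else "exceeds budget"
    print(f"{name}: ~{need:.1f} GB -> {verdict}")
```

The arithmetic makes the trade-off concrete: a 7B-parameter model at 16-bit precision needs roughly 13 GB for weights alone, while a quantized 1B model fits in well under a gigabyte.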
Overcoming Edge Computing Challenges
One of the main challenges of edge infrastructure is the limited computing power, data storage, and memory available at the edge. Paul Bloudoff of NTT explained that organizations often struggle with communication silos between machines and devices across their networks, especially when different manufacturers’ equipment is involved.
“Think of it as different individuals speaking different languages who are all providing data that needs to be collected, analyzed, processed, and transformed into action,” said Bloudoff.
NTT’s Edge AI platform addresses this issue by using a software layer that automatically discovers devices and unifies data from across an organization’s IT and OT environments. This allows for a comprehensive diagnostic report that AI-powered solutions can act on, helping to break down silos and connect stakeholders across different teams.
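The "many languages, one report" idea can be sketched as per-vendor adapters that normalize raw telemetry into a single schema. The field names and vendor formats below are hypothetical, not NTT's actual platform API:

```python
# Sketch: unify heterogeneous device telemetry into one schema.
# Vendor formats and field names are invented for illustration.

def from_vendor_a(raw: dict) -> dict:
    # Vendor A reports temperature in Fahrenheit under "tempF"
    return {"device_id": raw["id"], "temperature_c": (raw["tempF"] - 32) * 5 / 9}

def from_vendor_b(raw: dict) -> dict:
    # Vendor B already uses Celsius, but with different key names
    return {"device_id": raw["serial"], "temperature_c": raw["temp_celsius"]}

ADAPTERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}

def unify(messages: list[tuple[str, dict]]) -> list[dict]:
    """Route each raw message through its vendor's adapter."""
    return [ADAPTERS[vendor](raw) for vendor, raw in messages]

report = unify([
    ("vendor_a", {"id": "press-1", "tempF": 212.0}),
    ("vendor_b", {"serial": "lathe-7", "temp_celsius": 41.5}),
])
print(report)  # both records now share the same schema
```

Once every machine "speaks" the same schema, downstream AI models can analyze the whole floor rather than one vendor's silo.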
“Once you break down silos and connect with stakeholders across different teams, you will have a better understanding of your processing power and memory needs,” Bloudoff added. “Bigger isn’t always better.”
IBM Fellow Catherine Crawford echoed these sentiments, highlighting the need for smaller AI models in edge environments.
“There are multiple existing use cases that can leverage edge using smaller AI models where the technical challenges still exist and assets can be developed for distributed, secure Edge to Cloud continuous development AIOps,” she said.
Crawford emphasized that ongoing research is focused on creating task-tuned, smaller AI models for edge use cases, taking into account constraints such as compute power, memory, storage, and battery life.
“For instance, perhaps having multiple, tuned smaller models on-board devices with hierarchical inferencing algorithms will be sufficient for edge use cases. AI algorithms and corresponding models will continue to evolve, and with the focus in our industry on sustainability, these models will quickly evolve to take fewer resources, becoming more appropriate for edge systems.”
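The hierarchical inferencing Crawford describes can be sketched as a confidence cascade: a tiny on-device model answers when it is sure, and only the hard cases are escalated to a larger model. Both "models" below are stand-in functions, purely illustrative:

```python
# Sketch of a two-tier inference cascade. The models are placeholder
# functions with assumed behavior, used only to show the control flow.

from typing import Callable, Tuple

Model = Callable[[str], Tuple[str, float]]  # returns (label, confidence)

def tiny_model(x: str) -> Tuple[str, float]:
    # Confident only on the patterns it was tuned for (assumed behavior)
    return ("normal", 0.95) if "ok" in x else ("unknown", 0.40)

def large_model(x: str) -> Tuple[str, float]:
    return ("anomaly", 0.90)  # pretend the big model resolves the rest

def cascade(x: str, small: Model, big: Model, threshold: float = 0.8) -> str:
    label, conf = small(x)
    if conf >= threshold:
        return label          # answered on-device: low latency, no network
    return big(x)[0]          # escalate only the uncertain cases

print(cascade("status ok", tiny_model, large_model))
print(cascade("weird spike", tiny_model, large_model))
```

Because most routine inputs are handled by the small model, the device stays within its compute and battery budget while still getting large-model quality on the rare difficult cases.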
AI Built for the Edge
According to the 2023 Edge Advantage Report, which surveyed 600 enterprises across various industries, about 70% of organizations use edge solutions to address business challenges, but nearly 40% worry that their current infrastructure won’t support more advanced solutions.
Bloudoff acknowledged this concern, explaining that NTT is focused on deploying smaller, more efficient language models that can run in real time without requiring expensive hardware updates. By focusing on tiny models, businesses can maximize the edge computing power they already have in place.
NTT is moving forward with dozens of proofs of concept across industries such as manufacturing, automotive, healthcare, power generation, and logistics, demonstrating the potential of TinyAI models in real-world edge environments.
The Bottom Line
Supercomputers, quantum computers, and high-powered AI models housed in massive data centers are driving innovation at the highest levels. However, the real-world impact of AI happens at the edge.
From the smartphone in your pocket to the machines that automate factories and manage healthcare systems, edge networks connect us all. As larger, more powerful AI models emerge, companies like NTT and IBM are betting on smaller, more efficient models that will bring AI to the edge—where it truly matters.
By shrinking these AI giants, the future of edge computing is not only more sustainable but also more accessible for businesses worldwide.