Federated Learning Demands More Programmable Hardware

Artificial intelligence has been a disruptive technology; federated learning is an approach that might end up disrupting AI itself. At the École Polytechnique Fédérale de Lausanne (EPFL), David Atienza, professor and director of the Embedded Systems Laboratory, is spearheading an initiative to develop the technique as an alternative to traditional centralized machine learning (ML).

The key difference is that federated learning trains an ML model across multiple decentralized edge devices or servers that hold local data samples, without exchanging those samples. That eases data-privacy concerns and avoids the energy cost of moving data to the cloud or a central repository.
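
To make that concrete, here is a minimal sketch of the device-side step, assuming a simple linear model trained with gradient descent; the function name and setup are illustrative rather than drawn from any particular federated learning framework.

```python
import numpy as np

def local_update(weights, X_local, y_local, lr=0.1):
    """One gradient-descent step on a device's private data.

    Only the updated weights are sent back to the server; the raw
    samples X_local and y_local never leave the device.
    """
    residual = X_local @ weights - y_local
    grad = X_local.T @ residual / len(y_local)
    return weights - lr * grad
```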

Federated learning is emerging as one of the technological responses to the global coronavirus pandemic. According to the World Health Organization (WHO), patient analysis and testing show that almost two-thirds of patients present with a dry cough. Doctors can detect the symptom when patients enter the emergency ward, an observation reported in nearly 90 percent of positive cases.

Atienza’s team at EPFL in Lausanne, Switzerland, is experimenting with an AI-enabled system that combines data from multiple devices used to observe coughing and significantly improves the accuracy of the initial diagnosis. Engineers can correlate the disparate data acquired from coughing samples and eventually create a more sophisticated model.

Atienza told EE Times that the project is based on the concept of federated learning, which lets designers take a set of sensors and task each with observing particular characteristics. Although no single sensor can fully observe its assigned subject, each can contribute a simplified model that is combined with similar models from other sensors to produce a full-fledged AI model.

A New AI Chapter
The federated learning approach, introduced by Google in 2017, provides a viable alternative to traditional machine learning performed in centralized data centers with abundant compute power. That, in turn, opens a path for startups and small to mid-sized companies that can’t afford large data centers to build ML models.

Moreover, unlike machine learning models trained in large data centers, where data privacy is a major concern, the new framework keeps governance of data local, preserving the privacy of information on the devices that hold it.

Elaborating on the concept, Atienza said that when people think of AI, they think of a central system that collects data for training AI models. Federated learning, by contrast, allows Internet of Things (IoT) devices to be far more adaptive, learning from small datasets located at multiple sites.

Federated learning at work in a healthcare environment (Image: Carnegie)

Atienza added that when distributed sensors capture data, they use it to train models, not in a standalone mode but through a common central point that serves as a hub for exchanging model updates. Learning takes place inside the sensors, while the central repository or server helps them synchronize what they observe and create a common model.
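
As an illustration of that synchronization loop, the sketch below simulates five “sensors,” each training on its own private samples, and a central server that averages the returned weights into a common model. This is the federated averaging (FedAvg) idea; the toy linear-regression task and all names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Each "sensor" holds its own private samples; the server never sees them.
devices = []
for _ in range(5):
    X = rng.normal(size=(40, 2))
    y = X @ true_w + 0.1 * rng.normal(size=40)
    devices.append((X, y))

def local_update(w, X, y, lr=0.1, steps=10):
    """Plain gradient descent on one device's local data."""
    for _ in range(steps):
        w = w - lr * (X.T @ (X @ w - y)) / len(y)
    return w

global_w = np.zeros(2)
for _ in range(20):
    # The server broadcasts the common model; each device trains locally.
    local_ws = [local_update(global_w.copy(), X, y) for X, y in devices]
    # Synchronization: the server averages the returned weights.
    global_w = np.mean(local_ws, axis=0)

print("common model weights:", global_w)  # approaches true_w
```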

“Conversely, in fully distributed learning, there is no repository that federates devices and synchronizes them,” Atienza said. “So devices like sensors cannot talk to each other and coordinate without a central unit.”

And while such fully distributed systems are considerably more complicated, federated learning can already solve many problems and deliver good results. In clinical environments, for instance, hospitals can collaborate on the development of AI models without directly sharing sensitive patient data.

AI disrupter in the making
Federated learning opens a brand-new computing paradigm for AI, one in which engineers can train neural networks locally and find the right data for effective training. That, according to Atienza, will provide a big boost to machine learning applications such as natural language processing and document processing.

“A greater focus of AI initiatives is now toward the tasks that humans aren’t doing well,” he said. Here, designers can take the available data and cluster it into datasets that serve particular algorithms. After classifying the data, they can create better, personalized machine learning models.

David Atienza

The decentralized training approach of federated learning has the potential to disrupt current AI designs built around powerful data centers, which not only raise privacy concerns but also consume a lot of power and add latency when everything is sent to the cloud.

But while federated learning promises to take AI beyond large data centers and cloud computing infrastructure, what about the hardware that will serve this new machine learning technology? Atienza says hardware for federated learning must be more versatile than the AI accelerators currently built for specific machine learning applications.

Programming mini accelerators
Atienza specifically mentioned coarse-grained reconfigurable array (CGRA) solutions, which act like a collection of mini-accelerators and can be dynamically configured for a particular use case. “The fact that CGRA solutions retain the basic architecture means that designers can use the same chip for different AI applications while engineering the chip’s programmable pieces.”

A CGRA solution serving a federated learning application (Image: EPFL)

Like an FPGA, a CGRA is structured as a two-dimensional mesh of tightly interconnected reconfigurable cells (RCs), and the mesh promises a high degree of flexibility across a wide variety of computational kernels. Unlike FPGAs, however, whose bit-level reconfigurable arrays bring a considerable power overhead, the functionality of CGRA cells is defined at the operation level.
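
The toy model below illustrates what operation-level configurability means: each cell in a small mesh is programmed with a whole arithmetic operation rather than a bit-level lookup table, and the same mesh can be reprogrammed for a different kernel. The cell grid, operation set, and mapping are hypothetical simplifications, not EPFL’s actual CGRA design.

```python
# Each reconfigurable cell (RC) is programmed with a coarse-grained
# operation, in contrast to an FPGA's bit-level lookup tables.
OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}

class ReconfigurableCell:
    def __init__(self):
        self.op = None

    def configure(self, op_name):
        """Reprogram the cell by selecting one operation from OPS."""
        self.op = OPS[op_name]

    def fire(self, *operands):
        return self.op(*operands)

class CGRAMesh:
    """A two-dimensional mesh of tightly interconnected RCs."""
    def __init__(self, rows, cols):
        self.grid = [[ReconfigurableCell() for _ in range(cols)]
                     for _ in range(rows)]

    def configure(self, op_map):
        # op_map[r][c] names the operation loaded into cell (r, c).
        for r, row in enumerate(op_map):
            for c, op_name in enumerate(row):
                self.grid[r][c].configure(op_name)

# Configure a 2x2 mesh as a dot-product stage: two multipliers feed an adder.
mesh = CGRAMesh(2, 2)
mesh.configure([["mul", "mul"], ["add", "add"]])
products = (mesh.grid[0][0].fire(2, 3), mesh.grid[0][1].fire(4, 5))
print(mesh.grid[1][0].fire(*products))  # 2*3 + 4*5 = 26
```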

Atienza expects more and more configurability in hardware serving new technologies like federated learning, so that designers can perform different types of AI tasks by reprogramming chips as trends evolve. “Applications like federated learning are likely to drive the need for AI hardware that can be reprogrammed using a large number of mini-accelerators,” he concluded.


— Majeed Ahmad, Executive Editor at Electronic Design News (EDN), has covered the electronics design industry for more than two decades. He holds a Master’s degree in telecommunication engineering from Eindhoven University of Technology. He has worked in various editorial positions, including assignments for EE Times Asia and Electronic Products.