New hardware architecture provides advantage in AI computation


As AI applications become more widespread, more computations need to be performed – more efficiently and with reduced power consumption – on local devices rather than in geographically distant data centers, in order to avoid frustrating response lags. A group of engineers from the University of Tokyo has tested, for the first time, the use of ferroelectric materials based on hafnium oxide for physical reservoir computing – a type of neural network that maps data onto physical systems and can achieve precisely such an advance – in a voice recognition application.

They described their findings in a paper presented at the 2022 IEEE Symposium on VLSI Technology and Circuits, held in Honolulu, Hawaii, June 12-17.

The development of artificial intelligence (AI) technology and its myriad applications has exploded in recent years, but a major obstacle to its further deployment is the colossal cost of computation and energy consumption, particularly when those computations are performed by software running in data centers at a considerable distance from the user.

Even with data traveling over networks at the speed of light, there can be delays of a fraction of a second or more between a user’s request and the delivery of an application’s response. This is due to the large distances involved, as the signals travel thousands of miles from the user to the data center, sometimes half a world away, and then back again. For consumer applications, from video games to voice assistants, this small delay can be frustrating, but for mission-critical applications, from healthcare to defense, such delays – known as latency – can cost lives.

Computer scientists and engineers are focusing on two lines of attack to overcome this challenge: transferring at least some of the required computation from software to hardware, and from centralized data centers, or the cloud, to local devices.

The first strategy is necessary because it makes no sense to attempt efficiencies only in the programs you run and not also in the machines they run on. The second strategy, known as edge computing, reduces latency because there is simply less distance for data to travel. When your smartphone performs the calculations involved in a biometric verification (and not the data center some distance away), this is an example of how edge computing disperses the calculations from the cloud to the device.

Lately, physical reservoir computing (PRC) – in which efficiencies are achieved in local device hardware – has attracted a lot of attention from engineering researchers for its ability to advance both of these lines of attack. PRC grew out of the development of recurrent neural networks (RNNs), a type of machine learning well suited to processing data over time (temporal data) rather than static data. RNNs take into account information from previous inputs when considering the current input (hence “recurrent”), and produce their output accordingly. Because of this ability to process temporal data, RNNs are suitable for applications whose conclusions (or inferences) are sensitive to data sequence or temporal context, such as speech recognition, natural language processing, or language translation, and are used by applications such as Google Translate or Siri.
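The recurrent update described above can be sketched in a few lines of NumPy. This is a generic illustration of recurrence, not the researchers' hardware or any production RNN; the dimensions and weight scales are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): 3 input features, 8 hidden units.
n_in, n_hidden = 3, 8
W_in = rng.normal(scale=0.5, size=(n_hidden, n_in))      # input weights
W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))  # recurrent weights

def rnn_step(h_prev, x):
    """One recurrent update: the new hidden state mixes the current input
    with the previous state - this feedback is what makes it 'recurrent'."""
    return np.tanh(W_in @ x + W_rec @ h_prev)

# Feed a short input sequence; the final state depends on the whole history,
# which is why RNNs suit sequence-sensitive tasks like speech recognition.
h = np.zeros(n_hidden)
for x in rng.normal(size=(5, n_in)):
    h = rnn_step(h, x)
```

Because `h` is folded back in at every step, changing any earlier input changes the final state, which is the temporal-context property the article describes.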

In physical reservoir computing, input data are mapped to patterns in a physical system, or reservoir (such as patterns in the structure of a magnetic material, a photonic system, or a mechanical device), which has a higher-dimensional space than the input. (A piece of paper is a space with one dimension more than a piece of string, and a box has one dimension more still.) A pattern analysis is then performed on the spatio-temporal patterns at the final readout layer to determine the state of the reservoir. Since the AI is not trained on the recurrent connections within the reservoir, but only on the readout, simpler learning algorithms are feasible, greatly reducing the computation required, enabling high-speed learning and reducing energy consumption.
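The key idea – a fixed, untrained reservoir plus a cheap, trainable linear readout – can be illustrated with a software "echo state" analogue of a physical reservoir. This is a minimal sketch under assumed toy parameters (reservoir size, spectral radius, a delayed-recall task), not the FeFET system from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy echo-state-style reservoir: a software stand-in for a physical one.
n_in, n_res, T = 1, 50, 300
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W_res = rng.normal(size=(n_res, n_res))
# Scale the fixed recurrent weights so state trajectories stay stable.
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))

u = rng.uniform(-1, 1, size=(T, n_in))  # input signal
target = np.roll(u[:, 0], 2)            # task: recall the input from 2 steps ago

# Drive the fixed (never trained) reservoir and record its high-dimensional states.
states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W_in @ u[t] + W_res @ x)
    states[t] = x

# Train ONLY the linear readout - a single least-squares solve, which is why
# reservoir computing needs so much less computation than full RNN training.
warm = 50  # discard initial transient
W_out, *_ = np.linalg.lstsq(states[warm:], target[warm:], rcond=None)
pred = states @ W_out
err = np.sqrt(np.mean((pred[warm:] - target[warm:]) ** 2))
```

The reservoir's internal dynamics stay untouched; all learning is the one `lstsq` call on the readout, mirroring the article's point that only the readout is trained.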

Engineers at the University of Tokyo had previously designed a new PRC architecture that uses ferroelectric-gate field-effect transistors (FeFETs) made of ferroelectric materials based on hafnium oxide. Most people are familiar with ferromagnetism, in which an iron magnet is permanently magnetized in a particular polar direction (one end of the magnet becomes its “north” and the other its “south”). Ferroelectricity is an analogous phenomenon in which certain materials – in this case hafnium oxide and zirconium oxide – exhibit an electric polarization (a separation of positive and negative electric charge) that can be reversed by applying an external electric field. This switchable polarization allows the transistor to store information, like a memory element. By 2020, the researchers had also demonstrated that a basic reservoir computing operation was possible using these materials.

“These materials are already commonly used in semiconductor integrated circuit manufacturing processes,” said Shinichi Takagi, co-author of the paper and a professor in the Department of Electrical Engineering and Information Systems at the University of Tokyo. “This means that FeFET reservoirs should be able to be integrated into large-scale semiconductor integrated circuit fabrication with little difficulty, compared with newer materials.”

While ferroelectric materials based on hafnium oxide had received a lot of attention in the semiconductor industry due to their ferroelectricity, the types of applications for which FeFET-based physical reservoir computing is suited, and its performance in real applications, had not yet been studied.

After proving the feasibility of their PRC architecture two years ago, the researchers then tested it on a speech recognition application. They found it to be 95.9% accurate at recognizing spoken digits from zero to nine. This demonstrated for the first time the technology’s usability in a real application.

The researchers now want to see if they can increase the computational performance of their FeFET reservoirs, as well as test them in other applications.

Ultimately, the researchers hope to demonstrate that an AI chip with the hafnium oxide-based ferroelectric PRC architecture can achieve a high level of performance – extremely low power consumption and real-time processing – compared with conventional computational methods and AI hardware.
