How to implement a voice user interface on resource-constrained MCUs
across a wide range of consumer, industrial, and IoT applications. They are supported by easy-to-use design tools, making it relatively straightforward to build a simple VUI without extensive coding experience or in-house expertise.

The choice of a particular RA family MCU primarily comes down to the complexity of commands and the Cyberon library’s size. A smart light switch, which requires a modest command set and limited computing power to operate effectively, could be based on the R7FA4W1AD2CNG from the RA4 family. This MCU has a battery-friendly 48-megahertz (MHz) Arm Cortex-M4 core supported by 512 Kbytes of flash memory and 96 Kbytes of SRAM. It features a segment LCD controller, a capacitive touch sensing unit, Bluetooth Low Energy (Bluetooth LE) wireless connectivity, USB 2.0 Full-Speed, a 14-bit analog-to-digital converter (ADC), a 12-bit digital-to-analog converter (DAC), plus security and safety features (Figure 5).

A more extensive Cyberon DSpotter library and a more powerful core are needed for an application such as a smart speaker. A suitable candidate is the R7FA6M4AF3CFM. This MCU from the RA6 family
features the more powerful 200 MHz Arm Cortex-M33 core supported by 1 megabyte (Mbyte) of flash memory and 256 Kbytes of SRAM. It has a CAN bus, Ethernet, I²C, LIN bus, a capacitive touch sensing unit, and many other interfaces and peripherals.

The RA4 and RA6 families are supported by evaluation boards, the RTK7EKA4W1S00000BJ and the RTK7EKA6M4S00001BE, respectively, to allow a developer to exercise the MCUs’ capabilities. Each evaluation board has the target MCU and an onboard debugger.

Renesas also offers a VUI solution kit to accelerate development. The kit is similar to the evaluation boards in that it incorporates the target device and debuggers. The board also features several I/O interfaces and has four microphones: two analog and two digital.

The software needed for development with the VUI solution kit is available on Cyberon’s website. It includes complimentary access to the Cyberon DSpotter Modeling Tool and an e2 studio project with a working voice CommandSet (e2 studio is an Eclipse-based integrated
development environment (IDE) for Renesas MCUs). The example CommandSet can be used as a template for developing custom voice command sequences. The system’s reactions can then be monitored using a terminal window. It generally takes about 15 minutes to create the VUI structure shown in Figure 4.

More sophisticated application software design for the Cyberon package is supported by the Renesas Flexible Software Package (FSP) for embedded system designs using the RA families. The FSP is based on an open software ecosystem and supports Azure RTOS or FreeRTOS, legacy code, and third-party ecosystems. It can be used with several IDEs, including e2 studio.
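The selection reasoning in this section (command complexity and library size weighed against available flash and SRAM) can be illustrated with a rough memory-budget check. This is only a sketch: the flash and SRAM figures are the ones quoted above for the two MCUs, while the footprint numbers in the usage example, the 25% headroom, and the names `Mcu`, `CANDIDATES`, and `smallest_fit` are assumptions made for illustration. Real footprints depend on the DSpotter model and application, and core speed is not considered here.

```python
from dataclasses import dataclass
from typing import Optional

KIB = 1024


@dataclass(frozen=True)
class Mcu:
    part: str
    flash_bytes: int
    sram_bytes: int


# Memory sizes as quoted in the article, ordered smallest first.
CANDIDATES = [
    Mcu("R7FA4W1AD2CNG", 512 * KIB, 96 * KIB),    # RA4 family, Cortex-M4
    Mcu("R7FA6M4AF3CFM", 1024 * KIB, 256 * KIB),  # RA6 family, Cortex-M33
]


def smallest_fit(flash_needed: int, sram_needed: int,
                 headroom: float = 0.25) -> Optional[Mcu]:
    """Return the first (smallest) candidate whose flash and SRAM cover the
    requested footprint plus a safety headroom, or None if nothing fits.

    The headroom fraction is an assumed design margin, not a vendor figure.
    """
    scale = 1.0 + headroom
    for mcu in CANDIDATES:
        if (flash_needed * scale <= mcu.flash_bytes
                and sram_needed * scale <= mcu.sram_bytes):
            return mcu
    return None


if __name__ == "__main__":
    # Hypothetical footprints (application plus voice library), in bytes.
    for flash, sram in [(300 * KIB, 60 * KIB), (600 * KIB, 120 * KIB)]:
        choice = smallest_fit(flash, sram)
        print(flash // KIB, "KiB flash,", sram // KIB, "KiB SRAM ->",
              choice.part if choice else "no fit")
```

With these assumed footprints, the smaller build lands on the RA4 part and the larger one on the RA6 part, which mirrors the light switch versus smart speaker split described above.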
How well does the VUI perform?

It is one thing for a VUI to perform well in a quiet laboratory, but quite another for it to work accurately with significant background noise. A typical operating environment for a smart speaker could include a TV or radio, conversation, other music sources, and the general hubbub of a household or a social gathering. Moreover, the VUI will have to contend with dialects and less-than-perfect diction. Despite these challenges, users expect almost flawless performance.

SNR      Background Noise  Distance  Hit-rate  Alexa Requirements
(Clean)  none              1.5 m     100.00%   90%
(Clean)  none              3 m       100.00%   90%
10 dB    Babble            1.5 m     98.55%    80%
10 dB    Babble            3 m       98.84%    80%
10 dB    Music             1.5 m     98.26%    80%
10 dB    Music             3 m       98.55%    80%
10 dB    TV                1.5 m     98.84%    80%
10 dB    TV                3 m       98.55%    80%
5 dB     Babble            1.5 m     98.84%    80%
5 dB     Babble            3 m       96.24%    80%
5 dB     Music             1.5 m     98.84%    80%
5 dB     Music             3 m       97.08%    80%
5 dB     TV                1.5 m     93.37%    80%
5 dB     TV                3 m       90.72%    80%

Table 1: Command success test results for a Cyberon-powered VUI with various sources of background noise. In all cases, the VUI outperformed the Amazon Alexa benchmark. Image source: Renesas
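To make the comparison in Table 1 easier to check, the short script below computes the margin between each measured hit-rate and the corresponding Alexa requirement. The rows are transcribed from the table; the names `TABLE_1`, `margins`, and `lowest_hit_rate` are illustrative and not part of any Renesas or Cyberon tool.

```python
# Rows transcribed from Table 1:
# (SNR, background noise, distance in m, hit-rate %, Alexa requirement %)
TABLE_1 = [
    ("clean", "none",   1.5, 100.00, 90.0),
    ("clean", "none",   3.0, 100.00, 90.0),
    ("10 dB", "babble", 1.5, 98.55, 80.0),
    ("10 dB", "babble", 3.0, 98.84, 80.0),
    ("10 dB", "music",  1.5, 98.26, 80.0),
    ("10 dB", "music",  3.0, 98.55, 80.0),
    ("10 dB", "TV",     1.5, 98.84, 80.0),
    ("10 dB", "TV",     3.0, 98.55, 80.0),
    ("5 dB",  "babble", 1.5, 98.84, 80.0),
    ("5 dB",  "babble", 3.0, 96.24, 80.0),
    ("5 dB",  "music",  1.5, 98.84, 80.0),
    ("5 dB",  "music",  3.0, 97.08, 80.0),
    ("5 dB",  "TV",     1.5, 93.37, 80.0),
    ("5 dB",  "TV",     3.0, 90.72, 80.0),
]


def margins(rows):
    """Return (snr, noise, distance, margin) with margin in percentage points."""
    return [(snr, noise, dist, round(hit - req, 2))
            for snr, noise, dist, hit, req in rows]


def lowest_hit_rate(rows):
    """Return the row with the lowest measured hit-rate."""
    return min(rows, key=lambda row: row[3])


if __name__ == "__main__":
    for snr, noise, dist, margin in margins(TABLE_1):
        print(f"{snr:6} {noise:7} {dist:>4} m  margin {margin:+.2f} points")
    print("Hardest condition:", lowest_hit_rate(TABLE_1)[:3])
```

Every margin is positive, and the most demanding condition in the table is TV noise at 5 dB SNR and 3 m, where the hit-rate still exceeds the requirement by more than ten percentage points.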
To improve performance in a difficult listening environment, Cyberon DSpotter software running on the Renesas RA family of MCUs includes noise immunity features that require minimal processor resources. To demonstrate its efficacy, tests were done with a Cyberon DSpotter VUI listening to commands while subject to various background noise sources at 1.5- and 3-meter (m) distances, and with signal-to-noise ratios (SNRs) of 0, 5, and 10 decibels (dB). In all cases, the VUI outperformed the Amazon Alexa benchmark (Table 1).

Conclusion

VUIs are rapidly becoming the preferred consumer control interface for smart products. A speech control approach using phonemes as the basis of commands and a strict command structure can dramatically reduce memory and computing requirements, allowing the technology to run locally on small, resource-constrained MCUs.
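The benefit of a strict command structure can be illustrated with a small two-stage flow: the system listens only for a wake phrase until it hears one, and only then for a short list of commands. The sketch below is a simplified model under stated assumptions. It is not the Cyberon DSpotter API; the phrases, the action codes, and the `CommandFlow` class are hypothetical, and the recognizer is assumed to have already turned audio into a phrase string.

```python
# Hypothetical vocabulary; a real CommandSet is built with the vendor's tools.
WAKE_PHRASES = {"hello device"}
COMMANDS = {
    "light on": "LIGHT_ON",
    "light off": "LIGHT_OFF",
    "dim the light": "LIGHT_DIM",
}


class CommandFlow:
    """Two-stage flow: wake phrase first, then exactly one command attempt."""

    def __init__(self):
        self.awake = False

    def feed(self, phrase):
        """Process one recognized phrase and return an action code or None."""
        phrase = phrase.strip().lower()
        if not self.awake:
            # Stage 1: only the wake phrases are active, so the search space
            # is tiny while the device idles.
            if phrase in WAKE_PHRASES:
                self.awake = True
                return "WAKE"
            return None
        # Stage 2: only the command list is active. Return to idle after one
        # attempt, whether or not it matched (a real system might use a timeout).
        self.awake = False
        return COMMANDS.get(phrase)


if __name__ == "__main__":
    flow = CommandFlow()
    for heard in ["light on", "Hello device", "Light on", "light off"]:
        print(repr(heard), "->", flow.feed(heard))
```

Because only one small group of phrases is active at any time, the matcher never has to consider the full vocabulary at once, which is the kind of constraint that keeps memory and compute needs low.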