Unlock the potential of Edge AI and Machine Learning: learn to use FPGA SoCs for secure real-time systems, explore automation and blockchain in electronics manufacturing, implement cloud-free face recognition, and leverage Efinix FPGAs for AI/ML imaging.
We get technical
Edge AI and ML | Volume 10
How to use FPGA SoCs for secure and connected hard real-time systems
How automation, machine learning, and Blockchain are driving the future of electronics manufacturing
Quickly implement spoofing-resistant face recognition without a Cloud connection
Why and how to use Efinix FPGAs for AI/ML imaging
Editor’s note

Edge AI and machine learning have transformed the way systems and devices process data, enabling real-time decision-making at the source rather than relying on remote servers. As engineers navigate the rapid evolution of technology, understanding and applying these concepts have become crucial for developing intelligent, autonomous systems across a range of industries, from manufacturing to healthcare.

Edge AI refers to the deployment of artificial intelligence algorithms on Edge devices, which are located at the periphery of a network, close to the data source. These devices are equipped with enough processing power to handle AI computations locally. This eliminates the need to send large volumes of data to a centralized Cloud for analysis, drastically reducing latency, bandwidth usage, and energy consumption. In many industrial applications, where immediate responses are critical, these advantages make Edge AI an attractive option.

Machine learning is the driving force behind the AI revolution, empowering systems to learn from data and improve their performance over time without explicit programming. When combined with Edge computing, machine learning enables the creation of systems that can operate autonomously, make decisions in real time, and adapt to changing conditions on the fly. This combination is already being applied in predictive maintenance, robotics, autonomous vehicles, and industrial automation.

As the demand for intelligent, connected devices continues to rise, engineers must stay ahead of the curve by embracing Edge AI and machine learning. These technologies are no longer confined to large-scale enterprises with vast resources but are increasingly accessible to a wider range of industries and applications. By understanding how to implement and optimise Edge AI and machine learning, engineers can create more responsive, efficient, and secure systems that will drive the next wave of innovation.
4 AI development potential with the Agilex 5 system on module
8 How to run a ‘Hello World’ machine learning model on STM32 microcontrollers
14 How to use FPGA SoCs for secure and connected hard real-time systems
20 Special feature: retroelectro – Programming a calculator to form concepts: the birth of artificial intelligence
28 How automation, machine learning, and Blockchain are driving the future of electronics manufacturing
34 Quickly implement spoofing-resistant face recognition without a Cloud connection
44 Why and how to use Efinix FPGAs for AI/ML imaging – Part 1: getting started
50 Why and how to use Efinix FPGAs for AI/ML imaging – Part 2: image capture and processing
56 Powering the Edge: the evolution of AI from digital to neuromorphic systems for ultra-low power performance
60 All about AI/machine learning
For more information, please check out our website at www.digikey.com/automation.
AI development potential with the Agilex 5 system on module

Written by: Tawfeeq Ahmad

Artificial Intelligence (AI) is revolutionizing various industries by providing transformative solutions that significantly enhance efficiency, accuracy, and the ability to make informed decisions. In this landscape, the concept of Edge AI – processing AI algorithms on devices located at the Edge of a network – has emerged as a game-changing approach. It allows for real-time data processing, reduced latency, improved data privacy, and autonomy in decision-making, which is especially critical in sectors like healthcare, robotics, and industrial automation. iWave, a pioneering force in embedded systems engineering, is at the forefront of this revolution, offering embedded platforms designed to push the boundaries of AI at the Edge. These platforms are specifically tailored for applications requiring high-performance computing and sophisticated AI/ML capabilities, such as media processing, robotics, and visual computing.

Figure 1. The iWave iW-RainboW-G58M SoM, powered by the Intel Agilex 5 FPGA, the first FPGA to feature directly integrated AI capabilities. Image source: iWave

Introducing iW-RainboW-G58M: the next generation of AI-infused FPGAs

In a significant advancement for the embedded systems market, iWave is thrilled to introduce the iW-RainboW-G58M System on Module (SoM) (Figure 1), powered by the Intel Agilex 5 FPGA. This is the first FPGA to feature AI capabilities integrated directly into its fabric, marking a new era in FPGA technology. The iW-RainboW-G58M is meticulously engineered for applications demanding high-performance, low-latency processing, and custom logic implementation with embedded AI/ML support, making it an ideal choice for industries such as medical imaging, robotics, and industrial automation.

The iW-RainboW-G58M SoM is compact, measuring just 60mm x 70mm, yet it is packed with powerful features. It supports the Intel Agilex 5 FPGA and SoC E-Series family in the B32A package, available in two distinct device variants to cater to a range of application needs:

■ Group A: A5E 065A/052A/043A/028A/013A SoC FPGA – These variants offer higher performance and are suitable for applications requiring more complex processing capabilities
■ Group B: A5E 065B/052B/043B/028B/013B/008B SoC FPGA – These variants provide cost-effective solutions for less demanding tasks, ensuring flexibility in design and implementation

The combination of these options allows developers to select the most appropriate FPGA variant for their specific application, balancing performance, power consumption, and cost.

Harnessing the full potential of Intel Agilex 5 FPGAs for Edge AI

Intel’s Agilex 5 FPGAs and SoCs represent a significant leap forward in FPGA technology, especially in the context of AI and machine learning applications at the Edge. The Agilex 5 series builds on Intel’s legacy of AI-optimized FPGAs, introducing the industry’s first AI tensor block in a mid-range FPGA. This block is specifically designed to accelerate AI workloads, making these FPGAs a perfect fit for Edge AI applications where real-time processing and decision-making are critical.

A key feature of the Agilex 5 FPGA is its asymmetric applications processor system, which includes dual Arm Cortex-A76 cores and dual Cortex-A55 cores. This configuration allows the FPGA to deliver exceptional processing power while optimizing power efficiency, a crucial factor in Edge computing environments where power consumption must be minimized without compromising performance.

The Agilex 5 FPGA also includes enhanced Digital Signal Processing (DSP) capabilities, integrated with an AI tensor block. This combination allows the FPGA to handle complex AI tasks such as deep learning inference, image processing, and predictive analytics with greater efficiency and accuracy. Moreover, the FPGA’s advanced connectivity features, including high-speed GTS transceivers that support data rates up to 28.1 Gbps, PCI Express (PCIe) 4.0 x8, and outputs for DisplayPort and HDMI, make it a versatile solution for a wide range of applications.

Comprehensive AI/ML software ecosystem: accelerating development

The iW-RainboW-G58M SoM is complemented by a comprehensive software ecosystem that significantly accelerates AI and machine learning development. Central to this ecosystem is the support for popular AI frameworks such as TensorFlow and PyTorch, ensuring that developers can leverage these familiar platforms to create sophisticated AI models without steep learning curves.

A critical component of this ecosystem is the OpenVINO toolkit. This open-source toolkit is designed to optimize deep learning models for inference on a variety of hardware architectures, including CPUs, GPUs, and FPGAs. By using the OpenVINO toolkit, developers can ensure that their AI models are not only optimized for performance but are also highly portable across different hardware platforms, allowing for greater flexibility in deployment.

Additionally, the Intel FPGA AI Suite plays a pivotal role in simplifying the development process. This suite is designed with ease of use in mind, enabling FPGA designers, machine learning engineers, and software developers to create AI platforms that are optimized for FPGA architectures. By integrating with industry-standard tools such as TensorFlow, PyTorch, and the OpenVINO toolkit, the Intel FPGA AI Suite allows developers to speed up the development process while maintaining a high degree of reliability and performance in their AI solutions. The suite also integrates seamlessly with the Intel Quartus Prime FPGA design software, a powerful tool that supports the design, analysis, and optimization of FPGA-based systems. This integration ensures that developers have access to a robust and proven workflow, reducing time to market and enhancing the overall reliability of their AI applications.

Cloud AI vs. Edge AI: a comparative analysis

As AI continues to evolve, the distinction between Cloud AI and Edge AI becomes increasingly important. Cloud AI, which relies on the vast computational resources of remote data centers, offers high scalability and the ability to process large volumes of data. However, this approach often comes with higher latency and potential security concerns due to the need for data transmission over the internet. On the other hand, Edge AI offers significant advantages in scenarios where real-time processing, low latency, and enhanced data privacy are critical. By processing data locally on the device, Edge AI eliminates the need for constant communication with the cloud, reducing latency and improving the responsiveness of AI systems.

This is particularly important in applications such as autonomous vehicles, industrial automation, and healthcare, where delays in decision-making can have serious consequences. Moreover, Edge AI contributes to data privacy by keeping sensitive information on the local device, reducing the risk of data breaches associated with cloud-based processing. The hybrid approach, where Edge devices perform initial data processing before transmitting it to the Cloud for more complex analysis, is becoming increasingly popular. This method combines the strengths of both Edge AI and Cloud AI, allowing for efficient resource utilization, enhanced security, and improved system performance.

Ensuring longevity and comprehensive support: iWave’s commitment to customers

One of iWave’s key commitments is to ensure the long-term availability of its products. The company’s product longevity program guarantees that its System on Modules (SoMs) are available for extended periods, often exceeding 10 years. This is especially important for industries like medical devices, aerospace, and industrial automation, where product lifecycles are typically long and consistent component availability is critical.

In addition to longevity, iWave provides extensive technical support throughout the product development process. This support includes ODM (Original Design Manufacturer) services, such as carrier card design, thermal simulation, and system-level design, allowing customers to focus on their core competencies while iWave handles the complex aspects of hardware design and integration. iWave’s commitment to customer success is further demonstrated by the provision of comprehensive evaluation kits for its SoMs. These kits come with complete user documentation, software drivers, and a board support package, enabling customers to rapidly evaluate and prototype their designs. By offering these resources, iWave helps customers reduce development time and bring their products to market faster.

Summary

iWave’s iW-RainboW-G58M SoM, with the Intel Agilex 5 FPGA that features integrated AI capabilities, is carefully engineered for high-performance, low-latency processing and custom logic implementation with embedded AI/ML support. This makes it a good choice for industries such as medical imaging, robotics, and industrial automation.
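The hybrid split described in the Cloud AI vs. Edge AI comparison (summarize data locally, escalate only what needs heavier analysis) can be sketched in a few lines. The function names, the summary statistics, and the anomaly threshold below are all illustrative assumptions, not any vendor's API:

```python
# Hedged sketch of a hybrid Edge/Cloud split: the edge device summarizes
# raw sensor windows locally and only escalates anomalous ones, cutting
# the data volume sent upstream. Names and thresholds are illustrative.

def summarize(window):
    # Reduce a window of raw readings to a compact summary
    mean = sum(window) / len(window)
    peak_dev = max(abs(x - mean) for x in window)
    return {"mean": mean, "peak_dev": peak_dev}

def edge_filter(windows, threshold=3.0):
    to_cloud = []   # only anomalous windows leave the device
    local_log = []  # everything else is handled locally
    for w in windows:
        s = summarize(w)
        (to_cloud if s["peak_dev"] > threshold else local_log).append(s)
    return to_cloud, local_log

# Five normal windows plus one with a spike: only the spike is escalated
normal = [[10.0, 10.1, 9.9, 10.0]] * 5
spike = [[10.0, 25.0, 10.1, 9.9]]
to_cloud, local_log = edge_filter(normal + spike)
```

The point of the sketch is the bandwidth asymmetry: six windows of raw data come in, but only one compact summary leaves the device.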
How to run a ‘Hello World’ machine learning model on STM32 microcontrollers
Machine learning (ML) has been all the rage in server and mobile applications for years, but it has now migrated and become critical on Edge devices. Given that Edge devices need to be energy efficient, developers need to learn and understand how to deploy ML models to microcontroller-based systems. ML models running on a microcontroller are often referred to as tinyML. Unfortunately, deploying a model to a microcontroller is not a trivial endeavor. Still, it is getting easier, and developers without any specialized training will find that they can do so in a timely manner.
Written by: Jacob Beningo

This article explores how embedded developers can get started with ML using STMicroelectronics’ STM32 microcontrollers. To do so, it shows how to create a ‘Hello World’ application by converting a TensorFlow Lite for Microcontrollers model for use in STM32CubeIDE using X-CUBE-AI.

Introduction to tinyML use cases

TinyML is a growing field that brings the power of ML to resource and power-constrained devices like microcontrollers, usually using deep neural networks. These microcontroller devices can then run the ML model and perform valuable work at the edge. There are several use cases where tinyML is now quite interesting.

The first use case, which is seen in many mobile devices and home automation equipment, is keyword spotting. Keyword spotting allows the embedded device to use a microphone to capture speech and detect pretrained keywords. The tinyML model uses a time-series input that represents the speech and converts it to speech features, usually a spectrogram. The spectrogram contains frequency information over time. The spectrogram is then fed into a neural network trained to detect specific words, and the result is a probability that a particular word is detected. Figure 1 shows an example of what this process looks like.

Figure 1. Keyword spotting is an interesting use case for tinyML. The input speech is converted to a spectrogram and then fed into a trained neural network to determine if a pretrained word is present. Image source: Arm

The next use case for tinyML that many embedded developers are interested in is image recognition. The microcontroller captures images from a camera, which are then fed into a pre-trained model. The model can discern what is in the image. For example, one might be able to determine if there is a cat, a dog, a fish, and so forth. A great example of how image recognition is used at the edge is in video doorbells. The video doorbell can often detect if a human is present at the door or whether a package has been left.

One last use case with high popularity is using tinyML for predictive maintenance. Predictive maintenance uses ML to predict equipment states based on abnormality detection, classification algorithms, and predictive models. Again, plenty of applications are available, ranging from HVAC systems to factory floor equipment. While the above three use cases are currently popular for tinyML, there are undoubtedly many potential use cases that developers can find.
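The spectrogram front end used for keyword spotting can be sketched in a few lines. This is a naive short-time DFT for illustration only (real keyword-spotting pipelines use windowed FFTs and mel-scaled filter banks); the function name and frame sizes are our own choices, not part of any tinyML framework:

```python
import cmath
import math

def spectrogram(samples, frame_len=64, hop=32):
    """Naive short-time DFT magnitude spectrogram (illustration only)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # DFT magnitudes for the positive-frequency bins of this frame
        mags = []
        for k in range(frame_len // 2):
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(frame))
            mags.append(abs(s))
        frames.append(mags)
    return frames

# 8 kHz sample rate, 1 kHz tone: energy should land in bin
# k = 1000 * frame_len / fs = 1000 * 64 / 8000 = 8
fs = 8000.0
tone = [math.sin(2 * math.pi * 1000.0 * n / fs) for n in range(256)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time slice of frequency content; stacking the rows gives the time-frequency image that the neural network consumes.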
Here’s a quick list:
■ Gesture classification
■ Anomaly detection
■ Analog meter reader
■ Guidance and control (GNC)
■ Package detection

No matter the use case, the best way to start getting familiar with tinyML is with a ‘Hello World’ application, which helps developers learn and understand the basic process they will follow to get a minimal system up and running. There are five necessary steps to run a tinyML model on an STM32 microcontroller:
1. Capture data
2. Label data
3. Train the neural network
4. Convert the model
5. Run the model on the microcontroller

Capturing, labelling, and training a ‘Hello World’ model

Developers generally have many options available for how they will capture and label the data needed to train their model. First, there are a lot of online training databases. Developers can search for data that someone has collected and labelled. For example, for basic image detection, there’s CIFAR-10 or ImageNet. To train a model to detect smiles in photos, there’s an image collection for that too. Online data repositories are clearly a great place to start. If the required data hasn’t already been made publicly available on the Internet, then another option is for developers to generate their own data. Matlab or some other tool can be used to generate the datasets. If automatic data generation is not an option, it can be done manually. Finally, if this all seems too time-consuming, there are some datasets available for purchase, also on the Internet. Collecting the data is often the most exciting and interesting option, but it is also the most work.

The ‘Hello World’ example being explored here shows how to train a model to generate a sine wave and deploy it to an STM32. The example was put together by Pete Warden and Daniel Situnayake as part of their work at Google on TensorFlow Lite for Microcontrollers. This makes the job easier because they have put together a simple, public tutorial on capturing, labelling, and training the model. It can be found on GitHub; once there, developers should click the ‘Run in Google Colab’ button. Google Colab, short for Google Colaboratory, allows developers to write and execute Python in their browser with zero configuration and provides free access to Google GPUs.

The output from walking through the training example will include two different model files: a model.tflite TensorFlow model that is quantized for microcontrollers, and a model_no_quant.tflite model that is not quantized. The quantization indicates how the model activations and bias are stored numerically. The quantized version produces a smaller model that is more suited to a microcontroller. For those curious readers, the trained model results versus actual sine wave results can be seen in Figure 2. The output of the model is in red. The sine wave output isn’t perfect, but it works well enough for a ‘Hello World’ program.

Figure 2. A comparison between TensorFlow model predictions for a sine wave versus the actual values. Image source: Beningo Embedded Group

Selecting a development board

Before looking at how to convert the TensorFlow model to run on a microcontroller, a microcontroller needs to be selected for deployment of the model. This article will focus on STM32 microcontrollers because STMicroelectronics has many tinyML/ML tools that work well for converting and running models. In addition, STMicroelectronics has a wide variety of parts compatible with their ML tools (Figure 3).

Figure 3. Shown are the microcontrollers and the microprocessor unit (MPU) currently supported by the STMicroelectronics AI ecosystem. Image source: STMicroelectronics

If one of these boards is lying around the office, it’s perfect for getting the ‘Hello World’ application up and running. However, for those interested in going beyond this example and getting into gesture control or keyword spotting, opt for the STM32 B-L4S5I-IOT01A Discovery IoT Node (Figure 4).

Figure 4. The STM32 B-L4S5I-IOT01A Discovery IoT Node is an adaptable experimentation platform for tinyML due to its onboard Arm Cortex-M4 processor, MEMS microphone, and three-axis accelerometer. Image source: STMicroelectronics

This board has an Arm Cortex-M4 processor based on the STM32L4+ series. The processor has 2 megabytes (Mbytes) of flash memory and 640 kilobytes (Kbytes) of RAM, providing plenty of space for tinyML models. The module is adaptable for tinyML use case experiments because it also has STMicroelectronics’ MP34DT01 microelectromechanical systems (MEMS) microphone that can be used for keyword spotting application development. In addition, the onboard LIS3MDLTR three-axis accelerometer, also from STMicroelectronics, can be used for tinyML-based gesture detection.

Converting and running the TensorFlow Lite model using STM32Cube.AI

Armed with a development board that can be used to run the tinyML model, developers can now start to convert the TensorFlow Lite model into something that can run on the microcontroller. The TensorFlow Lite model can run directly on the microcontroller, but it needs a runtime environment to process it.

When the model is run, a series of functions need to be performed. These functions start with collecting the sensor data, then filtering it, extracting the necessary features, and feeding it to the model. The model will spit out a result which can then be further filtered, and then – usually – some action is taken. Figure 5 provides an overview of what this process looks like.

Figure 5. How data flows from sensors to the runtime and then to the output in a tinyML application. Image source: Beningo Embedded Group

The X-CUBE-AI plug-in to STM32CubeMX provides the runtime environment to interpret the TensorFlow Lite model and offers alternative runtimes and conversion tools that developers can leverage. The X-CUBE-AI plug-in is not enabled by default in a project. However, after creating a new project and initializing the board, under Software Packs -> Select Components, there is an option to enable the AI runtime. There are several options here; make sure that the Application template is used for this example, as shown in Figure 6.

Figure 6. The X-CUBE-AI plug-in needs to be enabled using the application template for this example. Image source: Beningo Embedded Group

Once X-CUBE-AI is enabled, an STMicroelectronics X-CUBE-AI category will appear in the toolchain. Clicking on the category will give the developer the ability to select the model file they created and set the model parameters, as shown in Figure 7. An analyze button will also analyze the model and provide developers with RAM, ROM, and execution cycle information.

Figure 7. The analyze button will provide developers with RAM, ROM, and execution cycle information. Image source: Beningo Embedded Group

It’s highly recommended that developers compare the Keras and TFLite model options. On the sine wave model example, which is small, there won’t be a huge difference, but it is noticeable.

The project can then be generated by clicking ‘Generate code’. The code generator will initialize the project and build in the runtime environment for the tinyML model. However, by default, nothing is feeding the model. Developers need to add code to provide the model input values – x values – which the model will then interpret and use to generate the sine y values. A few pieces of code need to be added to the acquire_and_process_data and post_process functions, as shown in Figure 8. At this point, the example is now ready to run. Note: add some printf statements to get the model output for quick verification. A fast compile and deployment results in the ‘Hello World’ tinyML model running. Pulling the model output for a full cycle results in the sine wave shown in Figure 9. It’s not perfect, but it is excellent for a first tinyML application. From here, developers could tie the output to a pulse width modulator (PWM) and generate the sine wave.

Figure 9. The ‘Hello World’ sine wave model output when running on the STM32. Image source: Beningo Embedded Group

Tips and tricks for ML on embedded systems

Developers looking to get started with ML on microcontroller-based systems will have quite a bit on their plate to get their first tinyML application up and running. However, there are several ‘tips and tricks’ to keep in mind that can simplify and speed up their development:

■ Walk through the TensorFlow Lite for Microcontrollers ‘Hello World’ example, including the Google Colab file. Take some time to adjust parameters and understand how they affect the trained model
■ Use quantized models for microcontroller applications. The quantized model is compressed to work with uint8_t rather than 32-bit floating-point numbers. As a result, the model will be smaller and execute faster
■ Explore the additional examples in the TensorFlow Lite for Microcontrollers repository. Other examples include gesture detection and keyword detection
■ Take the ‘Hello World’ example further by connecting the model output to a PWM and a low-pass filter to see the resultant sine wave. Experiment with the runtime to increase and decrease the sine wave frequency
■ Select a development board that includes ‘extra’ sensors that will allow for a wide range of ML applications to be tried
■ As much fun as collecting data can be, it’s generally easier to purchase or use an open-source database to train the model

Developers who follow these ‘tips and tricks’ will save quite a bit of time and grief when developing their application.

Conclusion

ML has come to the network Edge, and resource-constrained microcontroller-based systems are a prime target. The latest tools allow ML models to be converted and optimized to run on real-time systems. As shown, getting a model up and running on an STM32 development board is relatively easy, despite the complexities involved. While the discussion examined a simple model that generates a sine wave, far more complex models like gesture detection and keyword spotting are possible.
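The quantization that distinguishes model.tflite from model_no_quant.tflite can be illustrated with a minimal sketch. TensorFlow Lite maps float tensors to 8-bit integers using a scale and zero-point; the plain-Python helpers below show that idea, but the function names and the simple min/max calibration are our own illustrative choices, not TensorFlow Lite's API:

```python
def quantize(values, num_bits=8):
    """Affine (scale + zero-point) quantization of floats to uint8."""
    lo, hi = min(values), max(values)
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax if hi != lo else 1.0
    zero_point = round(-lo / scale)
    # Round to the nearest integer step and clamp into [0, qmax]
    q = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the integer codes."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, zp = quantize(weights)
restored = dequantize(q, scale, zp)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Each stored value shrinks from 4 bytes to 1, and the reconstruction error stays below one quantization step, which is why the quantized model is both smaller and accurate enough for a microcontroller.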
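The sensor-to-output flow of Figure 5 can be sketched as a chain of small functions. Every stage name below is illustrative (none of these are X-CUBE-AI APIs), and a simple amplitude threshold stands in for the neural network:

```python
# Minimal sketch of the tinyML data flow: sense -> filter -> extract
# features -> run model -> post-process. All stage names and the
# stand-in "model" are illustrative assumptions, not a real runtime.
import math

def sense(n):
    # Stand-in for reading n samples from a sensor: a tone plus noise
    return [math.sin(2 * math.pi * k / 32) + 0.05 * ((-1) ** k) for k in range(n)]

def low_pass(samples):
    # Simple 3-tap moving average to knock down high-frequency noise
    padded = [samples[0]] + samples + [samples[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(samples))]

def extract_features(samples):
    # Toy features: mean and peak amplitude of the window
    return [sum(samples) / len(samples), max(abs(s) for s in samples)]

def model(features):
    # Stand-in classifier: "active" if peak amplitude exceeds a threshold
    return 1.0 if features[1] > 0.5 else 0.0

def post_process(score):
    # Turn the raw score into an action/label for the application
    return "motion" if score >= 0.5 else "idle"

result = post_process(model(extract_features(low_pass(sense(64)))))
```

On a real STM32 target each stage maps to a step in the generated runtime loop, with the model call replacing the threshold stand-in.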
Figure 1. All the elements in this FPGA SoC, including the RISC-V subsystems, are implemented on the FPGA fabric. Image source: Microchip Technology
How to use FPGA SoCs for secure and connected hard real-time systems
communications interfaces, as well as global navigation satellite system (GNSS) location capability.
system elements. Designers need to include a memory management unit, memory protection unit, secure boot capability, and gigabit-class transceivers for high-speed connectivity. The design will need active and static power management and control of inrush currents. Some designs will require operation over the extended commercial temperature range of 0°C to +100°C junction temperature (TJ), while systems in industrial environments will need to operate with TJ from -40°C to +100°C. To address these and other challenges, designers can turn to FPGA system-on-chip (SoC) devices that combine low power consumption, thermal efficiency, and defense-grade security
for smart, connected, and deterministic systems.
This article reviews the architecture of such an FPGA SoC and how it supports the efficient design of connected and deterministic systems. It then briefly presents the EEMBC CoreMark-Pro processing power versus power consumption benchmark, along with a view of the benchmark performance of a representative FPGA SoC. It looks at how security is baked into these FPGA SoCs and details exemplary FPGA SoCs from Microchip Technology, along with a development platform to accelerate the design process. It closes with a brief listing of expansion boards from MikroElektronika that can be used to implement a range of
SoCs built with an FPGA fabric The ‘chip’ for this SoC is an FPGA fabric that contains the system elements, from the FPGA to the RISC-V MCU subsystem that’s built with hardened FPGA logic. The MCU subsystem includes a quad- core RISC-V MCU cluster, a RISC-V monitor core, a system controller, and a deterministic Level 2 (L2) memory subsystem. The FPGA in these SoCs includes up to 460 K logic elements, up to 12.7 gigabit per second (Gbps) transceivers, and other input/output (I/O) blocks, including general purpose I/O
Field programmable gate arrays (FPGAs), Linux-capable RISC-V microcontroller unit (MCU) subsystems, advanced memory architectures, and high- performance communications interfaces are important tools for designers. This is particularly true for designers of secure connected systems, safety-critical systems, and a wide range of hard real-time
deterministic systems like artificial intelligence (AI) and machine learning (ML). However, the integration of those diverse elements into a secure, connected, and deterministic system can be a challenging and time-consuming activity, as is laying out the high-speed interconnects for the various
Written by: Jeff Shepard
we get technical
14 14
15
How to use FPGA SoCs for secure and connected hard real-time systems
workloads include a linear algebra routine derived from LINPACK, a fast Fourier transform, a neural net algorithm for pattern evaluation, and an improved version of the Livermore loops benchmark. JPEG compression, an XML parser, ZIP compression, and a 256-bit secure hash algorithm (SHA-256) form the basis of the integer workloads. The MPFSO95T models of these SoC FPGAs, like the MPFS095TL- FCSG536E, can deliver up to 6,500 Coremarks at 1.3 watts (Figure 3).
and immunity from Meltdown and Spectre attacks. Security begins with secure supply chain management, including the use of hardware security modules (HSMs) during wafer testing and packaging. The use of a 768- byte digitally signed x.509 FPGA certificate embedded in every FPGA SoC adds to supply chain assurance. Numerous on-chip tamper detectors are included in these FPGA SoCs to ensure secure and reliable operation. If tampering is detected, a tamper flag is issued that enables the system to respond as needed. Some of the available tamper detectors include: ■ Voltage monitors ■ Temperature sensors ■ Clock glitch and clock frequency detectors
■ Configure L1 and L2 as deterministic memory ■ DDR4 memory subsystem ■ Disable/enable branch predictors ■ In-order pipeline operation More processing with less energy In addition to their system operation benefits, including support for hard, real-time processing, these FPGA SoCs are highly energy efficient. The EEMBC CoreMark-PRO benchmark is an industry standard for comparing the efficiency and performance of MCUs in embedded systems. It was designed specifically to benchmark hardware performance and to replace the Dhrystone benchmark. The CoreMark-PRO workloads include a diversity of performance characteristics, instruction-level parallelism, and memory utilization based on four floating-point workloads and five common integer workloads. The floating-point
Figure 2. The RISC-V subsystem includes several processor and memory elements. Image source: Microchip Technology
The RISC-V MCU subsystem uses a five-stage single-issue, in-order pipeline. It’s not vulnerable to Spectre or Meltdown exploits that can afflict out-of-order architectures. All five MCUs are coherent with the memory subsystem, supporting a mix of deterministic asymmetric multi- processing (AMP) mode real-time systems and Linux. Capabilities of the RISC-V subsystem include (Figure 2): ■ Run Linux and hard real-time operations
(GPIO) and Peripheral Component Interconnect Express (PCIe) 2. The overall architecture is designed for reliability. It includes single- error correction and double- error detection (SECDED) on all memories, differential power analysis (DPA), physical memory protection, and 128 kilobits (Kbits) of flash boot memory (Figure 1). Microchip offers its Mi-V (pronounced ‘my five’) ecosystem of third-party tools and design resources to support the implementation of RISC-V systems. It’s built to speed the adoption of the RISC-V instruction set architecture (ISA) for hardened RISC-V cores and for RISC-V soft cores. Elements of the Mi-V ecosystem include access to: ■ Intellectual property (IP) licenses ■ Hardware ■ Operating systems and middleware ■ Debuggers, compilers, and design services
the FPGA SoC include several debugging capabilities, such as a passive, run-time configurable advanced extensible interface (AXI) monitor and instruction trace. AXI monitoring enables designers to observe the data being written to or read from the various memories, and to know when it's being written or read.
Security considerations
The safety-critical and hard real-time applications for these FPGA SoCs require strong security in addition to high energy efficiency and powerful processing capabilities. The basic security functions of these FPGA SoCs include differential power analysis (DPA) resistant bitstream programming, a true random number generator (TRNG), and a physically unclonable function
■ JTAG active detector
■ Mesh active detector
Security is further ensured with 256-bit advanced encryption standard (AES-256) symmetric block cipher correlation power attack (CPA) countermeasures, integrated cryptographic digest capabilities to ensure data integrity, integrated PUF for key storage, and zeroization capabilities for the FPGA fabric and all on-chip memories.
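Conceptually, the digest-based integrity check works like the following host-side sketch. This is an illustration only, using Python's standard hashlib with SHA-256; the device's on-chip digest engine, key handling, and data formats are specific to the PolarFire family, and the image contents here are placeholders.

```python
# Host-side sketch of a digest check on a configuration image.
# The device performs the equivalent check on-chip; names and data
# here are illustrative, not the actual programming interfaces.
import hashlib
import hmac

def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def integrity_ok(data: bytes, expected_hex: str) -> bool:
    # hmac.compare_digest gives a constant-time comparison
    return hmac.compare_digest(sha256_digest(data), expected_hex)

image = b"example bitstream contents"
good = sha256_digest(image)               # digest recorded at build time
print(integrity_ok(image, good))          # True: image unmodified
print(integrity_ok(image + b"x", good))   # False: image altered
```

Any single-bit change to the image produces a completely different digest, which is what makes the check effective against both accidental corruption and tampering.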
(PUF). They also include standard and user-defined secure boot, physical memory protection that provides memory access restrictions related to the machine’s privilege state, including machine, supervisor, or user modes,
Figure 4. The automotive temperature MPFS250T-1FCSG536T2 comes in a 16 x 16 mm package with a ball count of 536 and a 0.5 mm pitch. Image source: Microchip Technology
Figure 3. The MPFS095T FPGA SoC (orange line) delivers 6,500 CoreMarks at 1.3 watts. Image source: Microchip Technology
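Energy-efficiency results like those in Figure 3 are usually compared as benchmark score per watt. The back-of-the-envelope normalization can be sketched in Python using only the figures quoted above (CoreMark-PRO itself is an EEMBC C benchmark suite run on the target hardware, not this script):

```python
# Normalize a benchmark score by power draw, using the numbers quoted
# for the MPFS095T in Figure 3: 6,500 CoreMarks at 1.3 W.
def coremarks_per_watt(score: float, power_w: float) -> float:
    """Return benchmark score per watt of power consumed."""
    return score / power_w

print(f"MPFS095T: {coremarks_per_watt(6500, 1.3):.0f} CoreMarks/W")  # ≈ 5000
```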
The hardened RISC-V MCUs in
with 3GPP IoT devices.
Designers can choose from standard speed grade devices, or -1 speed grade devices that are 15% faster. These FPGA SoCs can be operated at 1.0 volt for the lowest power operation, or at 1.05 volts for higher performance. They are available in a range of package sizes, including 11 x 11 millimeters (mm), 16 x 16 mm, and 19 x 19 mm. For applications that need extended commercial temperature operation, standard speed operation, and 254 K logic elements in a 19 x 19 mm package, designers can use the MPFS250T-FCVG484EES. For simpler solutions that need 23 K logic elements, designers can turn to the MPFS025T-FCVG484E, also with extended commercial temperature operation and a standard speed grade in a 19 x 19 mm package. The MPFS250T-1FCSG536T2 with 254 K logic elements is designed for high-performance automotive systems; it has an operating temperature range of -40°C to +125°C and a -1 speed grade for a 15% faster clock, in a compact 16 x 16 mm package with 536 balls on a 0.5 mm pitch (Figure 4).
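As a rough illustration, the selection criteria above can be captured in a small lookup built only from the three part numbers and attributes named in the text; a real part search involves many more parameters (I/O counts, transceivers, pricing, availability).

```python
# Illustrative PolarFire SoC part selector. Attributes are transcribed
# from the article text; this is a sketch, not a complete parametric search.
PARTS = {
    "MPFS250T-FCVG484EES": {"logic_k": 254, "temp": "extended commercial",
                            "speed": "std", "package_mm": (19, 19)},
    "MPFS025T-FCVG484E":   {"logic_k": 23,  "temp": "extended commercial",
                            "speed": "std", "package_mm": (19, 19)},
    "MPFS250T-1FCSG536T2": {"logic_k": 254, "temp": "automotive",
                            "speed": "-1",  "package_mm": (16, 16)},
}

def pick(min_logic_k, temp):
    """Return part numbers meeting a minimum logic-element count (in K)
    and a required temperature grade."""
    return [part for part, attrs in PARTS.items()
            if attrs["logic_k"] >= min_logic_k and attrs["temp"] == temp]

print(pick(100, "automotive"))  # → ['MPFS250T-1FCSG536T2']
```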
FPGA SoC examples

Microchip Technology combines these capabilities and technologies into its PolarFire FPGA SoCs with multiple speed grades, temperature ratings, and various package sizes to support designers' needs for a wide range of solutions with between 25 K and 460 K logic elements. Four temperature grades are available (all rated for TJ):

■ 0°C to +100°C extended commercial range
■ -40°C to +100°C industrial range
■ -40°C to +125°C automotive range
■ -55°C to +125°C military range
■ MIKROE-2670, enables GNSS functionality with concurrent reception of GPS and Galileo constellations plus either BeiDou or GLONASS, resulting in high position accuracy in situations with weak signals or interference in urban canyons
Conclusion
Designers can turn to FPGA SoCs when developing connected, safety-critical, hard real-time deterministic systems. FPGA SoCs provide a wide range of system elements, including an FPGA fabric, a RISC-V MCU subsystem with high-performance memories, high-speed communications interfaces, and numerous security functions. To help designers get started, development boards and environments are available that include all the necessary elements, including expansion boards that can be used to implement a wide range of communications and location functions.

Recommended reading

1. How to Implement Time Sensitive Networking to Ensure Deterministic Communication
2. Real-Time Operating Systems (RTOS) and Their Applications
environment, including a multi-rail power sensor system to monitor the various power domains, PCIe root port, and on-board memories – including LPDDR4, QSPI, and eMMC Flash – to run Linux and Raspberry
FPGA SoC dev platform

To speed the design of systems with the PolarFire FPGA SoC, Microchip offers the MPFS-ICICLE-KIT-ES PolarFire SoC Icicle kit, which enables exploration of the five-core, Linux-capable RISC-V microprocessor subsystem with low-power, real-time execution. The kit includes a free Libero Silver license that's needed to evaluate designs, and it supports programming and debugging features in a single environment. These FPGA SoCs are supported by the VectorBlox accelerator software development kit (SDK), which enables low-power, small-form-factor AI/ML applications. The emphasis is on simplifying the design process to the point that designers don't need prior FPGA design experience. The VectorBlox accelerator SDK enables developers to program power-efficient neural networks using C/C++. The Icicle kit has numerous features to provide a comprehensive development
Pi, and mikroBUS expansion ports for a host of wired and
Figure 5. This comprehensive FPGA SoC development environment includes connectors for Raspberry Pi (top right) and mikroBUS (lower right side) expansion boards. Image source: Microchip Technology
wireless connectivity options, plus functional extensions like GNSS location capability (Figure 5).
Expansion boards
A few examples of mikroBUS expansion boards include:

■ MIKROE-986, for adding CAN bus connectivity using a serial peripheral interface (SPI)
■ MIKROE-1582, for interfacing between the MCU and an RS-232 bus
■ MIKROE-989, for connecting with an RS-422/485 communication bus
■ MIKROE-3144, which supports the LTE Cat M1 and NB1 technologies, enabling reliable and simple connectivity
retroelectro
Retro Electro: Programming a calculator to form concepts: the birth of artificial intelligence
Figure 1. Some attendees of the Summer Research Project. Back row, from left to right: Oliver Selfridge, Nathaniel Rochester, Marvin Minsky, and John McCarthy. In front: Ray Solomonoff, Peter Milner, and Claude Shannon.
Written by: David Ray, Cyber City Circuits
The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can, in principle, be so precisely described that a machine can be made to simulate it.
study on artificial intelligence. The aim was to gather many of the nation's top scientists, engineers, and mathematicians in the same room to focus on what artificial intelligence could mean and how they could get there. They requested $13,500 to complete the study, but the Rockefeller Foundation provided only $7,500 for a five-week study instead of two months. The four organizers were all highly distinguished researchers and inventors, developing the fundamentals of today's generative AI nearly seventy years ago. The proposal outlines seven distinct parts of the problem.
an automatic calculator can be programmed to simulate the machine.” The idea was simple: if a machine could do a job, a computer could be
A proposal for The Dartmouth Summer Research Project on artificial intelligence
Since the earliest days of 'computers', it has taken thousands of people to bring 'artificial intelligence' and machine learning to where it is today. In the summer of 1955, Dr. John McCarthy started a new position as an assistant professor of mathematics at Dartmouth College. Historians say McCarthy was the first to use the term 'artificial intelligence', in this proposal to the Rockefeller Foundation. The project was proposed and organized by John McCarthy of Dartmouth College, Marvin Minsky of Harvard University, Nathaniel Rochester of IBM, and Claude E. Shannon of Bell Labs. The proposal was for a two-month, ten-man
Automatic computers
Figure 2. Personal invitation to Dartmouth from McCarthy to Ray Solomonoff.
“If a machine can do a job, then
adapt to increasingly complex environments, Shannon hopes to build models replicating this adaptability in 'automata', ultimately advancing our understanding of mechanized intelligence.

Proposal for research by M. L. Minsky

As a graduate student, Marvin Minsky developed the first 'neural network' (the 'Stochastic Neural Analog Reinforcement Calculator', or 'SNARC') in the early 1950s. A Navy veteran, he had degrees from Harvard and Princeton. He founded MIT's Artificial Intelligence Lab and generally stayed there from its inception in 1963 until his death in 2016. Minsky's proposal focused on designing a machine capable of learning through sensory and 'motor abstractions'. Minsky
grammar, and syntax, any thinking machine would likely need to operate in a similar way, governed by similar rules of grammar and syntax.

Neuron nets

"How can a set of (hypothetical) neurons be arranged so as to form concepts."

As scientists began to grapple with the challenge of mimicking
which may best be described as self-improvement." The vision of creating a truly intelligent machine led to a fascinating concept: self-improvement. Researchers speculated that for a machine to be intelligent, it would need the ability to enhance its own capabilities over time.

Abstraction

"A number of types of 'abstraction' can be distinctly defined and several others less distinctly. A direct attempt to classify these and to describe machine methods of forming abstractions from sensory and other data would seem worthwhile."

Abstraction, the ability to distill complex information into simpler concepts, was identified as a key process in human thought. To replicate this in machines, scientists needed to classify and define different types of abstraction. This task was seen as essential for enabling machines to interpret sensory data and other information in a human-like manner.

Randomness and creativity

"A fairly attractive and yet clearly incomplete conjecture is that the difference between creative thinking and unimaginative competent thinking lies in the injection of some randomness."
“We will concentrate on a problem of devising a way of programming a calculator to form concepts and to form generalizations. This, of course, is subject to change when the group gets together.”
Figure 2. Claude Shannon with his self-solving ‘mouse-in-a-maze’ machine, Theseus.
nature of creativity, they considered the role of randomness in the creative process. The intriguing idea emerged that the difference between routine and creative thinking might lie in the controlled injection of randomness. This theory suggested that when guided by intuition, randomness could be the secret ingredient that makes creative thinking possible.

Proposal for research by C. E. Shannon

Claude Shannon's master's thesis, A Symbolic Analysis of Relay and Switching Circuits, is credited with introducing Boolean logic to electronic circuits and creating the digital age. After completing his doctorate at MIT, Shannon worked at Bell Labs, where he collaborated with and mentored McCarthy and Minsky in 1951 and 1952. Together, they developed 'Theseus', a self-solving 'mouse in a maze' using relay logic. Shannon's research proposal for the Summer Research Project delved into two key areas related to information theory and brain models:
Application of information theory to computing machines and brain models

Shannon's first research focus addresses the challenge of reliably transmitting information across noisy channels using unreliable components. He explores how information flows in parallel data streams over closed-loop networks and examines the complications that may arise, such as propagation delays and redundancy. Shannon proposes investigating new approaches to minimize these delays, ensuring reliable transmission of information across complex systems.

The matched environment and brain model approach to automata

In the second topic, Shannon theorizes that both animal and human brain development occurs in stages, beginning with simpler environments and eventually moving toward more complex ones: as someone gets older, their brain can comprehend more of the universe around them. He wanted to explore the specific stages of brain development and express them mathematically. By understanding how brains
human thought, they turned to the brain's fundamental building blocks: neurons. The question was how to arrange a set of hypothetical neurons to form concepts. Pioneers in the field had made strides in both theoretical and experimental work, but the problem remained far from solved.

Theory of the size of a calculation

"If we are given a well-defined problem, one way of solving it is to try all possible answers in order."

In their quest to solve complex problems, early computer scientists realized that brute-force methods were too time-consuming. To address this, they sought to understand and measure how efficient a calculation could be.

Self-improvement

"Probably a truly intelligent machine will carry out activities
programmed to replicate that task. Here, they admit that the speed and memory sizes of the machines they had at the time were 'insufficient' to simulate higher brain function. An issue they felt they could tackle was that there was no programming language available to do such a thing in the first place.

How can a computer be programmed to use a language

"It may be speculated that a large part of human thought consists of manipulating words according to rules of reasoning and rules of conjecture... This idea has never been very precisely formulated nor have examples been worked out."

Up to this point, the closest thing available for programming was assembly language. Here, the thought was that since much of thinking is really made up of words,
As researchers delved into the
Figure 3. Marvin Minsky at the piano.
describes a machine that can be trained via a ‘trial and error’ process to perform specific tasks within an environment and exhibit ‘goal-seeking’ behavior. This hypothetical machine could process inputs, generate outputs, and adapt to success or failure by reading sensors and such, similar to Shannon’s Theseus project, for which Minsky designed the SNARC. Minsky emphasizes the importance of pairing sensory and motor controls for the machine to affect and learn from its environment effectively. Progress in the machine’s learning would depend on its ability to relate environmental changes to corresponding changes in its sensor readings. Minsky further explains that the machine should develop an internal abstract model of its environment, stored in memory. This internal ‘abstract’ model would allow it to first experiment internally before conducting external tests, enabling it to perform tasks more intelligently. The machine’s behavior would appear imaginative because it could
predict and anticipate changes in the environment based on its motor actions.

Proposal for research by N. Rochester

Nathaniel Rochester worked at IBM at the time. He graduated from MIT in 1941 and then worked on developing radar systems for the US Navy during the war. He started at IBM in 1948 after the wartime development dried up. A few years later, IBM released the first in the 700 series of electronic computers, the IBM 701, for which Rochester was the lead developer. At the time of the proposal, Rochester was the head of a research group studying information theory and automatic pattern recognition. McCarthy and Rochester first met when IBM gifted an IBM 704 to MIT's research lab, specifically for researching 'neural networks'. Rochester's research proposal centers on the challenge of creating a machine capable of exhibiting originality in its problem-solving
and predict outcomes in the environment. He proposes that machines could similarly be designed to form abstractions of sensory data, define problems, and then simulate possible solutions, evaluating their success before acting. While this approach works for well-understood problems, Rochester notes that solving new or long-unsolved problems requires randomness and creativity. He argues that randomness could be key to overcoming the limitations of pre-programmed rules and enabling machines to behave in original ways, much like how scientists may rely on a ‘hunch’ to approach difficult problems. Rochester discusses the Monte Carlo method, which involves conducting hundreds or thousands of random experiments to approximate solutions to complex problems. He sees potential in applying this method to machine learning, suggesting that machines could explore many possibilities simultaneously and uncover solutions that traditional methods might miss.
simulating human-like randomness in machines is challenging, as the brain's control mechanisms differ significantly from those of calculators and computers.

Proposal for research by John McCarthy
Figure 5. John McCarthy while working with chess computers.
John McCarthy, an army veteran, is famously known for coining the term 'artificial intelligence'. Following his doctorate at Princeton, he took a few assistant professor positions in the area, landing at Dartmouth College in the summer of 1955. As a graduate student, he interned with Marvin Minsky at Bell Labs, where he was mentored by Claude Shannon. Following the Summer Research Project, he took a position at MIT with Marvin Minsky, continuing work in AI and developing the LISP programming language. McCarthy's proposal focuses on studying the relationship between language and intelligence. It
argues that direct applications of trial-and-error methods to the interaction between sensory data and motor activity are unlikely to result in complex behaviors. Instead, he advocates for applying trial and error at a higher level of abstraction. He highlights language as a crucial tool people use to handle intricate phenomena, noting that human minds use language to formulate conjectures and test them. McCarthy points out that English has several advantageous properties for facilitating complex thought processes, properties that programming languages developed for computers often lack. These properties include the ability to use concise arguments that can be supplemented by informal mathematics, a way of incorporating other languages within English, and the ability for users to reference their own problem-solving progress. He also
Figure 4. Nathaniel Rochester designed the first electronic IBM computer.
programmed with a fixed set of rules to address specific
contingencies and failures, leaving them without the flexibility to act intuitively or with common sense. For example, if you divide by '0' on your calculator, you will likely get an error of some sort, but only because the calculator was programmed to give an error when asked to divide by '0', not because it learned on its own that dividing by '0' doesn't work and developed its own rules. Rochester highlights the frustration that arises when machines fail due to rigid or contradictory rules and suggests that a more sophisticated approach is needed to enable machines to behave intelligently. Rochester draws on Kenneth Craik's model of human thought, which theorizes that the brain constructs 'engines' that simulate
abilities. Typically, machines like automatic calculators are
“Unless the machine is provided with, or is able to develop, a way of abstracting sensory material, it can progress through a complicated environment only through painfully slow steps, and in general will not reach a high level of behavior.” - Minsky
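The 1955 question of arranging hypothetical neurons "to form concepts" can be illustrated with a single artificial neuron trained by trial and error, a toy modern descendant of the ideas behind SNARC. This is the textbook perceptron rule applied to learning logical AND, not the actual 1950s hardware:

```python
# A single artificial neuron learning the concept 'AND' by trial and
# error (the classic perceptron rule). Illustrative only; SNARC itself
# was an analog machine built from vacuum tubes and motors.
def train_perceptron(samples, epochs=20, lr=0.1):
    """Adjust two weights and a bias whenever the neuron's guess is wrong."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out          # reward/punish signal
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in AND])  # → [0, 0, 0, 1]
```

The neuron starts knowing nothing and, exactly as Minsky envisioned, relates its errors to changes in its inputs until its behavior matches the concept.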
However, he acknowledges that
“So the mathematician has the machine making a few thousand random experiments … the results of these experiments provide a rough guess as to what the answer may be.” – Rochester