
Breaking down the verification and validation of AI in safety-critical applications

Author : Lucas Garcia, MathWorks

20 March 2024

Engineers designing AI-enabled systems are being confronted with an evolving set of global regulatory standards.

Hot on the heels of the White House issuing an executive order on AI regulation, which requires AI companies to report on and test specific models to ensure that AI systems meet specified requirements, the UK established the AI Safety Institute in November 2023, the first state-backed organisation focused on advanced AI safety for the public interest.

Both measures highlight the importance of robust verification and validation (V&V) processes for AI-enabled systems. AI regulations and V&V processes will significantly affect safety-critical systems, as AI is increasingly used in system design, including in safety-critical industries such as automotive and aerospace.

What is verification and validation in AI-enabled systems? 

In short, verification determines whether an AI model is designed and developed in line with the specified requirements, whereas validation checks whether the product meets the client's needs and expectations. Employing V&V techniques has multiple benefits. First, engineers can ensure that an AI model's outputs meet the outlined specifications, allowing for early bug detection and mitigation of data bias.

A second benefit is that performing V&V in safety-critical scenarios ensures that an AI-enabled safety-critical system can maintain its performance level under a variety of circumstances. When AI is used in safety-critical systems, the models can approximate physical systems and help validate the design: engineers simulate entire AI-enabled systems and use the data to test them in different scenarios, including outlier events.

Applying V&V also makes it easier for AI-enhanced products to comply with standards before going to market. These certification processes ensure that specific elements are built into these products. Engineers perform V&V to test the functionality of these elements, which makes it easier to obtain certifications. 

Example use cases

V&V techniques are already being applied in multiple industries. In the automotive industry, the ISO/CD PAS 8800 is a standard being developed to address safety-related properties and risk factors for road vehicles. In aerospace and defence, where certification is mandatory, existing standards such as DO-178C (Software Considerations in Airborne Systems and Equipment Certification) cannot always directly address the unique challenges posed by AI. For this reason, the new ARP6983 process standard is being created to provide guidelines for developing and certifying aeronautical safety-related products implementing AI.

In both aviation and automotive, tools such as the Deep Learning Toolbox Verification Library and MATLAB Test can help engineers stay at the forefront of V&V in these industries by supporting software development that adheres to industry standards, streamlining the verification and testing of AI models within larger systems.

V&V AI processes in safety-critical systems: step-by-step

When performing V&V, the engineer needs to meet two key objectives: to ensure that the AI component meets the specified requirements, and to ensure that it is reliable under all operating conditions and is therefore safe and ready for deployment.

The V&V process for AI involves performing software assurance activities that include a combination of static and dynamic analyses, testing, formal methods, and real-world operational monitoring. 

V&V processes may vary slightly across industries, but the principal steps are: 

• Analysing the decision-making process to solve the ‘Black Box’ problem

• Testing the model against representative datasets

• Conducting AI system simulations

• Ensuring the model operates within acceptable bounds

The steps in the V&V process described below are iterative, allowing for continuous refinement and improvement of the AI system as engineers collect new data, gain new insights, and integrate operational feedback.

Analysing the decision-making process for transparency 

Engineers using AI models to add automation to a system often encounter the ‘Black Box’ problem. For engineers and scientists to trust model-based predictions and comprehend decision-making, they must understand how AI-based systems make decisions. This understanding is typically achieved via the following two processes:

Feature Importance Analysis

Feature Importance Analysis is a technique that helps engineers identify which input variables impact a model’s predictions most significantly. Although the analysis works differently for different models, such as tree-based and linear models, the general procedure assigns a feature importance score to each input variable, with a higher score signifying a greater impact on the model’s decision. In a safety-critical system in the automotive industry, variables may include environmental factors, such as precipitation or the presence and behaviour of other vehicles.
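A minimal sketch of this idea is permutation feature importance: shuffle one input variable at a time and measure how much the model's accuracy drops. The scenario below is illustrative; the toy "model" and the synthetic driving features (rain intensity, obstacle distance, speed) are assumptions, not a real trained classifier.

```python
import random

random.seed(0)

# Synthetic driving scenarios: [rain_intensity 0-1, obstacle_distance m, speed km/h]
data = [[random.random(), random.uniform(0, 100), random.uniform(20, 120)]
        for _ in range(500)]
labels = [1 if d[1] < 20 or (d[0] > 0.7 and d[2] > 80) else 0 for d in data]

def model(sample):
    # Hypothetical stand-in for a trained "brake / don't brake" classifier
    rain, dist, speed = sample
    return 1 if dist < 20 or (rain > 0.7 and speed > 80) else 0

def accuracy(rows, ys):
    return sum(model(r) == y for r, y in zip(rows, ys)) / len(ys)

baseline = accuracy(data, labels)

# Permutation importance: shuffle one feature column and measure the drop
names = ["rain_intensity", "obstacle_distance", "speed"]
scores = {}
for j, name in enumerate(names):
    col = [row[j] for row in data]
    random.shuffle(col)
    permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(data, col)]
    scores[name] = baseline - accuracy(permuted, labels)

print(scores)  # larger accuracy drop => more important feature
```

Here shuffling the obstacle distance destroys far more accuracy than shuffling speed, which matches the intuition that it drives the braking decision.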

Explainability 

Explainability techniques offer insights into the model’s behaviour. This is particularly relevant when the black-box nature of the model prevents the use of other approaches. In the context of images, these techniques identify the regions of an image that have the biggest contribution to the final prediction. This enables engineers to understand the model’s primary focus when making a prediction. 
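One simple image explainability technique is occlusion sensitivity: mask one patch of the image at a time and record how much the model's score drops. The sketch below uses a toy 6x6 "image" and a made-up detector whose score depends on the top-left corner; both are assumptions for illustration only.

```python
# Occlusion sensitivity on a toy 6x6 "image": zero out one 2x2 patch at a
# time and record the score drop. A large drop marks an important region.

SIZE = 6
image = [[0.0] * SIZE for _ in range(SIZE)]
for r in range(2):
    for c in range(2):
        image[r][c] = 1.0   # the "object" lives in the top-left 2x2 patch

def model(img):
    # Hypothetical detector: score = mean brightness of the top-left patch
    return sum(img[r][c] for r in range(2) for c in range(2)) / 4.0

base = model(image)
heatmap = [[0.0] * (SIZE - 1) for _ in range(SIZE - 1)]
for r in range(SIZE - 1):
    for c in range(SIZE - 1):
        occluded = [row[:] for row in image]
        for dr in (0, 1):
            for dc in (0, 1):
                occluded[r + dr][c + dc] = 0.0
        heatmap[r][c] = base - model(occluded)

print(heatmap[0][0], heatmap[4][4])  # hot where the object is, cold elsewhere
```

The resulting heatmap is exactly the "regions with the biggest contribution to the final prediction" described above, computed without any access to the model's internals.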

Using representative datasets for model testing 

Engineers often evaluate an AI model’s performance in the real-world scenarios where the safety-critical system is expected to operate, with the objective of identifying limitations and improving the accuracy and reliability of the model. Test cases are designed to evaluate various aspects of the model, such as its accuracy and its ability to generalise. Finally, the model is applied to the datasets, the results are recorded and compared with the expected outputs, and the model design is improved according to the outcome of the testing.
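The compare-against-expected-output loop can be sketched as a small test harness. The braking model, its inputs, and the scenario table below are all hypothetical stand-ins chosen for illustration.

```python
# Minimal test harness: run the model over representative scenarios and
# compare each prediction with its expected output.

def model(speed_kmh, distance_m):
    # Hypothetical stand-in for a trained braking classifier
    return "brake" if distance_m / max(speed_kmh, 1) < 1.0 else "cruise"

test_cases = [
    # (speed km/h, obstacle distance m, expected decision)
    (50, 20, "brake"),
    (50, 200, "cruise"),
    (120, 100, "brake"),
    (30, 90, "cruise"),
]

results = [(s, d, expected, model(s, d)) for s, d, expected in test_cases]
failures = [r for r in results if r[2] != r[3]]
accuracy = 1 - len(failures) / len(test_cases)

print(f"accuracy={accuracy:.2f}, failures={failures}")
```

Each failing row points at a concrete scenario to investigate, which is what drives the design improvements described above.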

Simulating AI Systems

Simulating an AI-enabled system enables engineers to evaluate and assess the system’s performance in a controlled environment. During a simulation, a virtual environment is created that mimics a real-world system under a variety of conditions. Engineers first define the inputs and parameters to simulate a system, and the simulation is then executed using software such as Simulink, which produces the system’s responses to the proposed scenario. As with data testing, the simulation results are compared to expected or known outcomes, and the model is improved iteratively.
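In practice such simulations are built in tools like Simulink; the closed-loop structure can still be sketched in a few lines of Python. The vehicle dynamics, timestep, and "AI controller" below are made-up assumptions used only to show the plant/controller/compare-to-expected loop.

```python
# Closed-loop simulation sketch: a toy vehicle "plant" stepped in discrete
# time by a stand-in AI controller that must stop before an obstacle.

DT = 0.1             # timestep, s
OBSTACLE_AT = 100.0  # m

def controller(distance_to_obstacle, speed):
    # Hypothetical AI policy: brake hard once the stopping distance
    # (assuming 6 m/s^2 deceleration) approaches the remaining gap
    stopping_distance = speed ** 2 / (2 * 6.0)
    return -6.0 if stopping_distance >= distance_to_obstacle - 5.0 else 0.0

position, speed = 0.0, 25.0   # m, m/s
trace = []
for _ in range(400):
    accel = controller(OBSTACLE_AT - position, speed)
    speed = max(0.0, speed + accel * DT)
    position += speed * DT
    trace.append((position, speed))
    if speed == 0.0:
        break

final_position, final_speed = trace[-1]
print(f"stopped at {final_position:.1f} m (obstacle at {OBSTACLE_AT} m)")
```

Comparing the simulated outcome (the vehicle stops short of the obstacle) with the expected behaviour is the same compare-and-refine loop used in data testing, but exercised over dynamic scenarios.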

Fortifying model boundaries 

For AI models to operate safely and reliably, it is vital to establish limits and monitor the model’s behaviour to ensure that it stays within those boundaries. One of the most common boundary issues arises when a model trained on a limited dataset encounters out-of-distribution data at runtime. Problems also arise when the model is not robust enough, which can lead to unpredictable behaviour.
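A runtime out-of-distribution check can be as simple as flagging inputs that fall far outside the statistics of the training data. The sketch below uses a per-feature z-score on a made-up speed feature; real detectors are richer (for example, distance-based or confidence-based methods), so treat this as illustrative only.

```python
import math
import random

random.seed(1)

# Pretend training data: vehicle speeds (km/h) the model actually saw
train_speeds = [random.gauss(60, 10) for _ in range(1000)]
mean = sum(train_speeds) / len(train_speeds)
std = math.sqrt(sum((x - mean) ** 2 for x in train_speeds) / len(train_speeds))

def is_out_of_distribution(speed, threshold=4.0):
    # Flag any input more than `threshold` standard deviations from the
    # training mean; the model's output should not be trusted for it
    return abs(speed - mean) / std > threshold

print(is_out_of_distribution(65))    # typical input -> False
print(is_out_of_distribution(250))   # far outside training data -> True
```

Flagged inputs can then trigger a fallback (for example, handing control to a conventional safety monitor) instead of trusting the model's prediction.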

To mitigate these issues, engineers employ bias mitigation and robustification methods to ensure that AI models operate within acceptable bounds:

Data augmentation and balancing

One way to mitigate data bias is to create variability in the data used to train the AI model, which reduces a model’s dependence on repeating patterns. The data augmentation technique helps ensure fairness and equal treatment of different classes and demographics. For instance, in a self-driving car, data augmentation may involve using pictures of pedestrians from different angles to help the model detect their presence regardless of their positioning. 

The data balancing technique is often paired with data augmentation and involves including a similar number of samples from each data class. Using the pedestrian example, balancing the data means ensuring that the dataset contains a proportionate number of images for each variation of pedestrian scenarios, such as different body shapes, clothing styles, lighting conditions, and backgrounds. This technique minimises bias and improves the model's ability to translate across diverse real-world situations.
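The two techniques can be sketched together: augment by mirroring each sample, then oversample the minority class until the classes match. The tiny 3x3 "images" and labels below are invented placeholders for real pedestrian imagery.

```python
import random

random.seed(2)

# Toy imbalanced dataset: 3x3 "images" as nested lists, 20:80 label split
dataset = ([([[1, 0, 0], [1, 0, 0], [1, 0, 0]], "pedestrian")] * 20 +
           [([[0, 0, 0], [0, 1, 0], [0, 0, 0]], "empty_road")] * 80)

def hflip(img):
    # Augmentation: mirror each row, as if the pedestrian appeared from
    # the other side of the frame
    return [row[::-1] for row in img]

augmented = dataset + [(hflip(img), label) for img, label in dataset]

# Balancing: oversample each minority class until all classes match
by_label = {}
for img, label in augmented:
    by_label.setdefault(label, []).append((img, label))
target = max(len(v) for v in by_label.values())
balanced = []
for label, rows in by_label.items():
    balanced += rows + random.choices(rows, k=target - len(rows))

counts = {lab: sum(1 for _, l in balanced if l == lab) for lab in by_label}
print(counts)  # both classes now equally represented
```

With real image data the same pattern applies; the flip would typically be one of many transformations (rotation, brightness shifts, crops) drawn from an augmentation pipeline.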

Robustness

Robustness is a primary concern when deploying neural networks in safety-critical situations. Neural networks are susceptible to misclassification caused by small, difficult-to-detect input perturbations. These disturbances can cause a neural network to output incorrect or dangerous results, which is alarming in systems where errors can lead to catastrophe.

One solution is integrating formal methods into the development and validation process. Formal methods involve using rigorous mathematical models to establish and prove the correctness properties of neural networks. By applying these methods, engineers can improve the network’s resilience to certain types of disturbances, ensuring higher robustness and reliability in safety-critical applications.
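One such formal technique is interval bound propagation: push an interval of possible inputs through the network's layers to prove bounds that hold for every input in that interval. The single ReLU neuron and its weights below are made-up; real verification tools handle full networks, but the arithmetic per layer is the same idea.

```python
# Interval bound propagation sketch: for y = w . x + b, each weight picks
# the worst-case end of its input interval, so the output interval is
# guaranteed to contain y for every x in the input box.

def interval_affine(lo, hi, weights, bias):
    out_lo = bias + sum(w * (lo[i] if w >= 0 else hi[i])
                        for i, w in enumerate(weights))
    out_hi = bias + sum(w * (hi[i] if w >= 0 else lo[i])
                        for i, w in enumerate(weights))
    return out_lo, out_hi

def relu_interval(lo, hi):
    # ReLU is monotone, so it maps interval ends to interval ends
    return max(0.0, lo), max(0.0, hi)

# One neuron, two inputs: y = relu(0.5*x1 - 1.0*x2 + 0.2),
# verified over the whole input box x1, x2 in [0, 1]
x_lo, x_hi = [0.0, 0.0], [1.0, 1.0]
lo, hi = interval_affine(x_lo, x_hi, [0.5, -1.0], 0.2)
lo, hi = relu_interval(lo, hi)

print(f"output guaranteed within [{lo}, {hi}]")
```

Unlike testing, which samples individual inputs, the bound covers the entire perturbation region at once, which is what makes the result a proof rather than an observation.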

Conclusion

The growing integration of AI into safety-critical systems means that V&V processes will become crucial to complying with regulations and legal requirements and to obtaining industry certification. Developing and maintaining trustworthy systems requires engineers to utilise verification techniques that provide explainability and transparency for the AI models that run those systems.

As for engineers’ adoption of AI into their V&V processes, it is essential to explore a variety of testing approaches that address the increasingly complex challenges of AI technologies. Ultimately, these efforts ensure AI is used responsibly and transparently.



