Department of Computer Science

University of Wales Aberystwyth

SoftFMEA Project – Final Report

Background

Previous research at Aberystwyth had resulted in the development and commercialisation of an innovative design analysis tool (known as AutoSteve) that uses qualitative simulation to automatically generate a textual Failure Mode Effects Analysis (FMEA) report for electrical systems in the automotive industry.

 

The increasingly sophisticated requirements for automotive electrical systems are being met through extensive use of software.  A good example is the use of data buses (such as the Controller Area Network or CAN) to transmit control messages and sensor data between different parts of the system.  As the qualitative simulator underlying AutoSteve was specific to electrical systems, this created a requirement for a suitable simulation of the software within an electrical system, and for combining that simulation with the electrical simulation.

 

Project objectives

The aim of the project was to devise an overall failure analysis method for mixed hardware / software systems. A secondary aim was to produce improved software that was able to mitigate the effects of component failures. These were to be achieved by the following tasks:-

  1. Identification of case studies.
  2. Investigation of tools for simulation of software systems.
  3. Integration of different kinds of simulation with interaction between the circuit simulator and software simulation.
  4. Abstraction of principles for integration of different kinds of simulation.
  5. Identification of redundancy and possible fault mitigating strategies.
  6. A preliminary investigation of the feasibility of the generation of software for fault mitigation in automotive systems.

 

Key advances and supporting methodologies

The project had broadly two parts.  The major work was aimed at solving problems associated with performing automated FMEA for new generation automotive systems that benefit from the introduction of microprocessors, software and bus networks.  The following sections document the technical advances the project has made that allow this to happen.  The more speculative part of the project was to investigate the potential for automated generation of fault mitigating strategies. This proved difficult because of the nature and variety of the systems involved and the lack of mature mitigating solutions in the sector.  This element of the project was nevertheless valuable in an unexpected way: it drove extensions to the functional modelling language that allow FMEA to be performed on systems that contain fault mitigating or redundant behaviour, a level of sophistication that was not previously possible.

 

Failure analysis of electrical systems that incorporate networked components

 

Various approaches to the modelling of software components were examined, each describing the software’s behaviour as a state chart. State charts are widely used for modelling reactive systems and so are well suited to software based components: they can capture the necessary behaviour and are already a familiar tool in the industry. Several languages that use state charts were examined, and they are similar enough that any one of them could be used with the electrical simulator with little difficulty. However, as AutoSteve already has a facility for using state charts to model component behaviour, this was used as the basis for specifying the changes needed to model the passing of messages between components over a network.

The most important limitation of the AutoSteve state chart facility, from the point of view of simulating systems that make use of a computer network, is its assumption that the state chart models of component behaviour are self-contained and that interaction between components is simulated in terms of changes in the flow of electric current around the circuit. This is too detailed for modelling software based interaction such as network messages. If the network transmission is electrical (rather than optical fibre, say) then it would in principle be possible to use the electrical simulator to model the message, but this would mean running the simulator at the level of each bit of the message. That is not feasible, as it would be extremely slow and no useful additional information would be gained from such detailed modelling of message passing.

An abstraction was therefore required that allows the networked components to communicate at a behavioural level. This has been implemented in a way that ensures the message passing is consistent with the structural model of the system (so that there is a physical medium for transmission of the message).
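
As an illustration only, the following Python sketch shows the idea of delivering a whole message as a single behavioural event while still checking the structural model for a physical path. The class and function names (Network, Component, send) and the bus and component names are hypothetical and are not taken from AutoSteve.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical structural model: which components are wired to which bus."""
    links: dict = field(default_factory=dict)  # bus name -> set of component names

    def connect(self, bus, component):
        self.links.setdefault(bus, set()).add(component)

    def shares_bus(self, sender, receiver):
        # A message is only deliverable if some bus physically joins both ends.
        return any(sender in nodes and receiver in nodes
                   for nodes in self.links.values())


@dataclass
class Component:
    name: str
    inbox: list = field(default_factory=list)


def send(network, sender, receiver, message):
    """Deliver a whole message as one behavioural event (no bit-level simulation)."""
    if not network.shares_bus(sender.name, receiver.name):
        return False  # no physical medium, so the message cannot arrive
    receiver.inbox.append((sender.name, message))
    return True


net = Network()
net.connect("CAN1", "switch_ecu")
net.connect("CAN1", "lamp_ecu")
switch_ecu = Component("switch_ecu")
lamp_ecu = Component("lamp_ecu")
radio = Component("radio")  # deliberately not connected to any bus

delivered = send(net, switch_ecu, lamp_ecu, "HEADLAMPS_ON")
blocked = send(net, switch_ecu, radio, "HEADLAMPS_ON")
```

In a sketch of this kind, a structural failure such as a broken bus wire would be represented by removing a component from the bus, after which the same send call would fail without any change to the behavioural models.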

 

Two simulators work together to execute three categories of model. The existing electrical and state chart based simulators handle the electrical and component behaviours from earlier versions of AutoSteve, and now also the state based models of the network interaction between components. Simulating systems with significant software based components showed that closer co-ordination between these simulators was needed. One important example arose in modelling collisions between network messages. It was found that if one of the messages was generated from a part of the system that made use of electrical simulation (for example, to connect a switch to an Electronic Control Unit or ECU) then the collision was not simulated. The simulators had no common notion of time: although the electrical changes could be regarded as instantaneous, this did not mean they took place in the same qualitative time slot as the instantaneous state chart events. This led to an architecture for mixed simulation in which a top level “manager” co-ordinates the use and timing of the individual simulators. In this case these are the electrical and state based simulators, but the idea could readily be used with simulators for other domains (such as hydraulic).
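
The role of the manager can be sketched in Python as follows. This is a minimal illustration, assuming that each simulator exposes a step method that reports the bus messages it produces in a given qualitative time slot; the classes ElectricalSim, StateChartSim and Manager, and the message names, are invented for the example and do not describe the actual AutoSteve architecture.

```python
class ElectricalSim:
    """Stand-in for the electrical simulator: switch closures are instantaneous."""
    def __init__(self, closed_switches):
        self.closed_switches = closed_switches  # switch name -> slot in which it closes

    def step(self, slot):
        # Report a bus message for every switch that closes in this time slot.
        return [("CAN1", f"{sw}_pressed")
                for sw, t in self.closed_switches.items() if t == slot]


class StateChartSim:
    """Stand-in for the state chart simulator: emits scheduled messages."""
    def __init__(self, scheduled):
        self.scheduled = scheduled  # list of (slot, bus, message)

    def step(self, slot):
        return [(bus, msg) for t, bus, msg in self.scheduled if t == slot]


class Manager:
    """Top level manager that gives both simulators the same notion of time."""
    def __init__(self, simulators):
        self.simulators = simulators

    def run(self, slots):
        collisions = []
        for slot in range(slots):
            per_bus = {}
            # Every simulator is stepped in the same slot before results are compared.
            for sim in self.simulators:
                for bus, msg in sim.step(slot):
                    per_bus.setdefault(bus, []).append(msg)
            for bus, msgs in per_bus.items():
                if len(msgs) > 1:
                    collisions.append((slot, bus, sorted(msgs)))
        return collisions


manager = Manager([
    ElectricalSim({"wiper": 2}),
    StateChartSim([(2, "CAN1", "speed_reading"), (3, "CAN1", "speed_reading")]),
])
collisions = manager.run(slots=5)
```

Because the manager owns the time slot, a message that originates in the electrical simulator and one that originates in the state chart simulator are seen together, so the collision in slot 2 is detected.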

 

With network protocols such as Controller Area Network (CAN), where collisions between messages occur, some messages will be delayed because they have to be retransmitted after losing out in a collision. The probability of this occurring depends on how heavily the network is being used and, in CAN, on the individual message’s priority level. Tools exist to develop probabilistic models for networks. However, these cannot predict the end effect of a delayed message because they consider the network in isolation. Instead, the approach taken was to model late arrival of a network message as a component failure mode (in the same way as any other). This simplifies simulation and allows design analysis of systems that incorporate networks to be done early in the design process, independent of knowledge of the rest of the network system.   The engineer can then use the FMEA information, together with the detailed protocol specific network modelling tools, to ensure that message priority is high enough for critical messages and that network loading does not cause excessive delays.
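
The following Python sketch illustrates what treating late arrival as an ordinary failure mode might look like in an FMEA loop. The toy model, the failure mode names and all of the delay figures are assumptions made for the example; they are not drawn from the project case studies.

```python
def simulate(failure=None):
    """Toy model: a switch ECU sends a message and the lamp ECU lights the lamp.

    Returns the time in ms at which the lamp lights, or None if it never does.
    All delays are invented for illustration.
    """
    send_time = 0
    transit = 5
    if failure == "message_late":
        transit += 200  # retransmission delay treated as a failure mode
    if failure in ("message_lost", "lamp_open_circuit"):
        return None
    return send_time + transit


# Late arrival sits in the same list as any other component failure mode.
FAILURE_MODES = ["message_late", "message_lost", "lamp_open_circuit"]


def fmea():
    """Compare each failure mode with nominal behaviour and describe the effect."""
    nominal = simulate()
    report = {}
    for mode in FAILURE_MODES:
        outcome = simulate(mode)
        if outcome is None:
            report[mode] = "lamp never lit"
        elif outcome > nominal:
            report[mode] = f"lamp lit {outcome - nominal} ms late"
        else:
            report[mode] = "no effect"
    return report


report = fmea()
```

The point of the sketch is that no model of network loading is needed to obtain the end effect of a late message; the loading analysis can be done separately once the critical messages are known.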

 

Improved function modelling

 

One area that was found to require more investigation than had originally been anticipated was the need for the design analysis tool to report failures of system functions that did not result in a new system state after a step in the simulation. For example, in a car’s lighting system switching the lamps on results (or should result) in a new system state that is maintained until further input causes the state to change again. However, many modern systems’ functionality depends on behaviour that ends with the system in the same state as it was before (that is, with the expected tasks completed). In the original design analysis tool, the simulation is run until the system settles into a steady state. The textual report is then generated, based on the state (or outputs) of the system associated with steady state.

 

This is adequate for simple cases. However, the increasing use of software allows the incorporation of more temporally complex behaviours into electrical systems, as was shown by one of the case studies (a seat belt minder system) that lit a warning lamp and intermittently sounded a chimer several times before falling silent. Similar cases are found in any system where a sensor periodically sends its readings to a control unit. 

 

Functional models are an important element in many design analysis tools since they allow large quantities of low level simulation data to be organised and presented at a level of abstraction that an engineer can understand. The project has produced developments in functional modelling research and there are several further contributions to be published in the near future.  These ideas have been included in the functional modelling language developed at Aberystwyth, providing enhancements in two main areas: functions that represent complex behavioural sequences, and interpretation of functions that include timing constraints. The utility of the concepts has been demonstrated on the relevant project case studies. 

 

Sequences

 

The original language used for interpretation of system behaviour (the “functional labelling” of the system) used logical operators AND, OR, NOT and XOR to allow a system function to be decomposed into individual component outputs. The language has been extended with new operators to allow intermittent and sequential behaviours to be described[1]. These operators are known as SEQ and L-SEQ. SEQ specifies cases where the succeeding system state should immediately follow the preceding state; L-SEQ specifies cases where the succeeding state follows the preceding state but the system might enter other states between those specified. These operators are related to the O and F operators found in some temporal logics. It is also necessary to specify cases where the system enters a cycle (so the simulation step is only terminated by some other input), and additional operators CYCLE and END-CYCLE have been introduced. The END-CYCLE operator is used to label the last state in a cycle, after which the system is expected to return to the state specified after the CYCLE operator. Individual system states that make up the cycle are specified using the SEQ or L-SEQ operators.
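
The difference between SEQ and L-SEQ can be illustrated with a short Python sketch that checks a recorded trace of system states. This is one possible reading of the two operators as described above, not the implementation used in the functional modelling language; the function names, the state representation (a dictionary of component outputs) and the example trace are assumptions.

```python
def matches_seq(trace, first, second):
    """SEQ: `second` must hold in the state immediately after one where `first` holds."""
    return any(first(a) and second(b) for a, b in zip(trace, trace[1:]))


def matches_lseq(trace, first, second):
    """L-SEQ: `second` must hold in some later state; other states may intervene."""
    for i, state in enumerate(trace):
        if first(state) and any(second(later) for later in trace[i + 1:]):
            return True
    return False


# A trace loosely modelled on a seat belt reminder: lamp lit, chimer intermittent.
trace = [
    {"lamp": "on", "chime": "on"},
    {"lamp": "on", "chime": "off"},
    {"lamp": "on", "chime": "on"},
    {"lamp": "off", "chime": "off"},
]


def chiming(s):
    return s["chime"] == "on"


def silent_lit(s):
    return s["lamp"] == "on" and s["chime"] == "off"


def all_off(s):
    return s["lamp"] == "off" and s["chime"] == "off"


seq_ok = matches_seq(trace, chiming, silent_lit)       # adjacent states
seq_fails = matches_seq(trace, silent_lit, all_off)    # another state intervenes
lseq_ok = matches_lseq(trace, silent_lit, all_off)     # intervening state allowed
```

Here the pair (silent_lit, all_off) fails under SEQ because a chiming state lies between them, but is accepted under L-SEQ.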

 

In addition to the need for these extensions to the functional labelling, the modelling of such behaviours means that the simulator itself (or the simulation manager) must both provide a description of the successive system states for interpretation and also be able to recognise that the behaviour is in a cycle. This has led to further refinement of the design analysis architecture, which now includes a simulation manager that assembles the description of the system states. This can then be separated from the interpretation module, which abstracts the description of system states (expressed as a list of the states of the components). The separation of elements of the design analysis tool (the simulation manager from the simulators and both from the report generator) should allow alternative elements to be inserted, such as different domain based simulators and different task based interpretation (or report generation) modules.
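
A minimal Python sketch of how a simulation manager might assemble the successive states and recognise a cycle is given below. It assumes that system states can be compared and hashed and that a step function gives the next state; the function name and the two toy systems are hypothetical.

```python
def run_until_steady_or_cycle(step, initial, max_steps=100):
    """Advance a system until it stops changing or revisits an earlier state.

    `step` is a hypothetical function mapping one system state to the next.
    Returns (trace, outcome, cycle_start); cycle_start is the index in the
    trace at which the cycle begins, or None if no cycle was found.
    """
    trace = [initial]
    seen = {initial: 0}  # state -> index of its first appearance in the trace
    for _ in range(max_steps):
        nxt = step(trace[-1])
        if nxt == trace[-1]:
            return trace, "steady", None
        if nxt in seen:
            return trace, "cycle", seen[nxt]
        seen[nxt] = len(trace)
        trace.append(nxt)
    return trace, "unresolved", None


def flasher(state):
    # Toy system that alternates for ever until some other input arrives.
    return "off" if state == "on" else "on"


def latch(state):
    # Toy system that settles in a steady state.
    return "on"


flasher_result = run_until_steady_or_cycle(flasher, "on")
latch_result = run_until_steady_or_cycle(latch, "off")
```

The trace returned is the description of successive states that an interpretation module would then abstract, so the two concerns remain separate.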

 

 

 

Timing constraints

 

A second aspect of increased functional complexity is the possibility of functions being achieved correctly, but not in time. The use of networks can lead to this problem, especially with protocols such as the widely used CAN and similar protocols in which messages might collide and the higher priority message will be transmitted, with later retransmission of the lower priority one.

 

An extension to the existing functional labelling language has been developed to allow the late (or early) achievement of expected system functions to be identified, and to be done in such a way that these failures are distinguished from the failure of the system function to be achieved at all. The concept of “temporal constraints” is introduced to refine the link between the system states (or outputs) and achievement of the required system function. These will typically be a deadline before which the goal state is to be achieved (such as “(left headlamp dipped and right headlamp dipped) before 100 milliseconds”) but could also be a time before which the state should not be achieved. This is necessary where sequential behaviours are described, to ensure that each stage in the sequence lasts a sufficient length of time. The entry to a system state can be associated with a before time (before which it must be achieved) and / or an after time (before which it should not be achieved). 
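
The distinction between late, early and absent achievement can be sketched in Python as follows. The function name, the result labels and the treatment of the exact boundary times (a state reached exactly at the deadline is counted as on time) are assumptions made for the example.

```python
def classify(achieved_at, before=None, after=None):
    """Classify achievement of a function against optional timing constraints.

    achieved_at: time in ms when the goal state was reached, or None if never.
    before: deadline in ms by which the state must be reached.
    after: earliest time in ms at which the state may be reached.
    """
    if achieved_at is None:
        return "not achieved"
    if after is not None and achieved_at < after:
        return "achieved early"
    if before is not None and achieved_at > before:
        return "achieved late"
    return "achieved"


# "(left headlamp dipped and right headlamp dipped) before 100 milliseconds"
on_time = classify(40, before=100)
late = classify(250, before=100)
never = classify(None, before=100)
# A stage in a sequence that must not be entered before 50 ms
early = classify(10, after=50)
```

Reporting "achieved late" separately from "not achieved" is what allows an FMEA report to distinguish a delayed network message from a lost one.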

 

 

Software FMEA

One valuable development that has sprung from the project as an alternative to the high level modelling of software behaviour is an approach to carrying out FMEA on the software itself[2]. This uses a static analysis of the software, which might be obtained either from the actual source code or from a graphical model (such as might be created using MATLAB), to trace the paths of possible erroneous values through the software to show what parts of the program (and so the system of which it is a part) will be affected by such errors. Manual approaches to this type of analysis of embedded software can be found in the literature and prove to be too costly to perform on all but the most critical (avionics) systems. It is the extension of the technologies present in this work that might provide the exciting opportunity to automate the analysis and make it available to a wider range of software applications development. The technique not only allows FMEA of software but can also be used to check for any differences of fault impact between the intended design and the actual code. 
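
The idea of tracing possible erroneous values through software can be illustrated with a simple reachability search over data dependencies. The Python sketch below is a generic illustration and is not the technique of [2]; the dependency map, the variable names and the function affected_by are invented for the example.

```python
from collections import deque


def affected_by(dependencies, faulty):
    """Return every variable that could carry an erroneous value if `faulty` is wrong.

    `dependencies` maps each variable to the variables read when computing it.
    """
    # Invert the map: for each variable, which variables are computed from it?
    users = {}
    for target, sources in dependencies.items():
        for source in sources:
            users.setdefault(source, set()).add(target)

    # Breadth-first search along the "is used by" relation.
    reached, queue = set(), deque([faulty])
    while queue:
        var = queue.popleft()
        for user in users.get(var, ()):
            if user not in reached:
                reached.add(user)
                queue.append(user)
    return reached


deps = {
    "speed": ["wheel_sensor"],
    "warn_lamp": ["speed", "belt_sensor"],
    "chime": ["warn_lamp"],
    "radio_volume": ["volume_knob"],
}
impact = affected_by(deps, "wheel_sensor")
```

Running the same search on a dependency map taken from the design model and on one extracted from the code would give two impact sets whose difference indicates where the implementation departs from the intended design.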

 

A novel design verification technique

A side benefit of the project is the opportunity it has given for the investigation of a novel design analysis (verification) tool where all the qualitative system behaviours (both expected and unexpected) can be generated directly from the system design and automatically simplified[3].  Behaviours can be explored and presented visually to the designer from a functional perspective so as to help them understand the consequences of the way they have implemented the system’s functionality.  Unforeseen operational scenarios are a problem exacerbated by the ease with which software can be used to introduce system states. The result is unexpected or inconsistent states that are difficult to find, and subsequent ‘system features’ or malfunctions. The method has the potential to provide broad coverage of the system behaviour and can be used to identify problems that testing might not have located.  We are negotiating with Mentor Graphics to create an implementation of the technique based around their Capital Analysis toolset.

 

Project plan review

 

All of the activities in the original work plan were carried out in full with the exception of the proposed investigation into automatic generation of fault mitigation software. It became clear from our investigations into the use of fault tolerant systems in cars (which are becoming more important with the introduction of so called “drive by wire” systems) that there is no common approach to fault tolerant behaviour. Each system is designed with its own specific approach, such as responding to a sensor failure by reverting to a “limp home” mode of reduced functionality (such as reduced engine efficiency or performance) and having a warning function so the driver is aware of the loss of functionality. This does not lend itself to automatic generation of software because the fault tolerant behaviour is very specific to the system: the design must capture the designer’s knowledge of the likely consequences of a given failure. 

 

The project did lead to two closely related areas of research not explicitly in the plan.  The first is a technique that can assist an engineer to verify that the behaviour of a system does not contain unexpected states or functionality. The second was an investigation into the potential for a failure analysis of the software itself.  Papers have been published in both of these areas and future work is anticipated in both.

 

Research impact and benefits to society

 

The collaborators have all benefited from the project as follows:

 

FirstEarth: FirstEarth has incorporated facilities for carrying out failure analysis of systems that incorporate software and network components into their design analysis tool, AutoSteve. This increase in the capability of their design analysis tool has consolidated its position as a world leader in the field of automated design analysis. FirstEarth has recently been acquired by Mentor Graphics and the design analysis tool has been integrated with their electrical CAD tool, Capital Logic, under the name Capital Design. This means that users of this tool have the facility to conduct automated design analysis as a part of the design process, giving Capital Logic a competitive advantage in that market.

 

Ford Motor Company: Since the incorporation of the advances made by the project into AutoSteve, Ford have been able to use the improved tool to undertake design analysis of an increased variety of electrical systems in a timely and cost-effective manner, leading to savings in development time and costs for systems that include software and network components.

 

Motor Industry Research Association: MIRA have also benefited from the availability of the improved design analysis tool by being able to undertake automated design analysis of an increased range of systems[4].

 

The improvements to the FirstEarth (now Mentor) design analysis tool mean that it is keeping pace with the increased use of software and networking technology in the automotive sector and also in other areas, such as aerospace, where such mixed electrical and software systems are in use.

 

Explanation of expenditure

 

Staff

The only post funded by the project was for a postgraduate research associate. This post was filled by Jonathan Bell for the whole duration of the project.

 

Travel and subsistence, Equipment, Consumables

Expenditure on these was in line with the original proposal.

 

Further research and dissemination issues

 

There are two main avenues for further research that arise from the SoftFMEA project. The first of these is a fuller investigation of the approach to FMEA of software, on which initial investigation has already been done. It is expected that this will lead to the development of a tool for automated analysis of embedded software. This is an increasingly important area and an EPSRC proposal is being prepared to develop the research and tools in this area in collaboration with Ford, MIRA and relevant software companies.

 

The work on a design verification tool is also still at an early stage but has clear industrial application. While we can generate a map of states that a system might enter (simplified by the omission of extraneous information) we believe there is important research to be carried out, both on how the results are presented to the user to aid understanding and on how to compare the results of this map of states with descriptions of the intended behaviour of the system. It is expected that the improvements to the functional modelling language that have resulted from the present project will be of value in this area.  The qualitative simulation tools now owned by Mentor Graphics are an important foundation for this technique and we are in communication with Mentor regarding the possibility of implementing a proof of concept for this work within their tool suite.

 

Another area of research where the new functional language is expected to be of value is in its use for diagnosis. We have done some work on abstract modelling for diagnosis and this work is expected to lead to the development of a tool for assistance in finding likely causes of a system failure whose symptoms are observed.

 

In addition to the papers and conference contributions documented in the IGR part 1, further papers are planned dealing with the untimely achievement of system functions, the architecture for the integration of different simulators (such as electrical and software) and the combination of these with the side of the design analysis tool that manages the interpretation of the simulators’ results and generation of the design analysis report.  The project RA (Jon Bell) is also expected to submit his PhD thesis within the next 6 months, based on work inspired by the project.  Further technical details of the work carried out on the project can be found in the 29 technical reports available on the project web page below.

http://www.aber.ac.uk/compsci/Research/mbsg/fmeaprojects/softFMEA.shtml

 

Referenced Publications

[1] J. Bell, N. Snooke, Describing System Functions that Depend on Intermittent and Sequential Behavior, Proceedings QR-04, pp.51-57, 2004.

[2] N. A. Snooke, Model-Based Failure Modes and Effects Analysis of Software, Procs DX04 pp. 221-226, Carcassonne, France, 23-25 June, 2004 

[3] N.A.Snooke and J.Bell, Abstracting Automotive System Models from Component-based Simulation with Multi Level Behaviour, Sixteenth International Workshop on Qualitative Reasoning pp. 151-160, ISBN 84-95499-60-6, June 10-12 2002, Barcelona Spain.

[4] D. Ward and C. J. Price, System functional safety through automated electrical design analysis, SAE 2001 Transactions, Journal of Passenger  Cars, Section 7 - vol 110: Electronic and electrical systems, pp341-347.


Additional Publications

 C.J.Price and P.Struss, Model based Systems in the Automotive Industry, AI Magazine Special Issue on Qualitative Reasoning, American Association for Artificial Intelligence Press, Winter 2004.

C.J.Price, N.A.Snooke, S.D.Lewis Adaptable Modelling of Electrical Systems, QR2003 - 17th International Workshop on Qualitative Reasoning. pp.147-153, eds Paulo Salles and Bert Bredeweg, Brasilia, Brazil, 20-22 August 2003.

Snooke, N A, Price, C J, & Ellis, D, Whole Lifecycle Electrical Design Analysis. In: Foresight Vehicle, Paper Number 02FCC108, SAE Future Car Congress, Washington D.C., June 2002.

N.A.Snooke and R.Shipman, Generating Automotive Electrical System Models from Component Based Qualitative Simulation, Expert Systems 2001, 11-12 Dec 2001, Cambridge UK, Research and Development in intelligent Systems XVIII, pp.100-114, ISBN 1-85233-535-1 (Springer).

C. J. Price and N. S. Taylor, Automated Multiple Failure FMEA, Reliability Engineering and System Safety 76(1), pp1-10, 2002.

C. J. Price, Incremental automated diagnostics, Procs AAAI Spring Symposium on Information Refinement and Revision for Decision Making: Modeling for Diagnostics, Prognostics, and Prediction, Palo Alto, March 2002. 

C. J. Price, N. Hughes, Effective Automated Sneak Circuit Analysis, Procs Annual Reliability and Maintainability Symposium, pp356-360, Seattle, January 2002.