Testability.com
Reference
The Information Source for Systems Testability and Diagnostics

Diagnostics, Testability and Diagnosability Terminology

Ambiguity Group – A collection of functions or failure modes for which diagnostics can detect a fault and can isolate the fault to that collection, yet cannot further isolate the fault to any subset of the collection.

Ambiguity Group Size – The number of unique repair items that are associated with the functions and failure modes that comprise a given ambiguity group.
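
As a rough illustration of the two entries above, the sketch below groups failure modes that produce the same test signature and counts the distinct repair items behind one group. The failure-mode names, the `SIGNATURES` table and the `REPAIR_ITEM` mapping are hypothetical, and the sketch assumes only one fault is present at a time.

```python
from collections import defaultdict

# Hypothetical single-fault signatures: results of (test_1, test_2).
SIGNATURES = {
    "pump_seized": ("FAIL", "FAIL"),
    "pump_leak": ("FAIL", "FAIL"),
    "valve_stuck": ("FAIL", "FAIL"),
    "sensor_drift": ("PASS", "FAIL"),
    "lamp_open": ("PASS", "PASS"),  # never detected by these tests
}

# Hypothetical mapping from failure mode to the item replaced to fix it.
REPAIR_ITEM = {
    "pump_seized": "pump",
    "pump_leak": "pump",
    "valve_stuck": "valve",
    "sensor_drift": "sensor",
    "lamp_open": "lamp",
}


def ambiguity_groups(signatures):
    """Group detected failure modes that share an identical signature."""
    groups = defaultdict(set)
    for failure_mode, signature in signatures.items():
        if "FAIL" in signature:  # undetected modes form no ambiguity group
            groups[signature].add(failure_mode)
    return dict(groups)


def ambiguity_group_size(group, repair_item):
    """Count the unique repair items behind the members of one group."""
    return len({repair_item[failure_mode] for failure_mode in group})


groups = ambiguity_groups(SIGNATURES)
largest = groups[("FAIL", "FAIL")]
print(sorted(largest), ambiguity_group_size(largest, REPAIR_ITEM))
```

In this made-up data three failure modes share one signature, but they map to only two repair items (the pump and the valve), so the ambiguity group size is 2.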

Automatic Self-Test – Self-test to that degree of fault detection and isolation which can be achieved entirely under computer control, without human intervention.

Automatic Test – That performance assessment, fault detection, diagnosis, isolation, and prognosis which is performed with a minimum of reliance on human intervention. (MIL-STD-1309C)

Automatic Test Equipment (ATE) – Computer-driven hardware that has been designed to test integrated circuits. Sometimes referred to as "the tester."

Benign Failure – A failure whose severity is slight enough to be outweighed by the advantages to be gained by normal use of the system.

Black Box – A system or component whose inputs, outputs, and general function are known but whose contents or implementation are unknown or irrelevant. (See also white box, grey box, and glass box).

Boundary Scan – A technique for testing dense electronic circuits that creates "virtual test points" on a board that has no physical test-point access built in. It was created by the Joint Test Action Group (JTAG), whose name is often used synonymously with Boundary Scan, and was adopted as standard IEEE 1149.1.

Built-In Self-Test (BIST) – Tiny tester models that allow an integrated circuit to test itself.

Built-In Test (BIT) – An internal automatic or semiautomatic feature in a system or subsystem designed to detect, diagnose or isolate system failures by interrogating a system or monitoring system performance.

Built-In Test Equipment (BITE) – Any device which is part of an equipment or system and is used for the express purpose of testing the equipment or system (MIL-STD-1388-1). Hardware which is identifiable as performing the built-in test function; a subset of BIT (MIL-STD-2165).

Common Cause – An event or mechanism that can cause two or more failures or failure indications to occur simultaneously.

Common Cause Isolation – Also called Single Fault Isolation. A method of fault isolation that proceeds under the assumption that all "failed" tests in a diagnostic session are due to a single malfunction or fault. Although frequently used as the basis for diagnostic analyses (such as testability) during the design phase, common cause isolation cannot consistently produce accurate results when multiple components are simultaneously malfunctioning during diagnosis. The degree to which diagnostic accuracy suffers depends on the likelihood of multiple failures (taking into consideration the possibility of dependent failures and the length of the interval between diagnostic sessions) and the extent to which single failures can produce the same test signature as multiple failures. (Compare with Multiple-Failure Isolation).

Common Mode Failure – A failure of apparently independent components or communication links due to an initiating event that affects them all.

Design for Test (DFT) – The practice of adding hardware hooks to integrated circuits in order to facilitate effective, inexpensive testing.

Design for Testability – The aspects of the product design process whose goal is to ensure that the testability of the end product is competently and sufficiently developed.

Detection Mechanism – The means or methods by which a failure can be discovered by an operator under normal system operation or can be discovered by the maintenance crew by some diagnostic action.

Diagnosis – The process of identifying the nature or cause of a phenomenon.

Diagnostic Accuracy – A measure of the ability of a diagnostic strategy or procedure to detect and isolate failures correctly across all of the potential fault combinations for a given system, device or process. Diagnostic Accuracy—which takes into consideration the possibility of simultaneously malfunctioning components—is expressed as the probability-weighted ratio of fault combinations for which all faults can be detected and isolated to fault groups that contain them, to the total number of possible fault combinations. (Compare with Isolation Accuracy)

Diagnostic Ambiguity – The situation that arises when diagnostics are able to detect a fault, but the smallest fault group to which that fault can be isolated contains more than one repair item.

Diagnostic Coverage – The ratio of failures detected (by a given test program, test procedure, or set of tests) to the entire theoretically-detectable failure population, expressed as a probability-weighted percentage. Also called Fault Coverage or Fault Detection Coverage.
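
A minimal sketch of the probability-weighted ratio described above, assuming each failure mode is weighted by its relative failure rate. The function name `fault_coverage`, the failure-mode names and the rates are hypothetical.

```python
def fault_coverage(failure_rates, detected):
    """Return probability-weighted coverage as a percentage.

    failure_rates maps each detectable failure mode to its relative
    failure rate; detected is the set of modes the tests can catch.
    """
    total = sum(failure_rates.values())
    if total <= 0:
        raise ValueError("failure rates must sum to a positive value")
    covered = sum(rate for mode, rate in failure_rates.items()
                  if mode in detected)
    return 100.0 * covered / total


# Hypothetical failure rates (failures per million hours).
RATES = {"short": 50, "open": 30, "drift": 20}
print(fault_coverage(RATES, detected={"short", "drift"}))
```

With these numbers two of the three failure modes are detected, yet the weighted coverage is 70% rather than 67%, because the undetected mode accounts for 30% of the expected failures.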

Diagnostic Engineering – The engineering discipline through which the diagnostic capability of a system or device is developed, assessed and optimized.

Diagnostic Health Monitoring – A method of performing diagnostics in which status data (sensor readings, test results, etc.) gathered from a health management system (e.g., IVHM) is used as a batch to "seed" a dynamic diagnostic session, to automatically respond to tests in a static diagnostic procedure, or to develop a customized diagnostic procedure to be applied to the given fault combination.

Diagnostic Inference Engine – An inference engine that draws diagnostic conclusions.

Diagnostic Procedure – A structured combination of tasks, tests, observations, and other information used to localize a fault or faults.

Diagnostic Reasoner – A reasoning system used to perform diagnostics.

Diagnostic Session – The single application of a diagnostic procedure or diagnostic reasoner to diagnose a fault. (See also iterative diagnostics).

Diagnostic Strategy – The computational logic associated with making automated diagnostic assessments from a set of observations or data.

Diagnostics – The process of correlating the results of multiple tests to determine overall system status and generating hypotheses for maintenance and/or remediation.

Dynamic Diagnostics – Diagnostics in which the order of testing is determined during run time based on the current mission objectives, component MTTFs, and available resources (test equipment, personnel). Fault groups are also developed dynamically, with the diagnostics interpreting each test as it is performed. Although dynamic diagnostics offer great flexibility and efficiency, they can only be verified inductively. (Compare with Static Diagnostics and Semi-Dynamic Diagnostics).

Embedded Diagnostics – Any portion of the system's diagnostic capability that is an integral part of the prime system or support system.

External Diagnostics – Any portion of the entire system's diagnostic capability that is not embedded.

Failure – The loss of ability of a system, device or process to perform a required function. The manifestation of a fault. (See also Hardware Failure and Software Failure).

Failure Latency – The elapsed time between fault occurrence and failure indication. (MIL-STD-2165)

Failure Mode – The characteristic manner in which a failure occurs. Within a failure mode diagnostic model, failure modes represent specific ways in which a system, device or process can fail. (Also see Object Failure Mode).

False Alarm – A warning reported by a diagnostic or health management system indicating the existence of an operational fault when that fault does not exist. False alarms can be reduced or sometimes completely eliminated by a full and accurate diagnostic analysis and validated run-time diagnostics / health management process.

False Removal – The removal of a good part suspected as being defective due to inconclusive diagnostics (e.g., diagnostic ambiguity), inaccurate diagnostic information, inefficient IETM information, or inadequate maintainer training. False removals contribute to high spares consumption, high turn-around times, low operational availability, and high RTOK rates.

Fault Combination – The set of faults that exist at the time of a diagnostic session.

Fault Coverage – The ratio of failures detected (by a given test program, test procedure, or set of tests) to the entire theoretically-detectable failure population, expressed as a probability-weighted percentage. Also called Diagnostic Coverage or Fault Detection Coverage.

Fault Detection – The process of identifying and reporting the presence of one or more faults within a system, device or process.

Fault Detection Coverage – The ratio of failures detected (by a given test program, test procedure, or set of tests) to the entire theoretically-detectable failure population, expressed as a probability-weighted percentage. Also called Diagnostic Coverage or Fault Coverage.

Fault Detection (FD) Metrics – Metrics that quantify the likelihood of a fault being detected, including Fault Coverage.

Fault Detection, Isolation and Remediation (FDIR) Metrics – Metrics that quantify the likelihood that a fault will be detected, isolated and remediated.

Fault Group – A set of suspected faults (in terms of functions or failure modes) at some stage of a diagnostic session. (See also Isolated Fault Group)

Fault Injection – A technique whereby the effectiveness of tests and diagnostics can be assessed by creating a simulated fault in a design and seeing the effect of that simulated fault on circuit outputs, test results, and diagnostic conclusions.

Fault Isolation – The process of determining the location of a fault to the extent necessary to effect repair. (MIL-STD-721C)

Fault Isolation (FI) Metrics – Metrics that quantify the likelihood of a fault being isolated to a single component.

Fault Localization – The process of determining the approximate location of a fault. (MIL-STD-721C)

Fault Resolution – The process of resolving a fault, taking into account both the diagnostics (testing) and maintenance philosophy (replacement).

Fault Resolution (FR) Metrics – Metrics that quantify the likelihood of a fault being resolved with a single replacement.
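
One plausible reading of the Fault Isolation and Fault Resolution metrics is sketched below. It assumes a single fault at a time, that every ambiguity group is non-empty, and that the first item replaced in a group is the one with the highest failure rate. The function name `isolation_and_resolution` and the data are hypothetical.

```python
def isolation_and_resolution(groups):
    """Return (FI, FR) as failure-rate-weighted fractions.

    groups is a list of ambiguity groups, each a dict that maps a
    repair item to its failure rate.
    FI: share of failures isolated to a group holding a single item.
    FR: share of failures fixed by the first replacement, assuming the
        most failure-prone item in the group is replaced first.
    """
    total = sum(sum(group.values()) for group in groups)
    isolated = sum(sum(group.values()) for group in groups
                   if len(group) == 1)
    resolved = sum(max(group.values()) for group in groups)
    return isolated / total, resolved / total


# Hypothetical ambiguity groups with failure rates per repair item.
GROUPS = [{"A": 40}, {"B": 30, "C": 20, "D": 10}]
fi, fr = isolation_and_resolution(GROUPS)
print(fi, fr)
```

The resolution figure is higher than the isolation figure because prioritized replacement fixes some faults on the first attempt even when testing alone leaves several candidates.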

Functional Testing – A testing methodology that focuses on the expected functionality of the item being tested (as opposed to focusing on failures reported by BIT). Also known as behavioral testing.

Glass Box – A system or component whose purpose or function is fully disclosed, yet for which implementation details are not available. Offers the diagnostic visibility of a white box, with the design protection of a black box. (See also grey box). In some arenas, the term glass box is used synonymously with white box.

Grey Box – A system or component for which some, but not all, knowledge of how it behaves (beyond the interface) has been disclosed. A grey box stands in the middle ground between black boxes and white boxes. (See also glass box). 

Hardware Failure – The inability of the equipment to perform its expected functions due to a condition caused by operational, maintenance, physical or other environments. (Compare with Software Failure).

Health Management (HM) – The capability of monitoring real-time sensors to determine the health and performance of a system, subsystem, device or process. It may or may not be hosted on the system being monitored.

Hierarchical Diagnostic Inference – An inference about a function based on its relationship to functions at another indenture level. For example, if a function of an assembly is determined to be fully operational, then all child functions on components operating within that assembly can be inferred to be fully operational. Conversely, a function on an assembly can be inferred to be fully operational when all child functions on components within that assembly have been determined to be fully operational.
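
The two directions of inference in this entry can be sketched as follows. The function names and the `CHILDREN` table (an assembly function mapped to the component functions beneath it) are hypothetical.

```python
# Hypothetical indenture structure: assembly function -> child functions.
CHILDREN = {
    "assembly.deliver_power": ["psu.convert_ac", "regulator.hold_5v"],
}


def infer_children_good(parent, known_good, children=CHILDREN):
    """Downward inference: a good parent function implies good children."""
    if parent in known_good:
        return set(children.get(parent, []))
    return set()


def infer_parent_good(parent, known_good, children=CHILDREN):
    """Upward inference: the parent is good once every child is good."""
    kids = children.get(parent, [])
    return bool(kids) and all(kid in known_good for kid in kids)


print(infer_children_good("assembly.deliver_power",
                          {"assembly.deliver_power"}))
print(infer_parent_good("assembly.deliver_power", {"psu.convert_ac"}))
```

The second call returns False because only one of the two child functions has been proven good, so nothing can yet be concluded about the assembly-level function.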

Hybrid Diagnostic Inference – An inference about a function based on its relationship to failure modes that affect that function. Likewise, an inference about a failure mode based on its relationship to functions affected by that failure mode. For example, if a function is determined to be fully operational, then all failure modes that always or exclusively affect that function can be inferred to be non-present. Conversely, when all failure modes that can possibly affect a function have been determined to be non-present, then that function can be inferred to be fully operational.

Inference Engine – The component of a reasoning system that draws conclusions based on available information and knowledge.

Integrated Diagnostics (ID) – A structured design and management process to achieve the maximum effectiveness of a system's diagnostic capability by considering and integrating all pertinent diagnostic elements. The process includes interfaces between design, engineering, testability, reliability, maintainability, human engineering, and logistic support analysis.

Integrated Electronic Technical Manual (IETM) – A tool providing information a technician needs to perform a maintenance action, including (but not restricted to) test and repair procedures, parts information and theory. The U.S. DoD has identified six classes of IETM.

Integrated Health Management – A health management process that has been integrated into the system or device whose health is being monitored/managed.

Integrated Vehicle Health Management (IVHM) – Vehicle health management. A term first used by NASA to indicate the integrated health management system on a space vehicle.

Isolated Fault Group – A fault group that represents the final results of a diagnostic session. In testability analysis, an isolated fault group is referred to as an ambiguity group.

Isolation Accuracy – A measure of the ability of a diagnostic strategy or procedure to isolate detected failures correctly across all of the potential fault combinations for a given system, device or process. Isolation Accuracy, which takes into consideration the possibility of simultaneously malfunctioning components, is expressed as the probability-weighted ratio of fault combinations for which all detected faults can be isolated to fault groups that contain them, to the total number of possible fault combinations containing at least one detected fault. (Compare with Diagnostic Accuracy)
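
The difference between Isolation Accuracy and Diagnostic Accuracy lies mainly in the denominator, which the sketch below tries to make concrete. It is one interpretation of the two definitions, not a reference implementation: the `Outcome` record, its field names and the probabilities are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Result of diagnosing one fault combination (hypothetical record)."""
    probability: float    # likelihood of this fault combination
    faults: frozenset     # faults actually present
    detected: frozenset   # faults the diagnostics detected
    isolated: frozenset   # union of the fault groups reported


def diagnostic_accuracy(outcomes):
    """Weighted share of ALL combinations fully detected and isolated."""
    total = sum(o.probability for o in outcomes)
    good = sum(o.probability for o in outcomes
               if o.faults <= o.detected and o.faults <= o.isolated)
    return good / total


def isolation_accuracy(outcomes):
    """Weighted share of combinations with a detected fault in which
    every detected fault lies inside the reported fault groups."""
    relevant = [o for o in outcomes if o.detected]
    total = sum(o.probability for o in relevant)
    good = sum(o.probability for o in relevant
               if o.detected <= o.isolated)
    return good / total


OUTCOMES = [
    Outcome(0.5, frozenset({"A"}), frozenset({"A"}), frozenset({"A"})),
    Outcome(0.25, frozenset({"B"}), frozenset(), frozenset()),
    Outcome(0.25, frozenset({"A", "C"}), frozenset({"A", "C"}),
            frozenset({"A"})),
]
print(diagnostic_accuracy(OUTCOMES), round(isolation_accuracy(OUTCOMES), 3))
```

The undetected fault "B" lowers the diagnostic accuracy but is excluded from the isolation accuracy, which only considers combinations in which something was detected.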

Iterative Diagnosis – The process of diagnosing a system by iteratively running a diagnostic engine and taking the appropriate corrective action for the fault identified by each iteration. Iterative diagnostics do not attempt to isolate all known failures with each iteration, but rather to isolate a single failure to the extent necessary to effect repair. 

Lambda Search – A method for prioritizing the replacement of repair items in an isolated ambiguity group, based on isolated failure probability. Testability analyses often include both fault isolation metrics (based on testing only, without the use of lambda search) and fault resolution metrics (which take into consideration both testing and serial replacement prioritized using a technique like lambda search).
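
A minimal sketch of this kind of prioritized replacement, assuming that the isolated failure probability of each repair item is proportional to its failure rate and that exactly one item in the group is faulty. The function names and the rates are hypothetical.

```python
def lambda_search_order(failure_rates):
    """Order repair items from most to least likely to have failed."""
    return sorted(failure_rates, key=failure_rates.get, reverse=True)


def expected_replacements(failure_rates):
    """Average number of replacements needed when following that order."""
    order = lambda_search_order(failure_rates)
    weighted = sum((position + 1) * failure_rates[item]
                   for position, item in enumerate(order))
    return weighted / sum(failure_rates.values())


# Hypothetical ambiguity group with failure rates per repair item.
GROUP = {"valve": 30, "sensor": 10, "pump": 60}
print(lambda_search_order(GROUP))    # ['pump', 'valve', 'sensor']
print(expected_replacements(GROUP))  # 1.5
```

Replacing the items in an arbitrary order would average 2 replacements for a three-item group with equal likelihoods, so the prioritized order is what lets fault resolution metrics improve on fault isolation metrics.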

Model-Based Diagnostics – Diagnostics performed using a Model-Based Reasoner. Not to be confused with diagnostics based on a Diagnostic Dependency Model.

Model-Based Reasoner (MBR) – An algorithm that creates diagnostic conclusions by comparing measured values against propositions created from a model that is dynamically adjusted to match changes to the system.

Multiple-Failure Isolation – A method of fault isolation that takes into consideration the possibility of multiple, simultaneous malfunctions during diagnostics. The likelihood of multiple faults existing during diagnostics is influenced by several criteria: the failure rates associated with individual faults, the likelihood of dependent failures, and the length of the time interval between diagnostic sessions. The diagnostic accuracy of procedures that incorporate multiple-failure isolation is generally higher than those based on common cause isolation. Although multiple-failure isolation can result in greater diagnostic ambiguity than does common cause isolation, the ambiguity groups that result from multiple-failure isolation often lend themselves to prioritized replacement, thereby negating or reducing the effect of multiple-failure isolation upon diagnostic performance.
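
The contrast with common cause (single fault) isolation can be sketched with a simple coverage model. The sketch assumes that a test fails whenever any item it covers is faulty and passes otherwise; the function name `suspects`, the test names and the `COVERAGE` table are hypothetical.

```python
def suspects(coverage, results, single_fault):
    """Return the fault group implied by a set of test results.

    coverage maps each test to the set of items it exercises;
    results maps each performed test to "PASS" or "FAIL".
    """
    # Items exercised by a passing test are cleared of suspicion.
    cleared = set().union(*(coverage[t] for t, r in results.items()
                            if r == "PASS"))
    failed = [coverage[t] - cleared for t, r in results.items()
              if r == "FAIL"]
    if not failed:
        return set()
    if single_fault:
        # One fault must explain every failed test.
        return set.intersection(*failed)
    # Any uncleared item behind any failed test remains suspect.
    return set.union(*failed)


COVERAGE = {"t1": {"A", "B"}, "t2": {"B", "C"}, "t3": {"C", "D"}}
RESULTS = {"t1": "FAIL", "t2": "FAIL", "t3": "PASS"}
print(suspects(COVERAGE, RESULTS, single_fault=True))
print(sorted(suspects(COVERAGE, RESULTS, single_fault=False)))
```

Under the single-fault assumption only "B" remains, while the multiple-failure view keeps both "A" and "B" as suspects. The larger group is more ambiguous, but it is not wrong if "A" and "B" have both failed.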

Off-line Testing – The testing of an item with the item removed from its normal operational environment. A testing approach that requires a system to be taken out of service while tests are performed.

On-line Testing – The testing of a system item using active processing to detect faults while the system as a whole continues to provide its normal services.

Prognostics, or Predictive Diagnostics – The process of predicting the occurrence of failures to a system, device or process based on predictable time domain failures (such as mechanical wear). This is in contrast to non-time domain, random failures that cannot be prognosed using known technology (such as electronic failures).

Prognostic Health Management (PHM) – An approach to Health Management that incorporates prognostic, as well as diagnostic, techniques.

Real Time Diagnostics – A type of diagnostic algorithm that generates diagnostic conclusions rapidly enough to provide an immediate benefit to a system during operation. Where algorithms become too complex for real-time response, predictive diagnostics can be used supplementally.

Reasoning System – A system that can combine elements of information and knowledge to draw conclusions. (IEEE 1232-2002)

Remediation – The act of correcting an operational failure through switching to a redundant system or a comparable function, or changing the operational characteristics (such as a mission plan). Maintenance of the failed functional item is deferred to a more suitable time (such as during ground support).

Remote Diagnostics – A type of external diagnostics where the diagnostic decision-making is done remotely from where the testing is performed.

Repair Item – One or more parts considered as a single part for the purposes of replacement and repair.

Run-Time Diagnostics – Diagnostics performed to diagnose faults in a fielded system, device or process.

Semi-Dynamic Diagnostics – Diagnostics in which fault groups are developed dynamically (during run-time), yet which contain no mechanism for dynamically selecting the order of testing. Although semi-dynamic diagnostics may be based on a predetermined test order, the maintainer may skip tests, postpone tests, or otherwise perform tests out of order without impacting diagnostic accuracy. Although not as flexible as dynamic diagnostics, semi-dynamic diagnostics are more adaptable to changing maintenance environments than are static diagnostics. Furthermore, like static diagnostics, the predetermined test order allows the diagnostics to be relatively easily verified.

Single Fault Isolation – Also called Common Cause Isolation. A method of fault isolation that proceeds under the assumption that all "failed" tests in a diagnostic session are due to a single malfunction or fault. Although frequently used as the basis for diagnostic analyses (such as testability) during the design phase, single fault isolation cannot consistently produce accurate results when multiple components are simultaneously malfunctioning during diagnosis. The degree to which diagnostic accuracy suffers depends on the likelihood of multiple failures (taking into consideration the possibility of dependent failures and the length of the interval between diagnostic sessions) and the extent to which single failures can produce the same test signature as multiple failures. (Compare with Multiple-Failure Isolation).

Static Diagnostics – Diagnostics which are based on a predetermined test order and for which the diagnostic conclusions associated with each possible test signature have been pre-computed. With static diagnostics, tests may not be skipped, postponed, or otherwise performed out of order. The main benefit of static diagnostics is that they are relatively easily verified. (Compare with Dynamic Diagnostics and Semi-Dynamic Diagnostics).

Software Failure – The inability of the software to perform any further functions as the result of a software fault. (Compare with Hardware Failure).

Software Fault – A bug or defect in the code that causes the software to perform incorrectly.

System Health Management (SHM) – The capability of monitoring real-time sensors to determine system health and performance.

Test Signature – The set of test results (Pass/Fail) that are generated when diagnostics are performed for a given fault combination.
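
As an illustration, a test signature can be derived from a fault combination once it is known which items each test exercises. The sketch assumes a test fails whenever any item it covers is faulty; the function name `signature_for` and the `COVERAGE` table are hypothetical.

```python
# Hypothetical coverage: the items exercised by each test.
COVERAGE = {"t1": {"A", "B"}, "t2": {"B"}, "t3": {"C"}}


def signature_for(coverage, fault_combination):
    """Return Pass/Fail results, ordered by test name, for the faults."""
    return tuple("FAIL" if coverage[test] & fault_combination else "PASS"
                 for test in sorted(coverage))


print(signature_for(COVERAGE, {"B"}))
print(signature_for(COVERAGE, {"A", "B"}))
print(signature_for(COVERAGE, {"C"}))
```

The first two calls produce the same signature, which shows how a single failure ("B") can be indistinguishable from a multiple failure ("A" and "B" together) when only test results are available.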

Testability – A design characteristic which allows the status (operable, inoperable, or degraded) of an item to be determined and the isolation of faults within the item to be performed in a timely manner (MIL-STD-2165). A design characteristic that allows its operational status to be determined and the isolation of faults to be performed efficiently (IEEE Std 1522).

Testability Analysis – The engineering practice associated with evaluating the testability of a system, device or process.

Test Program Set (TPS) – The combination of test program, interface device, test program instruction, and supplementary data required to initiate and execute a given test of a UUT. (MIL-STD-2165)

Test Requirements Document (TRD) – An item specification that contains the required performance characteristics of a UUT and specifies the test conditions, values (and allowable tolerances) of the stimuli, and associated responses needed to indicate a properly operating UUT. (MIL-STD-2165)

Testing – The process of inferring behavioral properties of a product on the basis of execution in a known environment with selected inputs and checking results.

Unit Under Test (UUT) – An item for which testability design and diagnostics must be developed.

Vehicle Health Management (VHM) – An integrated health management system on a vehicle that monitors/manages that vehicle's health.

Web-Based Diagnostics – A type of remote diagnostics where interaction with the diagnostic engine is achieved through a web interface.

White Box – A system or component whose contents and implementation are fully disclosed. (See also black box, grey box and glass box).