
Originally Published MEM April 2001

Electronic Design

A Quantitative Method of Optimizing Electronic Device Design

A versatile randomizing stress test can quickly pinpoint design weaknesses, discovering failure points that might otherwise come to light only when the device is at work in the field.

Bob Jones

Sophisticated electronic devices are sometimes used by untrained operators, perhaps even in stressful conditions. Imagine a portable electrocardiograph (ECG)/defibrillator in the hands of an airplane passenger pressed into service as a lifesaver. How can equipment be validated to operate under unexpected conditions? Are single-fault safety requirements sufficient? How can something be made foolproof?

A related issue arises when an electronic device is especially successful in finding willing users. If a product becomes popular, then it may very well be adapted to other uses. Here, a classic example is the beeper, or portable electronic pager. Originally used by service technicians, emergency response personnel, and the like, pagers worked well and reliably for those professionals. But a new use environment emerged when teenagers became users of pagers, one where the stresses were greater. The question raised by such a scenario is this: How can a product be validated for use in its environment when the stresses experienced by the product will change if it becomes popular and is put to new uses?

Finally, consider the fact that most medical electronic equipment is actually more than just electronic. Equipment can include solid-state electronics; mechanical devices such as printers, screens, and lids; and power circuitry, including batteries and chargers. Testing a solid-state circuit board for its ability to tolerate expected conditions requires one set of procedures, while a battery or printer may have to be subjected to a different set of tests. Could a system be developed that adequately tests diverse equipment elements simultaneously?

How can a medical electronic device be validated for use in potential stress environments that have not been or cannot be anticipated? The answer is to focus the testing on conditions that can break the product rather than on what might happen under the conditions it is expected to encounter. This article describes and discusses a technique and test system that uses this approach and produces quantitative design validations in just a few days.

Failure-Mode Verification Testing

A design validation test has been devised to quantify an electronic product's state of development and operational potential—its design "maturity." Called failure-mode verification testing (FMVT), the process subjects the equipment under test (EUT) to all of its known stress sources—that is, sources of potential damage to the product—at the same time. Vibration, temperature extremes, humidity, mechanical loads, electrical loads, radiant heat, and pressure are some of the challenges FMVT introduces to an EUT.

These stresses are applied so that their effects are randomized. For example, vibration would be delivered in a random six-axis profile, mechanical loads would be varied randomly relative to each other, and electrical loads would be randomized. The object is to apply all potential stresses simultaneously and unsystematically in order to distribute stress randomly throughout the EUT. The rate at which different failure modes accelerate in response to different stress sources is well documented [1].

The FMVT process begins with the EUT exposed to stresses that simulate the maximum expectable service conditions. Over time, the stress levels are raised. And as the test progresses, the EUT experiences failures. These failures occur at locations in the device design that accumulate stress damage faster than other areas of the product.

The failure occurrences pinpoint the weak links in the design of the tested equipment. Because all of the known types of potential stress are applied to the device randomly and with intensity increasing over time, the order of occurrence of the failure modes is generally the order of their relative significance. The stress sources applied are raised in proportion to two reference points specific to the product: the service conditions (what the product would experience) and the destruct level (how much of a particular stress will destroy the product in a short period of time). This ensures that the damage done by each type of stress is maintained approximately in proportion. The time to failure, therefore, can be taken to be in proportion, and the resulting failure-mode progression is representative of the order of failures that would be seen in the field [2].
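One way to picture stresses being raised in proportion between the service conditions and the destruct level is a stepped schedule in which every stress source sits at the same fraction of its own service-to-destruct range at each step. The sketch below is illustrative only: the linear ramp, the number of steps, the function name `stress_schedule`, and all numeric levels are assumptions, not details taken from the FMVT procedure.

```python
def stress_schedule(service, destruct, steps):
    """Build a stepped stress plan (assumed linear ramp, hypothetical).

    service  -- dict of stress source -> level at service conditions
    destruct -- dict of stress source -> level at the destruct limit
    steps    -- number of test steps (must be at least 2)
    """
    if steps < 2:
        raise ValueError("need at least two steps")
    plan = []
    for i in range(steps):
        # Same fraction for every source keeps the stresses in proportion.
        fraction = i / (steps - 1)
        plan.append({
            name: service[name] + fraction * (destruct[name] - service[name])
            for name in service
        })
    return plan


# Hypothetical reference points for two stress sources.
service = {"vibration_grms": 2.0, "temperature_c": 50.0}
destruct = {"vibration_grms": 20.0, "temperature_c": 140.0}

plan = stress_schedule(service, destruct, steps=10)
for step_number, levels in enumerate(plan, start=1):
    print(step_number, levels)
```

The first step reproduces the service conditions and the last step reaches the destruct level for every source at once, which is the property the proportional scaling is meant to preserve.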

Figure 1. Plot of the failure-mode progression for an imaginary electronic product with a relatively immature design. The design maturity of the device is 0.42.

The distinct failure modes resulting from FMVT can be plotted as shown in Figure 1.

Quantifying Product Design Maturity

From the sequence and timing of the appearance of failure modes brought about by failure-mode verification testing, the maturity of a product's design can be quantified. The calculation is based on a pair of assumptions: first, that the product is feasible, meaning that, although the design may have some weaknesses, the design concept is viable and its design must only be iterated to be made to work; and second, that a robust, optimized product will accumulate stress damage evenly throughout so that when one part of its design fails, the rest of the design is near failure also. (A robust design is generally insensitive to normal small variations in material, manufacture, or usage, and an optimized design is one that achieves its mission efficiently in terms of cost and materials used—that is, it is not overdesigned.) Accepting these two assumptions allows a determination of the relative maturity of a product design from its FMVT failure-mode progression.

If the first failure mode occurs early in the test period and notably before the rest of the failure modes, then the product is immature and has plenty of room for improvement. If all of the failure modes appear only after a significant period of FMVT time and in close succession, then the product is mature. Figure 2 indicates a mature design as revealed by results of FMVT. If the first failure modes of this product were addressed in redesign, the equipment's projected operational life would not change perceptibly because the next failure mode occurred at nearly the same time as the first ones. Addressing the first failure mode in the device represented in Figure 1, however, would result in a significant improvement in that product. A product's potential for improvement can be quantified as its design maturity.

Figure 2. Plot of the failure-mode progression of a device with a mature design. The design maturity of this device is 0.02.

Design maturity (DM) is represented quantitatively as the average time between failures after the first failure divided by the time to the first failure. This calculation yields the average potential improvement in product life under the accelerated test that would be gained by fixing one failure mode. The device in Figure 1, which has a DM of 0.42, offers a potential improvement in life of 42% with the elimination of one failure mode, whereas the product in Figure 2 (DM = 0.02) has a potential for improvement of only 2%. Clearly, the second design is more mature; that is, it has less room for improvement.
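The DM definition can be written as a short function: with failure times sorted, the average time between failures after the first is the span from first to last failure divided by the number of later failures. This is a minimal sketch; the function name `design_maturity` and the two lists of failure times are hypothetical, chosen only so the results match the 0.42 and 0.02 values quoted for the figures, and are not the data plotted there.

```python
def design_maturity(failure_times):
    """Average time between failures after the first, divided by
    the time to the first failure."""
    times = sorted(failure_times)
    if len(times) < 2:
        raise ValueError("need at least two failure modes")
    if times[0] <= 0:
        raise ValueError("time to first failure must be positive")
    mean_gap = (times[-1] - times[0]) / (len(times) - 1)
    return mean_gap / times[0]


# Hypothetical failure times in minutes (not read from the figures).
immature = [100, 130, 180, 226]  # early first failure, others spread out
mature = [300, 305, 310, 318]    # late first failure, others close behind

print(round(design_maturity(immature), 2))  # 0.42
print(round(design_maturity(mature), 2))    # 0.02
```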

Quantifying the Technological Limit of a System

The calculated maturity of a device design provides a measure of how much better the product could perform under accelerated stress conditions with the benefit of a redesign. However, its maturity tells only part of the story. A relative measure of a product's expected lifetime is also needed if products are going to serve as components in a complex piece of equipment.

Figure 3 shows the failure-mode progressions of a group of products that are assembled into a system. Although all of these system components have a significant potential for improvement (37 to 60%), they do not share a route to achievement of maturity. Product 3, for example, has a relatively large improvement potential of 49%, but it can achieve a lifetime in the accelerated stress environment of more than 300 minutes if just one failure mode is fixed. By comparison, product 4, which has a 60% potential for improvement, would only reach a life in the accelerated environment of around 175 minutes if one failure mode were fixed. Therefore, a quantification of the limit of a system's design potential is also needed. Such quantification gives the technological limit (TL) of the product.

Figure 3. Plot of the failure-mode progressions of four devices assembled into a system. The connecting line indicates the technological limits of the system components. If all these devices were iterated to achieve their TLs, the system DM would be 0.13.

The TL calculation involves removing failure modes in the order of their occurrence and each time recomputing the DM until it is less than 0.1. Once the DM drops below 0.1, the time of the first remaining failure mode—that is, the point at which it appears in the FMVT process—is the technological limit. In the case of product 1 in Figure 3, eliminating the first failure mode would result in a DM of 0.06. Therefore, the TL of product 1 is the time of its second failure mode, or 250 minutes. The TLs of components 1 through 4 of the system are 250, 240, 310, and 220, respectively.
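The TL procedure described above can be sketched as a loop that drops the earliest failure mode and recomputes DM until it falls below 0.1. The names `design_maturity` and `technological_limit` are hypothetical, the failure times are invented for illustration (not read from Figure 3), and returning the last remaining failure time when only one mode is left is an assumption the article does not address.

```python
def design_maturity(failure_times):
    """Average time between failures after the first / time to first."""
    times = sorted(failure_times)
    mean_gap = (times[-1] - times[0]) / (len(times) - 1)
    return mean_gap / times[0]


def technological_limit(failure_times, threshold=0.1):
    """Remove failure modes in order of occurrence until DM < threshold,
    then return the time of the first remaining failure mode."""
    times = sorted(failure_times)
    while len(times) >= 2:
        if design_maturity(times) < threshold:
            return times[0]
        times = times[1:]  # "fix" the earliest failure mode
    # Assumption: with a single mode left, DM is undefined, so use its time.
    return times[0]


# Hypothetical failure times in minutes.
times = [150, 250, 260, 270, 295]
print(round(design_maturity(times), 2))      # 0.24 before any fix
print(round(design_maturity(times[1:]), 2))  # 0.06 after fixing the first mode
print(technological_limit(times))            # 250
```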

Because TL is defined in terms of DM, a product's technological limit is the best lifetime it is expected to reach as its design is iterated. In refining an electronic system, it is worthwhile to address only those remaining failure modes that fall below the system technological limit. How well balanced are the functional lifetimes of the components of a system if all the components are brought up to their technological limits? The component TLs can be used to calculate the system design maturity. In the case of the system in Figure 3, it would be 0.13, meaning that replacing the component with the worst TL (product 4) with one that would not be the first system element to fail would potentially improve the life of the system by 13%, assuming that all of the products constituting the system were first iterated to their technological limits.
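As a check on the system figure, the same DM calculation can be applied to the four component TLs quoted for Figure 3, treating each TL as a system-level failure time. This is a sketch under that assumption; the function name `design_maturity` and the dictionary layout are hypothetical. The result comes out near 0.136, in line with the 0.13 quoted for the system.

```python
def design_maturity(failure_times):
    """Average time between failures after the first / time to first."""
    times = sorted(failure_times)
    mean_gap = (times[-1] - times[0]) / (len(times) - 1)
    return mean_gap / times[0]


# Technological limits (minutes) of the four components, as quoted in the text.
component_tl = {
    "product 1": 250,
    "product 2": 240,
    "product 3": 310,
    "product 4": 220,
}

# The component with the lowest TL is the first system element expected to fail.
limiting = min(component_tl, key=component_tl.get)
system_dm = design_maturity(component_tl.values())

print(limiting)
print(system_dm)
```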

FMVT Capabilities in Application

An FMVT machine features a broad frequency range and thus can stress products with high- and low-frequency vibrations. Using this capability, it activated both mechanical and electrical failure modes in an ECG under test (see Table I). Producing a broad spectrum of electromechanical failures maximizes the information provided to design engineers, expediting the redesign process.

| Failure Mode | Step Number | Time to Failure (min) | Presumed Cause |
|---|---|---|---|
| Unit not functioning | 6, 8 | 149 | Battery may be drained to zero during low voltages. |
| Faint printing on some features | 11 | 257.2 | May be old paper or a thermal effect of the paper. |
| Smudge line on paper | 11, 12 | 257.2 | Paper in printer vibrating against roller head receives smudge. |
| No power to power circuit | 12 | 290.4 | Fuse was blown, but other issues probably were the root cause; could not trace completely. Possible loss of connection. |

Table I. Record of results produced by an electrocardiograph subjected to the FMVT process. Step numbers refer to stages of the test plan. Failure modes occurring at two test steps reflect cases where the failure point was repaired or replaced so the test could continue.

Judging from the distribution of failure modes found (and outlined in Table I), the design of the ECG machine was reasonably mature: the equipment had a DM of 0.32. If the first failure mode were eliminated or ignored, then the unit would be deemed a mature design, meaning that fixing any single additional point of failure would not improve the life of the unit significantly.
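The ECG figures can be reproduced from the times to failure in Table I, counting each of the four listed failure modes once. The sketch below uses a hypothetical helper name, `design_maturity`; only the four times are taken from the table.

```python
def design_maturity(failure_times):
    """Average time between failures after the first / time to first."""
    times = sorted(failure_times)
    mean_gap = (times[-1] - times[0]) / (len(times) - 1)
    return mean_gap / times[0]


# Times to failure (minutes) for the four failure modes in Table I.
ecg_times = [149.0, 257.2, 257.2, 290.4]

dm_all = design_maturity(ecg_times)
dm_without_first = design_maturity(ecg_times[1:])

print(round(dm_all, 2))            # 0.32
print(round(dm_without_first, 2))  # 0.06
```

With the first failure mode set aside, the recomputed DM falls below 0.1, which is consistent with the statement that the unit would then be deemed a mature design.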

Another example involves a network hub that had passed all required validation tests but nevertheless had some inherent design weaknesses that were exposed in the field. Running the hub through the FMVT process reproduced those failure modes. FMVT turned up the following key facts about a product that had already been released to the market:

  • Several mechanical failure modes inherent in the product might have been uncovered had development testing included low-frequency vibration and significant displacement testing.
  • Temperature and humidity seemed to have adverse effects on the electronics, a definitive understanding of which would require further testing.
  • The primary problem for the network hub was the power connector.
  • The network cables failed near the end of the test.
  • Evidence of arcing was found between the network hub clamshell shield and the ground plane.

The product could have been more reliable if its design had been iterated and tested using a set of stress sources including low-frequency vibration and high displacement.

The design maturity of the network hub is 31.8. This indicates that the design could potentially improve by 3180% if one failure mode were fixed. This calculation is based on all of the failure modes found, both severe and minor. Two of the failure modes were soft failures, however (failures causing a temporary interruption of proper functioning without damaging the product). Taking just the hard failure modes into account, the DM of the hub is 63.5, meaning that fixing one failure mode offers a potential of improving the design by 6350%. The DM value is high because the first failure mode appeared only 187 seconds into the test. The assessment of risk relating to this first failure mode was low; although it was a hard failure, a simple change in the design of a power connection would eliminate the failure mode and make it easy to avoid in the field. If this source of failure were fixed or ignored, the DM of the network hub, calculated on hard failures, would drop to 0.16, leaving the design with a potential of improving by 16% as a result of fixing one failure mode.

An important fact to note about the FMVT trial of the hub is that its hard failure modes were all mechanical (the two soft failures were electronic); the real damage to the unit was sustained by its larger mechanical parts: the power connector, network cable connector, and clamshell. How could such design flaws have been missed during development testing? The answer is that the solid-state electronics of the product were screened using air hammer technology. Air hammer vibration does not produce significant low-frequency energy. Also, because the main objective of using an air hammer vibration machine on an electronic component is to find any failure modes in the electronics, the circuit board is typically mounted directly to the table, without the plastic housing. For this reason, failure modes associated with low-frequency resonance (involving items such as connectors, clips, and cases or clamshells) would not be found. It is not surprising, then, that an electronic product that was developed with the use of air hammer technology had significant mechanical failure modes but no hard electronics failures.

Conclusion

The operational testing called for by FDA validation requirements is designed to certify that a design is ready for market. It is not designed primarily to provide feedback to an engineer for product improvement. Using FDA testing as a design iteration tool, therefore, is an inefficient application of resources. Though nominally a more expensive testing approach, FMVT, viewed in this light, can save time and money.

Using FMVT to iterate a product design before doing certification testing can avoid repeating that testing. Because products often fail some of the tests, such repetition is common, and it is an expensive way to iterate a design. FMVT allows a manufacturer to plan on one pass through the certification testing.

FMVT has been used on a variety of products ranging from medical electronics to automotive components to kitchen utilities. In addition to offering flexibility and speed, FMVT has been able to reproduce failure modes from warranty and field incidents that other tests could not reproduce.


REFERENCES

1. Wayne Nelson, Accelerated Testing: Statistical Models, Test Plans, and Data Analysis (New York: John Wiley, 1990).

2. Alexander J. Porter, "Does High Reliability Equal Zero Defects?" (paper presented at the Sixth Annual Northeast Product Safety Society Vendors Night, Boxboro, MA, November 15, 2000).

Alexander J. Porter is business development manager with Entela Inc. (Grand Rapids, MI). He can be reached at aporter@entela.com.