Originally Published MEM April 2001
Electronic Design
A Quantitative Method of Optimizing Electronic Device Design

A versatile randomizing stress test can quickly pinpoint design weaknesses, discovering failure points that might otherwise come to light only when the device is at work in the field.
Bob Jones
Sophisticated electronic devices are sometimes used by untrained operators,
perhaps even in stressful conditions. Imagine a portable electrocardiograph
(ECG)/defibrillator in the hands of an airplane passenger pressed into
service as a lifesaver. How can equipment be validated to operate under
unexpected conditions? Are single-fault safety requirements sufficient?
How can something be made foolproof?

A related issue arises when an electronic device is especially successful
in finding willing users. If a product becomes popular, then it may
very well be adapted to other uses. Here, a classic example is the beeper,
or portable electronic pager. Originally used by service technicians,
emergency response personnel, and the like, pagers worked well and reliably
for those professionals. But a new use environment emerged when teenagers
became users of pagers, one where the stresses were greater. The question
raised by such a scenario is this: How can a product be validated for
use in its environment when the stresses experienced by the product
will change if it becomes popular and is put to new uses?

Finally, consider the fact that most medical electronic equipment is
actually more than just electronic. Equipment can include solid-state
electronics; mechanical devices such as printers, screens, and lids;
and power circuitry, including batteries and chargers. Testing a solid-state
circuit board for its ability to tolerate expected conditions requires
one set of procedures, while a battery or printer may have to be subjected
to a different set of tests. Could a system be developed that adequately
tests diverse equipment elements simultaneously?

How can a medical electronic device be validated for use in potential
stress environments that have not been or cannot be anticipated? The answer
is to focus the testing on conditions that can break the product rather
than on what might happen under the conditions it is expected to encounter.
This article describes and discusses a technique and test system that
uses this approach and produces quantitative design validations in just
a few days.

Failure-Mode Verification Testing

A design validation test has been devised to quantify an electronic
product's state of development and operational potential: its design
"maturity." Called failure-mode verification testing (FMVT), the process
subjects the equipment under test (EUT) to all of its known stress sources
(that is, sources of potential damage to the product) at the same time.
Vibration, temperature extremes, humidity, mechanical loads, electrical
loads, radiant heat, and pressure are some of the challenges FMVT introduces
to an EUT.

These stresses are applied so that their effects are randomized. For example,
vibration would be delivered in a random six-axis profile, mechanical
loads would be applied randomly relative to each other, and electrical loads
would be randomized. The object is to apply all potential stresses simultaneously
and unsystematically in order to distribute stress randomly throughout
the EUT. The rate at which different failure modes accelerate in response
to different stress sources is well documented [1].

The FMVT process begins with the EUT exposed to stresses that simulate
the maximum expectable service conditions. Over time, the stress levels
are raised. And as the test progresses, the EUT experiences failures.
These failures occur at locations in the device design that accumulate
stress damage faster than other areas of the product.

The failure occurrences pinpoint the weak links in the design of the
tested equipment. Because all of the known types of potential stress
are applied to the device randomly and with intensity increasing over
time, the order of occurrence of the failure modes is generally the
order of their relative significance. The stress sources applied are
raised in proportion to two reference points specific to the product: the
service conditions (what the product would experience) and the destruct
level (how much of a particular stress will destroy the product in a
short period of time). This ensures that the damage done by each type
of stress is maintained approximately in proportion. The time to failure,
therefore, can be taken to be in proportion, and the resulting failure-mode
progression is representative of the order of failures that would be
seen in the field [2].

The distinct failure modes resulting from FMVT can be plotted as shown
in Figure 1.

Quantifying Product Design Maturity

From the sequence and timing of the appearance of failure modes brought
about by failure-mode verification testing, the maturity of a product's
design can be quantified. The calculation is based on a pair of assumptions:
first, that the product is feasible, meaning that, although the design
may have some weaknesses, the design concept is viable and its design
must only be iterated to be made to work; and second, that a robust,
optimized product will accumulate stress damage evenly throughout so
that when one part of its design fails, the rest of the design is near
failure also. (A robust design is generally insensitive to normal small
variations in material, manufacture, or usage, and an optimized design
is one that achieves its mission efficiently in terms of cost and materials
used; that is, it is not overdesigned.) Accepting these two assumptions
allows a determination of the relative maturity of a product design
from its FMVT failure-mode progression.

If the first failure mode occurs early in the test period and notably
before the rest of the failure modes, then the product is immature and
has plenty of room for improvement. If all of the failure modes appear
only after a significant period of FMVT time and in close succession,
then the product is mature. Figure 2 indicates a mature design as revealed
by results of FMVT. If the first failure modes of this product were
addressed in redesign, the equipment's projected operational life would
not change perceptibly because the next failure mode occurred at nearly
the same time as the first ones. Addressing the first failure mode in
the device represented in Figure 1, however, would result in a significant
improvement in that product. A product's potential for improvement can
be quantified as its design maturity.

Design maturity (DM) is represented quantitatively as the average time
between failures after the first failure divided by the time to the
first failure. This calculation yields the average potential improvement
in product life under the accelerated test that would be gained by fixing
one failure mode. The device in Figure 1, which has a DM of 0.42, offers
a potential improvement in life of 42% with the elimination of one failure
mode, whereas the product in Figure 2 (DM = 0.02) has a potential for
improvement of only 2%. Clearly, the second design is more mature; that
is, it has less room for improvement.

Quantifying the Technological Limit of a System

The calculated maturity of a device design provides a measure of how
much better the product could perform under accelerated stress conditions
with the benefit of a redesign. However, its maturity tells only part
of the story. A relative measure of a product's expected lifetime is
also needed if products are going to serve as components in a complex
piece of equipment.

Figure 3 shows the failure-mode progressions of a group of products
that are assembled into a system. Although all of these system components
have a significant potential for improvement (37 to 60%), they do not
share a route to achievement of maturity. Product 3, for example, has
a relatively large improvement potential of 49%, but it can achieve
a lifetime in the accelerated stress environment of more than 300 minutes
if just one failure mode is fixed. By comparison, product 4, which has
a 60% potential for improvement, would only reach a life in the accelerated
environment of around 175 minutes if one failure mode were fixed. Therefore,
a quantification of the limit of a system's design potential is also
needed. Such quantification gives the technological limit (TL) of the
product.

The TL calculation involves removing failure modes in the order of
their occurrence and each time recomputing the DM until it is less than
0.1. Once the DM drops below 0.1, the time of the first remaining failure
mode (that is, the point at which it appears in the FMVT process) is
the technological limit. In the case of product 1 in Figure 3, eliminating
the first failure mode would result in a DM of 0.06. Therefore, the
TL of product 1 is the time of its second failure mode, or 250 minutes.
The TLs of components 1 through 4 of the system are 250, 240, 310, and
220 minutes, respectively.

Regarding TL as a function of DM, then, as a product's design is iterated,
its technological limit is the best it is expected to become. In refining
an electronic system, it is worthwhile only to address remaining failure
modes that are below the system technological limit. How well balanced
are the functional lifetimes of the components of a system if all the
components are brought up to their technological limit? The component
TL can be used to calculate the system design maturity. In the case
of the system in Figure 3, it would be 0.13, meaning that replacing
the component with the worst TL (product 4) with one not determined
to be the first system element to fail would potentially improve the
life of the system by 13%, assuming that all of the products constituting
the system were first iterated to their technological limit.

FMVT Capabilities in Application

An FMVT machine features a broad frequency range and thus can stress
products with high- and low-frequency vibrations. Using this capability,
it activated both mechanical and electrical failure modes in an ECG
under test (see Table I). Producing a broad spectrum of electromechanical
failures maximizes the information provided to design engineers, expediting
the redesign process.

Judging from the distribution of failure modes found (and outlined
in the table), the design of the ECG machine was reasonably mature.
If the first failure mode were eliminated or ignored, then the unit
would be deemed a mature design. This means that fixing any single additional
point of failure would not improve the life of the unit significantly.
The equipment had a DM of 0.32.

Another example involves a network hub that had passed all required
validation tests but nevertheless had some inherent design weaknesses
that were exposed in the field. Running the hub through the FMVT process
reproduced those failure modes. FMVT turned up the following key facts
about a product that had already been released to the market:

The product could have been more reliable if its design had been iterated
and tested using a set of stress sources including low-frequency vibration
and high displacement.

The design maturity of the network hub is 31.8. This indicates that
the design could potentially improve by 3180% if one failure mode were
fixed. This calculation is based on all of the failure modes found,
both severe and minor. Two of the failure modes were soft failures,
however (failures causing a temporary interruption of proper functioning
without damaging the product). Taking just the hard failure modes into
account, the DM of the hub is 63.5, meaning that fixing one failure
mode offers a potential of improving the design by 6350%. The DM value
is high because the first failure mode appeared only 187 seconds into
the test. The assessment of risk relating to this first failure mode
was low; although a hard failure, a simple change in the design of a
power connection would eliminate the failure mode and make it easy to
avoid in the field. If this source of failure were fixed or ignored,
the DM of the network hub, calculated on hard failures, would drop to 0.16;
the design would then have the potential to improve by 16% as a result of fixing
one failure mode.

An important fact to note about the FMVT trial of the hub is that its
hard failure modes were all mechanical (the two soft failures were electronic);
the real damage to the unit was sustained by its larger mechanical parts: the
power connector, network cable connector, and clamshell. How could such
design flaws have been missed during development testing? The answer
is that the solid-state electronics of the product were screened
using air hammer technology. Air hammer vibration does not produce significant
low-frequency energy. Also, because the main objective of using an air
hammer vibration machine on an electronic component is to find any failure
modes in the electronics, the circuit board is typically mounted directly
to the table, without the plastic housing. For this reason, failure
modes associated with low-frequency resonance (involving items such
as connectors, clips, and cases or clamshells) would not be found.
It is not surprising then that an electronic product that was developed
with the use of air hammer technology had significant mechanical failure
modes but no hard electronics failures.

Conclusion

The operational testing called for by FDA validation requirements is
designed to certify that a design is ready for market. It is not designed
primarily to provide feedback to an engineer for product improvement.
Using FDA testing as a design iteration tool, therefore, is an inefficient
application of resources. Though nominally a more expensive testing
approach, FMVT, viewed in this light, can save time and money.

Using FMVT to iterate a product design before doing certification testing
can avoid repeating that testing. Because products often fail some of
the tests, such repetition is common, which can be an expensive way
to iterate a design. FMVT allows a manufacturer to plan on one pass
through the certification testing.

FMVT has been used on a variety of products ranging from medical electronics
to automotive components to kitchen utilities. In addition to offering
flexibility and speed, FMVT has been able to reproduce failure modes
from warranty and field incidents that other tests could not reproduce.

REFERENCES

1. Wayne Nelson, Accelerated Testing: Statistical Models, Test Plans,
and Data Analysis (New York: John Wiley, 1990).

2. Alexander J. Porter, "Does High Reliability Equal Zero Defects?"
(paper presented at the Sixth Annual Northeast Product Safety Society
Vendors Night, Boxboro, MA, November 15, 2000).

Alexander J. Porter is business development manager with Entela
Inc. (Grand Rapids, MI). He can be reached at aporter@entela.com.

Figure 1. Plot of the failure-mode progression for an imaginary
electronic product with a relatively immature design. The design
maturity of the device is 0.42.

Figure 2. Plot of the failure-mode progression of a device with
a mature design. The design maturity of this device is 0.02.
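
The design maturity values quoted in the captions for Figures 1 and 2 follow from the definition given in the article: the average time between failures after the first failure, divided by the time to the first failure. A minimal sketch of that calculation is shown here. The function name and the failure times are hypothetical; the times were chosen only so that the results land on 0.42 and 0.02, and they are not the data behind the figures.

```python
from typing import Sequence


def design_maturity(failure_times: Sequence[float]) -> float:
    """Return DM: mean interval between failures after the first,
    divided by the time to the first failure."""
    times = sorted(failure_times)
    if len(times) < 2:
        raise ValueError("DM needs at least two failure modes")
    first = times[0]
    if first <= 0:
        raise ValueError("time to first failure must be positive")
    # The gaps between consecutive failures sum to (last - first),
    # so their mean is that span divided by the number of gaps.
    mean_gap = (times[-1] - first) / (len(times) - 1)
    return mean_gap / first


# Hypothetical failure-mode times in minutes (illustration only).
immature = [100, 130, 180, 226]   # one early failure, the rest spread out
mature = [200, 203, 208, 212]     # all failures late and close together

print(round(design_maturity(immature), 2))  # 0.42
print(round(design_maturity(mature), 2))    # 0.02
```

Read as a percentage, the result is the average gain in accelerated-test life expected from fixing one failure mode, which is why a lower DM indicates a more mature design.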

Figure 3. Plot of the failure-mode progressions of four devices
assembled into a system. The connecting line indicates the technological
limits of the system components. If all these devices were iterated
to achieve their TLs, the system DM would be 0.13.
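
The technological limit procedure described in the article (remove failure modes in order of occurrence, recomputing DM each time until it falls below 0.1) can be sketched as follows. The function names are hypothetical, and the product failure times are invented to mirror the description of product 1 (DM of 0.06 after the first failure mode is removed, TL of 250 minutes); they are not the Figure 3 data. Returning the last remaining failure time when DM never drops below the threshold is an assumption, since the article does not cover that case.

```python
from typing import Sequence


def design_maturity(failure_times: Sequence[float]) -> float:
    """Mean interval between failures after the first / time to first failure."""
    times = sorted(failure_times)
    if len(times) < 2:
        raise ValueError("DM needs at least two failure modes")
    return (times[-1] - times[0]) / (len(times) - 1) / times[0]


def technological_limit(failure_times: Sequence[float],
                        threshold: float = 0.1) -> float:
    """Drop the earliest failure mode until DM < threshold, then
    return the time of the first remaining failure mode."""
    times = sorted(failure_times)
    if not times:
        raise ValueError("need at least one failure mode")
    while len(times) >= 2:
        if design_maturity(times) < threshold:
            return times[0]
        times = times[1:]  # "fix" the earliest failure mode
    # Assumption: with one mode left DM is undefined, so report its time.
    return times[0]


# Hypothetical failure times in minutes for a product like product 1.
product = [150, 250, 260, 270, 295]
print(technological_limit(product))  # 250

# System DM computed from the component TLs quoted in the article.
component_tls = [250, 240, 310, 220]
print(round(design_maturity(component_tls), 3))  # 0.136
```

Applying the DM formula to the four component TLs gives about 0.136, in line with the system DM of 0.13 quoted in the Figure 3 caption.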

Unit not functioning: Battery may be drained to zero during low voltages.
Faint printing on some features: May be old paper or a thermal effect of the paper.
Smudge line on paper: Paper in printer vibrating against roller head receives smudge.
No power to power circuit: Fuse was blown, but other issues probably were the root cause; could not trace completely. Possible loss of connection.

Table I. Record of results produced by an electrocardiograph
subjected to the FMVT process. Step numbers refer to stages of the
test plan. Failure modes occurring at two test steps reflect cases
where the failure point was repaired or replaced so the test could
continue.



