XYFIT
1.0 Statistical data evaluation:
Working through data tables to extract the objective information we need is
painful work. A correlation between data usually has to be proved by an
experiment. In production processes we usually know quite well which data are
correlated. But we often know nothing about failures or why values change over
time. Most of these effects are very rare, unpredictable and hidden in the
data noise. Therefore we need statistical methods to identify the problem.
Usually we want to characterize a device and to specify its values. Already
here we can run into severe problems, because we have almost no knowledge
about the real distribution of the device data. We mostly assume that all data
follow a normal distribution or can be transformed to a normal distribution.
In reality the data can contain failures or multiple distributions. Before we
evaluate the data, we have to remove the failing devices, that is, the devices
that do not agree with the statistics of the population. A probability plot
helps us to identify the devices that have failed. Then we can trace these bad
devices in other tests and see how they are distributed there: are they
grouped (correlated) or randomly distributed? Correlated tests mostly point to
the same error. They lead us to an overstressed and weak part of the device or
to a test failure. So we can learn a lot from failures before we remove them.
The Excel add-on (XYFIT 451) helps to get an overview of the statistics of
many tests and to identify weaknesses by statistical methods. We will
demonstrate this with an example of an electronic device and its test data,
and we will show how to evaluate the test data using distribution plots.
1.1 Distribution plots
Unfortunately we do not know very much about the distribution of our test
data. Are they as we expect, or do they show a multiple distribution or some
other type of distribution? Therefore we start by creating a cumulative normal
distribution plot, which is independent of the type of distribution and linear
for a normal distribution. The data Xi are sorted and weighted with the error
function. The Y-axis of the cumulative distribution is scaled in Sigma (the
standard width of a normal distribution) and the X-axis shows the linearly
scaled data values.
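The construction of such a plot can be sketched in a few lines of Python. This is only an illustration and not the XYFIT implementation: it assumes that "weighted with the error function" means mapping the cumulative fraction of each sorted value through the inverse of the standard normal cumulative distribution, and it assumes the plotting position (i - 0.5)/n, which is one common choice among several. The function name and the sample values are hypothetical.

```python
from statistics import NormalDist

def probability_plot_points(values):
    """Return (x, sigma) pairs for a cumulative normal distribution plot."""
    xs = sorted(values)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, Sigma 1
    points = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n  # assumed plotting position (cumulative fraction)
        points.append((x, std_normal.inv_cdf(p)))  # Y value in units of Sigma
    return points

# Hypothetical sample: five ordinary devices and one failing device (14.0).
points = probability_plot_points([10.2, 9.8, 10.0, 10.5, 9.5, 14.0])
for x, sigma in points:
    print(f"{x:6.2f}  {sigma:+.2f}")
```

Plotting x against sigma gives a roughly straight line for normally distributed data; the failing device appears far to the right of that line.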
The statistical values of these plots usually cannot be used to characterize
the devices, because the data contain failures or multiple distributions. A
perfect normal distribution shows all data near a straight line, and failing
parts lie far away from this line. Other distribution types can often be
transformed to a normal distribution (e.g. a log-normal distribution).

Picture 1 shows three general types of distributions: a double distribution,
a normal distribution and a distribution with failures.
When we have hundreds of tests to evaluate, we want to find the bad
distributions very quickly. A classification into different types of
distributions helps to sort the tests. Therefore we have to find a value that
is always positive, has no dimension and stays in a range that can be
displayed in a bar chart. Many statistical values like the Mean, Standard
Deviation, CV and others are not helpful, because they can be infinite or
negative or their range is too wide.
The distribution ratio Sigma over 1Sigma, "S/1S" or Stdev(100%)/Stdev(68%),
is ideal for generating this overview of the test data. The 1Sigma value is
the Sigma value of the distribution in the 1 Sigma range (the normal
distribution range of 68% of the population around the median). The ratio has
no dimension, is always positive and is never infinite or 0. The inverse 1S/S
therefore has the same properties.
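A minimal sketch of how such a ratio could be computed is shown here. The text does not give the exact formula for the 1Sigma value, so this sketch makes an assumption: 1Sigma is taken as half the width of the central 68.27% of the sorted data (between the quantiles that correspond to -1 and +1 Sigma of a normal distribution), which yields a ratio of 1 for normal data. The helper names `quantile` and `s_over_1s` are hypothetical.

```python
from statistics import NormalDist, stdev

def quantile(sorted_values, p):
    """Linearly interpolated quantile of pre-sorted data, 0 <= p <= 1."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def s_over_1s(values):
    """Ratio of the full standard deviation to the assumed 1Sigma value."""
    xs = sorted(values)
    s = stdev(xs)                 # Sigma of 100% of the data
    p_hi = NormalDist().cdf(1.0)  # about 0.8413
    p_lo = 1.0 - p_hi             # about 0.1587
    # Assumption: 1Sigma = half width of the central 68% of the population.
    one_sigma = (quantile(xs, p_hi) - quantile(xs, p_lo)) / 2.0
    if one_sigma == 0.0:
        raise ValueError("central 68% of the data has zero width")
    return s / one_sigma

# Deterministic, nearly perfect normal sample, then the same with one failure.
n = 1000
normal_sample = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
ratio_normal = s_over_1s(normal_sample)
ratio_failure = s_over_1s(normal_sample + [50.0])
print(round(ratio_normal, 2), round(ratio_failure, 2))
```

With this definition a single extreme device inflates the full standard deviation but hardly moves the central 68%, so the ratio rises clearly above 1.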
1. S/1S<1
A ratio below 1 means a higher 1Sigma value and indicates a double
distribution or a rectangle-shaped distribution (test-limited distributions).
2. S/1S=1
A ratio equal to 1 indicates a normal distribution.
3. S/1S>1
A ratio above 1 means a lower 1Sigma value and indicates failing devices in a
distribution, a distribution with smaller side distributions or a logarithmic
distribution.
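The three cases can be turned into a simple sorting rule. The sketch below is only an illustration: real data never give a ratio of exactly 1, so it assumes a tolerance band around 1 (the value 0.1 is a hypothetical choice, not taken from XYFIT), and the test names and ratios are invented.

```python
def classify_ratio(ratio, tolerance=0.1):
    """Map an S/1S ratio to one of the three distribution types."""
    if ratio <= 0:
        raise ValueError("S/1S must be positive")
    if ratio < 1.0 - tolerance:
        return "type 1: double or rectangle-shaped distribution"
    if ratio > 1.0 + tolerance:
        return "type 3: failures, side distribution or logarithmic distribution"
    return "type 2: normal distribution"

# Hypothetical S/1S values of three tests.
test_ratios = {"test_A": 0.85, "test_B": 1.02, "test_C": 12.5}
labels = {name: classify_ratio(r) for name, r in test_ratios.items()}
for name, label in labels.items():
    print(name, "->", label)
```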
Picture 2a

The value S/1S provides a good profile of many test distributions. The bars
higher than 1 are distributions with failing parts and the bars below 1 are
double distributions. The double distributions (S/1S<1) are the most unwanted
type.
Picture 2b

Picture 2a shows more than 150 tests of devices. 6 S/1S values are above 10,
about 30% are above 4, about 50% are at 1 and 5 are below 1. Picture 2b shows
another test. Both pictures contain devices with severe failures (S/1S>1). In
the next picture 3 we see plots of distributions with failures (S/1S>>1).
Picture 3 device failures
Picture 3 shows one of the first tests, with an extreme failure of one device,
and a test with a side distribution (type 3 with S/1S>1). Tests with a high
S/1S ratio, often caused by just one extreme failure, all seem to correlate
with each other, because the one extreme device dominates the regression.
Their regression coefficient is close to 1. After the data are cleaned there
is often no correlation at all and the regression coefficient is very low.
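This effect is easy to reproduce with synthetic data. The sketch below assumes that the "regression coefficient" is the Pearson correlation coefficient, and it uses two invented, independent tests in which one device fails extremely in both. All names and values are hypothetical.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
test_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
test_b = [rng.gauss(0.0, 1.0) for _ in range(200)]  # independent of test_a

# Device 0 fails extremely in both tests.
test_a[0] = 100.0
test_b[0] = 100.0

r_raw = pearson_r(test_a, test_b)            # dominated by the one failure
r_clean = pearson_r(test_a[1:], test_b[1:])  # failing device removed
print(f"with failure: {r_raw:.2f}, after cleaning: {r_clean:.2f}")
```

The single failing device alone produces a coefficient close to 1; once it is removed, the two independent tests show almost no correlation.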
The tests with an S/1S ratio below 1 have double or rectangle-shaped
distributions, as we see in picture 4.
Picture 4

The cumulative plots of tests with S/1S>1 can be used to clean the data and to
remove the failing devices.
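In the plot this cleaning is done by eye; an automated version could look like the sketch below. It is an assumption, not the XYFIT procedure: a device is flagged when its value lies more than a chosen number of 1Sigma widths away from the median, with 1Sigma taken as half the width of the central 68% of the data. The limit of 4 and all names are hypothetical.

```python
from statistics import NormalDist, median

def quantile(sorted_values, p):
    """Linearly interpolated quantile of pre-sorted data, 0 <= p <= 1."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def flag_failures(values, limit=4.0):
    """Return the indices of devices far away from the central distribution."""
    xs = sorted(values)
    p_hi = NormalDist().cdf(1.0)
    # Assumed 1Sigma: half width of the central 68% of the population.
    one_sigma = (quantile(xs, p_hi) - quantile(xs, 1.0 - p_hi)) / 2.0
    center = median(xs)
    return [i for i, v in enumerate(values)
            if abs(v - center) > limit * one_sigma]

# Hypothetical test data: the last device is an extreme failure.
values = [10.1, 9.9, 10.0, 10.2, 9.8, 10.05, 9.95, 10.15, 9.85, 25.0]
failed = flag_failures(values)
cleaned = [v for i, v in enumerate(values) if i not in failed]
print("failed device indices:", failed)
```

Keeping the indices of the flagged devices, rather than only the cleaned values, makes it possible to trace the same devices in other tests before they are removed.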
1.31 Unstable devices
Distributions can also have failures that show a continuous drop from the
straight line. These devices can be unstable and run out of the distribution,
or they indicate a moving offset of the test equipment. We can add to the S/1S
plot a second plot from a test of the same devices at a different time and
observe the changes of their distributions.
Distributions can change over time. Usually the activation energy and the MTF
(Mean Time to Failure) are used to describe the lifetime of devices. But
before devices fail, their distribution can change. The Weibull statistic
helps us to describe the failure rate. It does not tell us anything about the
problems of the device, whether it fails suddenly or slowly, or why it fails.
It only tells us something about the probability of failures over time.
One or more parameters of a device can drift either in one direction or
randomly in any direction. A drift in one direction (offset) causes a change
of the Mean, and a random drift changes the Sigma (width) of the distribution.
We can mostly observe both. The Sigma can decrease or increase. Sometimes we
can observe a fast and a slow aging of the same population, caused by
different processes (thermal or mechanical stress, or electromigration). A
process that causes an improvement (decrease of the Sigma) is, for example,
the reduction of mechanical and thermal stress in electronic components by
aging or annealing. Another process, "burn-in", is used to accelerate fast
aging and to separate devices with a short lifetime (often caused by
electromigration related to a weak design) from devices with a long lifetime.
These processes are used to improve the quality. But they do not improve the
designed quality of the devices; they try to separate the good from the bad
devices. We find weak and fast-aging devices without knowing the cause of
their failure. It is a patch: a selection method based on the MTF statistics
that does not know why the devices will fail.
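The difference between a drift in one direction and a random drift, described above, can be made visible by comparing the Mean and the Sigma of the same population at two times. The following sketch uses invented data; the function name and the normalization by the Sigma of the first test are assumptions made for this illustration.

```python
from statistics import mean, stdev

def distribution_change(before, after):
    """Compare two tests of the same population taken at different times."""
    sigma_before = stdev(before)
    return {
        # Offset of the Mean, in units of the original Sigma.
        "mean_shift_in_sigma": (mean(after) - mean(before)) / sigma_before,
        # Above 1: distribution got wider; below 1: it got narrower.
        "sigma_ratio": stdev(after) / sigma_before,
    }

before = [10.0, 10.2, 9.8, 10.1, 9.9]
offset_drift = [v + 0.5 for v in before]     # every device moved the same way
random_drift = [10.0, 10.6, 9.4, 10.3, 9.7]  # same Mean, three times the width

change_offset = distribution_change(before, offset_drift)
change_random = distribution_change(before, random_drift)
print(change_offset)
print(change_random)
```

A pure offset changes only the first number, a pure random drift only the second; in real data, as noted above, both usually change together.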
A more successful approach is the continuous observation of the statistics
over time: to improve the design of a device, to find and replace the weak
parts that create unstable statistics, and to improve and control the test
systems and production equipment.
Picture 5 shows an example of an S/1S plot and how the distribution of a
population has changed over time.
1.32 S/1S plot of two tests: the second after a heat treatment of the same device population
Picture 5

The plots of one population before (blue bars) and after (violet bars) a
high-temperature storage show changes of the distributions. We see reduced
S/1S values in many tests.
Picture 7
1.4 Interpretation of statistics
The first distribution in the S/1S plot is a test with a shift of the mean.
The Mean value of one test has shifted by more than 100%. We might believe the
devices are not stable! But we see in the S/1S plot that the distribution has
changed. The original distribution (orange) in picture 7 has changed
dramatically after a temperature storage (blue). We suddenly see a double
distribution. When we look at the contact test (right picture) we find that
the contact of about half of the devices has changed: 40% of the devices had a
higher contact resistance after the heat treatment. But the Sigma of the 40%
high-resistance contacts has not changed. How can we have such an offset in
the distribution? Why are 60% of the devices stable while 40% have a very
well-defined offset? Because of the sudden offset of the distribution after
the heat treatment, we must assume that something was wrong with the test.
Picture 8

In the next plots, the devices with the higher contact resistance are marked in red in the right plot. In the left plot we see that other tests also show a double distribution. The same devices are marked in red in the left and right plots. This demonstrates that the contact resistance test is correlated with the change to a double distribution.

In the first picture of picture 9, on the right side, we see an uncorrelated
distribution plot. In the contact distribution the red devices are grouped
together, and in this distribution the same devices are randomly distributed.
We know these tests are not correlated. We can look at the overview of the
distributions of all tests to find all correlated and uncorrelated tests. Here
we see several correlated tests in the right picture and some uncorrelated
tests in the left picture, as we expected. We found that all double
distributions are very well correlated.
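The visual check with red markers can also be expressed as a number. The sketch below is a hypothetical illustration, not a feature of XYFIT: devices are marked by a threshold on the contact resistance, and for every other test the distance between the means of the marked and the unmarked devices is expressed in units of the Sigma of the unmarked devices. All data, the threshold of 1.5 and the function name are invented.

```python
from statistics import mean, stdev

def grouping_score(values, marked):
    """Separation of marked and unmarked devices, in Sigma of the unmarked."""
    in_group = [v for i, v in enumerate(values) if i in marked]
    rest = [v for i, v in enumerate(values) if i not in marked]
    return abs(mean(in_group) - mean(rest)) / stdev(rest)

# Hypothetical contact test: the last five devices have a higher resistance.
contact_resistance = [1.0, 1.1, 0.9, 1.05, 0.95, 2.0, 2.1, 1.9, 2.05, 1.95]
marked = {i for i, r in enumerate(contact_resistance) if r > 1.5}

# The same ten devices in two other hypothetical tests.
correlated_test = [5.0, 5.1, 4.9, 5.2, 4.8, 7.0, 7.1, 6.9, 7.2, 6.8]
uncorrelated_test = [3.0, 3.2, 2.8, 3.1, 2.9, 3.1, 2.9, 3.0, 3.2, 2.8]

score_correlated = grouping_score(correlated_test, marked)
score_uncorrelated = grouping_score(uncorrelated_test, marked)
print(f"{score_correlated:.1f} {score_uncorrelated:.1f}")
```

A large score means the marked devices are grouped together in that test (correlated with the contact test); a score near 0 means they are spread randomly through the distribution.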

This indicates why the test failed! During the second test the Mean contact
resistance changed. It might have been a change of the test system or its
environment. EMI (electromagnetic interference) from a machine near the test
equipment could be the problem, or an interference by an operator.
With other tools we would probably have found unstable devices instead of a
test problem. A misleading result can be very expensive. It might stop the
production of the devices or cancel the delivery. Besides the loss of money,
it can cause confusion in the market and wrong activities in a company.
Engineers can also use an S/1S plot to control the quality of purchased parts
very well. Therefore a careful evaluation of test data can save money. More
important in some cases: it can help to save lives, when carefully designed
and tested devices are delivered in safety systems of cars, airplanes and
other transportation systems.