Burn-in testing, in which integrated circuit chips are tested at elevated temperature and voltage, is used to improve product reliability by screening out chips that pass all their regular tests but might fail during customer use. Burn-in of high-performance chips such as microprocessors requires increasingly expensive power and temperature control to avoid destroying good chips. Burn-in testing is also becoming less effective in newer technologies. Together, these problems mean that burn-in can now cost more than regular chip testing for high-performance products. Burn-in is already too expensive for many products, even though those products have high reliability requirements.
One way to improve chip reliability without burn-in is to discard chips that are "different" from the rest. These chips work correctly, but some of their measurements have abnormal values. Such "black sheep" chips are known to statisticians as "outliers". The problem is challenging because past experience shows that most of the discarded chips would have worked just fine for the customer, so discarding them represents lost profit to the manufacturer. The research goal is to find techniques that more accurately identify the outliers that would fail for the customer.
There are two research challenges. The first is to identify a set of measurements that are good at detecting outliers. The second is to determine when a measurement is so abnormal that a chip should be discarded. Historically, the standby power consumed by a chip has worked well for detecting outliers. However, it is becoming less effective in new technologies. Newer ideas consider other measurements, such as a chip's behavior under abnormally low operating voltages, or how much a chip differs from its neighbors on the semiconductor wafer. This research will evaluate these and other ideas using chip data from IBM, LSI Logic, and Texas Instruments to identify approaches that work well in practice. The primary merit of the research is that it will develop techniques to reduce chip manufacturing costs and improve the reliability of chips shipped to customers.
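To make the two challenges concrete, the following is a minimal sketch of one possible neighbor-based screen. It is not the method this research proposes or evaluates. It assumes a wafer map stored as a dictionary from hypothetical (x, y) die coordinates to a standby-power measurement, compares each die with the median of its adjacent dies, and flags dies whose residual has a large modified z-score (a robust score based on the median absolute deviation). The function names, the 8-die neighborhood, and the 3.5 cutoff are all illustrative assumptions.

```python
from statistics import median


def neighbor_residuals(wafer):
    """Return each die's measurement minus the median of its adjacent dies.

    `wafer` maps hypothetical (x, y) die coordinates to a measurement.
    Dies with no neighbors present in the map are skipped.
    """
    residuals = {}
    for (x, y), value in wafer.items():
        neighbors = [
            wafer[(x + dx, y + dy)]
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0) and (x + dx, y + dy) in wafer
        ]
        if neighbors:
            residuals[(x, y)] = value - median(neighbors)
    return residuals


def flag_outliers(residuals, threshold=3.5):
    """Flag dies whose residual has a modified z-score above `threshold`.

    The cutoff of 3.5 is an assumed, illustrative value; in practice it
    would be tuned against the cost of discarding good chips.
    """
    values = list(residuals.values())
    if not values:
        return set()
    center = median(values)
    # Median absolute deviation: a spread estimate that the outliers
    # themselves barely influence.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return set()  # no measurable spread, so nothing can be ranked
    return {
        die
        for die, r in residuals.items()
        if 0.6745 * abs(r - center) / mad > threshold
    }


# Synthetic 5x5 wafer with small die-to-die variation and one abnormal die.
wafer = {(x, y): 1.0 + 0.01 * ((3 * x + 2 * y) % 5)
         for x in range(5) for y in range(5)}
wafer[(2, 2)] = 5.0  # die with abnormally high standby power
flagged = flag_outliers(neighbor_residuals(wafer))
```

The threshold is where the cost trade-off described above appears: lowering it discards more chips that would have worked for the customer, while raising it lets more potentially unreliable chips ship.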