# White Paper Intelligent Outlier Removal: A Cost-effective Way to Improve Device Quality (and Yield)

## Introduction

Outlier removal techniques such as Part Average Testing (PAT) are growing in popularity as a means to improve the quality and reliability of semiconductors used in automotive and high-volume applications like mobile handsets. The goal of PAT is to remove "outliers", that is, otherwise good parts that exhibit characteristics that are atypical when compared to parts from the same wafer or lot. However, if PAT limits are set incorrectly, the net result will be unacceptable yield loss, and hence significantly higher device cost. This paper describes an intelligent approach to PAT and other DPM-reduction techniques that improves device quality while minimizing unnecessary yield loss, and can even lead to reduced cost.

#### What is PAT?

Part Average Testing (PAT) is a test technique that has been around for more than a decade, used primarily by makers of automotive and medical ICs to improve the quality and reliability of devices used in air bag sensors, engine controls, pacemakers and other high-reliability applications. Documented by the Automotive Electronics Council in the AEC-Q001 guidelines, PAT is based on the premise that "outliers are evil". That is, parts that have atypical characteristics tend to be higher contributors to infant mortality and long-term reliability problems, even when they pass all data sheet specs. Industry studies have shown that implementing PAT can reduce DPM (Defects Per Million) rates by 30-60%<sup>1</sup>.

Figure 1. Typical Gaussian distribution for a leakage test with parametric outlier circled

Parts are considered outliers when some of their parametric test results are skewed significantly from other devices in the same wafer or lot, or when they are located in regions of the wafer that have an unusually high number of failing die (i.e. a "bad neighborhood").
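As a minimal sketch of the parametric case, the snippet below flags parts whose reading falls outside limits derived from the same population. It assumes one commonly used robust convention (median plus or minus six "robust sigma", with robust sigma estimated as the interquartile range divided by 1.35); the function names, the `k` parameter and the leakage readings are hypothetical and are not taken from this paper.

```python
import statistics

def robust_limits(values, k=6.0):
    """Return (low, high) limits as median +/- k * robust sigma.

    Assumption: robust sigma = IQR / 1.35, because the interquartile
    range of a normal distribution spans about 1.35 standard deviations.
    """
    q1, median, q3 = statistics.quantiles(values, n=4)
    robust_sigma = (q3 - q1) / 1.35
    return median - k * robust_sigma, median + k * robust_sigma

def find_outliers(values, k=6.0):
    """Return the indices of readings that fall outside the robust limits."""
    low, high = robust_limits(values, k)
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical pin-leakage readings (nA) for ten parts from one lot.
# The last part may still pass its data sheet limit, yet it is atypical.
leakage_na = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.4, 9.7, 10.0, 14.5]
outliers = find_outliers(leakage_na)
print(outliers)
```

Using the median and interquartile range rather than the mean and standard deviation keeps the outlier itself from widening the limits that are meant to catch it.
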
PAT limits are typically applied to a subset of the parametric tests that have been shown to correlate with early failure mechanisms. These tests include but are not limited to pin leakage, standby power supply current, output breakdown voltage and DC parametrics (e.g. $I_{IH}$, $I_{IL}$, $V_{OH}$, $V_{OL}$, etc.).

## **Not Just for Automotive Anymore**

It is easy to see why automotive OEMs and medical systems manufacturers were early adopters of PAT: the amount of electronics in vehicles and medical devices has skyrocketed, and the cost and inconvenience of failure can be dramatic. But what about ICs used today in mobile devices like smartphones and tablets? Increasingly these devices are becoming mission critical to our daily lives. Nearly a billion smartphones and 270 million tablets were shipped in 2013, and a failure rate of just 0.1% would represent more than 1 million unhappy customers. For the IC manufacturers who are designed into these products, an excessive failure rate could mean the loss of the next-generation socket and a huge financial impact. So it's no wonder that the industry leaders in the mobile market are starting to demand some of the same quality standards as in automotive, and IC manufacturers are now starting to deploy techniques like PAT and GDBN<sup>2</sup> (Good Die in a Bad Neighborhood) at their semiconductor test operations and subcontractors.

Figure 2. Makers of mobile devices are now demanding increased levels of quality and reliability, which requires enhanced test techniques like PAT.

#### **PAT at Wafer Sort vs. Final Test**

The first question that must be answered by test and product engineers is where to adopt PAT in the test process: at wafer sort, final test, or both.
PAT at wafer sort has a number of inherent advantages:

- Outlier identification is performed after all the die on a wafer are tested, so the binning algorithms have the benefit of all the test data for the most accurate PAT binning
- Spatial techniques like GDBN<sup>2</sup> and NNR<sup>3</sup> can be employed since the die locations are known
- The cost of any incremental fallout is minimal since it involves just the cost of the uncut/unpackaged die
- Outlier analysis can provide valuable feedback for the wafer fabrication process
- No changes to the test program are required

However, PAT at final test also has some advantages:

- For parts that are "blind-build" it is the only way to perform PAT
- It removes outliers that are introduced by the assembly process, especially for devices that incorporate multiple die or advanced packaging
- There is often a broader range of tests to choose from when applying PAT rules

Given the relative advantages of both approaches, it is not unusual for new adopters of PAT to deploy the technique at both wafer sort and final test, and closely monitor the results to determine the best long-term strategy for each class of devices.

# Why Dynamic PAT (DPAT)?

Some device makers have implemented a simple form of outlier removal known as static PAT, or SPAT, as a first step towards a more rigorous test process. Static PAT limits are defined based on historical data points, typically from at least six lots with at least 30 parts from each lot. Limits are then periodically adjusted to adapt to shifts in the process and test setup. The strength of SPAT is that it is relatively easy to implement using home-grown tools; however, it almost always leads to excessive yield loss or missed outliers, since the limits are based on samples that may not resemble the current lot of devices.

Dynamic PAT limits are computed based on the current population (wafer or lot) under test.
Hence, the reference population is from the same group as the part being tested, and real outliers can be easily identified without excessive yield loss.

For DPAT at wafer sort, the PAT limits are computed after all the die on the wafer have been tested. PAT binning then occurs as a post-processing step.

For DPAT at final test, the limits are initially computed based on a sample of the first *n* devices and then readjusted after every *m* parts are tested. In this way the limits adapt to subtle shifts in the tester/load board/socket that would otherwise falsely incriminate good parts. DPAT limits for multi-site tests must be calculated independently for each site to adjust for site-to-site offsets. At final test, the PAT binning is applied on-the-fly as each part is tested, so there is no need to re-sort parts after testing completes.

## **Catching Hidden Outliers with More Intelligent Algorithms**

While DPAT at wafer sort has the benefit of seeing all the test results before it calculates the PAT limits, it can still miss some outliers that are hidden in the normal variability of die across the wafer. Parametric variations in different areas of the wafer are normal; for example, the die near the center of the wafer may have parametric results that are somewhat different from those on the edge. However, if a die in one area of the wafer has a test result that is significantly different from those of its most immediate neighbors, it is most likely indicative of a defect, and the die should be binned out.

Fortunately, there are some sophisticated algorithms that can help identify the outliers that simple rules might miss. One such algorithm for wafer sort is called NNR, which stands for Nearest Neighbor Residual. Using NNR, the PAT limits for each die are determined based on the distribution of test values of the die surrounding it. This shrinks the context for each test to a small region of the wafer so that a local outlier stands out.

Figure 3.
NNR exposes hidden outliers by comparing parametric results to other die in the same area of the wafer

Another sophisticated approach that identifies a different class of hidden outliers is called "multi-variate" PAT<sup>4</sup>. This technique looks for variations across groups of similar/correlated tests instead of just looking at each test individually. For example, if a part has five leakage tests that are closely correlated, and four are normal but one is on the high side, the part might be considered an outlier, again because the technique uses contextual data to make a more accurate decision.

# **PAT ROI: Turning Yield Loss into Yield Improvement**

In highly competitive markets like automotive and mobile, OEMs are reluctant to pay more for devices with enhanced quality based on testing processes like PAT. So if PAT leads to an incremental yield loss, typically between 0.1% and 1%, the IC supplier will usually have to absorb that cost. So how do you justify the incremental cost?

One way is to say that it is simply the cost of doing business with high-volume and/or high-quality manufacturers, and a 1% cost increase is a lot better than losing a contract. Another argument is that reducing the number of field returns also reduces the amount of failure analysis required, which can be quite expensive and time consuming.

However, the most enlightened users of PAT, GDBN and other similar techniques have discovered that by analyzing the outlier data and feeding back their findings to the manufacturing process they can actually improve overall yield by better centering the fab process and/or eliminating assembly-related defects. So, for example, instead of seeing yield drop from 90% to 89% after adding PAT binning, they may see yields improve to 92% or more after incorporating PAT-driven process improvements.
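The yield figures in this example can be turned into a rough cost comparison. The sketch below is illustrative only: the wafer cost and die count are hypothetical placeholders, and the model ignores test time, packaging, and failure-analysis savings.

```python
def cost_per_good_die(wafer_cost, die_per_wafer, yield_fraction):
    """Spread the wafer cost over the die that pass (simplified cost model)."""
    return wafer_cost / (die_per_wafer * yield_fraction)

WAFER_COST = 3000.0    # hypothetical cost of one processed wafer
DIE_PER_WAFER = 1000   # hypothetical gross die per wafer

baseline = cost_per_good_die(WAFER_COST, DIE_PER_WAFER, 0.90)        # before PAT
with_pat = cost_per_good_die(WAFER_COST, DIE_PER_WAFER, 0.89)        # PAT binning only
after_feedback = cost_per_good_die(WAFER_COST, DIE_PER_WAFER, 0.92)  # PAT plus process fixes

for label, cost in [("PAT binning only", with_pat), ("PAT plus process feedback", after_feedback)]:
    change = (cost / baseline - 1.0) * 100.0
    print(f"{label}: {change:+.1f}% cost per good die versus baseline")
```

Under these assumptions, the one-point yield loss raises the cost per good die by about 1.1%, while reaching 92% yield lowers it by about 2.2% relative to the 90% baseline. The percentages depend only on the yield ratios, not on the placeholder wafer cost or die count.
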
## Galaxy PAT-Man: Intelligent Outlier Removal

Galaxy is an industry leader in the area of intelligent outlier detection and removal, with more than a dozen deployments of its PAT-Man™ solution at major semiconductor companies around the world. PAT-Man offers the only commercial solution in the industry that supports PAT at both wafer sort and final test in a comprehensive automated system. The "intelligence" of the Galaxy solution helps minimize excessive yield loss in the outlier removal process, and provides essential data to product engineers to help them both minimize the occurrence of outliers and improve overall yield.

Galaxy's intelligent outlier detection solution includes a number of industry firsts and advanced features, including:

- First auto-adaptive algorithms that adjust for Gaussian and non-Gaussian distributions on-the-fly
- First multi-variate PAT rules that can automatically identify groups of correlated tests and catch outliers that are missed by other algorithms
- First site-specific limits at probe and automatic limit tuning at final test to account for tester/socket/prober variation and drifts
- Compound recipes that combine parametric and geographic rules such as DPAT, GDBN, NNR<sup>3</sup>, and maverick wafer/lot detection like SYA
- Yield alarms that alert operators when outlier detection exceeds predicted levels
- Extensive outlier reports that help product engineers improve overall yield by feeding back outlier analysis to the manufacturing process

Based on a proven architecture and intuitive user interface, PAT-Man™ integrates easily with your existing production test environment and provides the fastest and most cost-effective solution for DPM reduction in the industry. Galaxy engineers can help you integrate PAT-Man with your MES (Manufacturing Execution System) to create a fully automated system for outlier detection and removal.
Galaxy's comprehensive solution is ideal for companies that are implementing PAT for the first time, as well as those that need a cost-effective alternative to in-house tools.

## Summary

In summary, the automotive and medical industries have done a lot to show the benefits of outlier removal in improving IC quality and reliability. Other sectors of the semiconductor industry, including mobile and high-volume consumer devices, are now following suit as quality becomes a key competitive criterion. Galaxy offers both the platform and the expertise to help you implement intelligent outlier removal in your test process today while minimizing yield loss. Learn why semiconductor companies like Analog Devices, Allegro Microsystems and many others rely on Galaxy to help meet the electronics industry quality/reliability mandate.

### **Notes**

- <sup>1</sup> R. Madge, M. Rehani, K. Cota, and W.R. Daasch (LSI Logic). "Statistical Post-Processing at Wafer Sort – An Alternative to Burn-In and a Manufacturable Solution to Test Limit Setting for Sub-Micron Technologies." Proceedings IEEE VLSI Test Symposium (VTS), pp. 69–74, May 2002.
- <sup>2</sup> GDBN: Good Die in a Bad Neighborhood is an outlier removal technique used at wafer sort to identify die that pass all tests but are located in an area of the wafer where most die have failing bin codes.
- <sup>3</sup> NNR: Nearest Neighbor Residual is a technique often used with IDDQ tests in which the PAT limits for each die are determined based on the distribution of test values of the die surrounding it. This shrinks the context for each test to a small region of the wafer so that a local outlier stands out.
- <sup>4</sup> Multi-variate PAT: A statistical technique that looks for variations across groups of similar/correlated tests instead of just looking at each test individually.
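The NNR idea described in note 3 can be sketched in a few lines. This is an illustrative assumption about one way to compute a neighbor residual, not PAT-Man's algorithm: each die's value is compared with the median of its adjacent die, and the residuals are then screened with robust limits (median plus or minus six times IQR/1.35). The function names, the 5x5 wafer map and its values are hypothetical.

```python
import statistics

def outside_robust_limits(values_by_die, k=6.0):
    """Return the die whose value lies outside median +/- k robust sigma."""
    q1, median, q3 = statistics.quantiles(list(values_by_die.values()), n=4)
    robust_sigma = (q3 - q1) / 1.35  # IQR of a normal distribution is about 1.35 sigma
    low, high = median - k * robust_sigma, median + k * robust_sigma
    return [die for die, v in values_by_die.items() if v < low or v > high]

def nnr_residuals(wafer):
    """Map each die to its value minus the median of its adjacent die."""
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    residuals = {}
    for (x, y), value in wafer.items():
        neighbors = [wafer[(x + dx, y + dy)] for dx, dy in offsets
                     if (x + dx, y + dy) in wafer]
        if neighbors:  # skip isolated die that have no tested neighbors
            residuals[(x, y)] = value - statistics.median(neighbors)
    return residuals

# Hypothetical 5x5 wafer map: the IDDQ reading rises by one unit per column
# (a normal across-wafer gradient), and the die at (2, 2) reads 6 units
# above what its position predicts.
wafer = {(x, y): 10 + x for x in range(5) for y in range(5)}
wafer[(2, 2)] = 18

wafer_wide_outliers = outside_robust_limits(wafer)
nnr_outliers = outside_robust_limits(nnr_residuals(wafer))
print(wafer_wide_outliers, nnr_outliers)
```

With this data the wafer-wide limits flag nothing, because the gradient itself widens them, while the residual-based limits flag only the die at (2, 2).
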