A common pattern shows up in the state of product engineering at some companies: the developers all blame the purchasing and process departments for lax control of components, which they say leads to a low pass rate for circuit boards and, worse, to products that frequently break down at customer sites. They cite plenty of examples from the literature and speeches by experts to support their case, and they hope I will say something along the same lines to put pressure on the materials and production departments. In the end I disappointed them.
My three conclusions all put the responsibility on research and development, and I gave them to the R&D engineers directly:
1. R&D already holds a strong position in the company, so there is no need for me to add the last straw that breaks the camel’s back;
2. Product reliability is inversely proportional to how strong R&D’s position in the company is;
3. Incorrect circuit board design and inappropriate use of components account for 80% of failure causes.
Five basic examples illustrate the last point:
1) An electrolytic capacitor is soldered right next to a heat sink, so the parameters of the circuit around that capacitor tend to drift, which in turn makes the parameters of the whole machine unstable.
2) The shade of the green LEDs varies from unit to unit, so the product looks bad. Every LED has a wavelength specification, and even among green LEDs a subtle difference in wavelength produces a visible color difference, yet the design files do not specify the LED wavelength.
3) When part of a circuit does not work properly, an inductor on a PCB signal line is simply swapped for a ferrite bead. The change then goes into the BOM, and boards carrying the ferrite bead are mass produced. In fact, a ferrite bead behaves like a frequency-dependent resistance that dissipates energy, whereas an inductor stores energy, absorbing it at the peaks and releasing it in the valleys. Even if the swap appears to work in practice, the real mechanism behind the fault has never been worked out.
4) Heat dissipation: thermal design seems to concern only the temperature inside the machine, but a critical issue gets ignored, namely temperature coefficients. Even if the temperature is not high and is nowhere near the point of burning anything out, will the temperature rise cause parameters to drift, and will the drifted values push the component’s characteristics to the edge of the range in which the circuit still works?
5) Derating: almost every engineer says, “We derated, by about 50% basically, so there is plenty of margin and there is definitely no problem.” But when derating, were all the parameters that need derating actually brought into the safe range? For components with the same function but a different package or manufacturing process, does the same derating factor have the same effect? And for components in particular positions or in special circuits, is it clear which specific parameters should be derated further?
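The derating question in example 5 can be made concrete with a small check that looks at every stressed parameter rather than voltage alone. This is only a minimal sketch: the `Stress` class, the capacitor values, and the allowed ratios are hypothetical numbers chosen for illustration, not figures from any datasheet or derating standard.

```python
from dataclasses import dataclass


@dataclass
class Stress:
    name: str         # parameter being derated
    applied: float    # worst-case value seen in the circuit
    rated: float      # maximum rating (hypothetical datasheet value)
    max_ratio: float  # allowed applied/rated ratio (hypothetical derating factor)


def derating_violations(stresses):
    """Return the names of parameters whose stress ratio exceeds the allowed ratio."""
    return [s.name for s in stresses if s.applied / s.rated > s.max_ratio]


# Hypothetical electrolytic capacitor: voltage is derated to about 50%,
# but ripple current was never checked against its own limit.
cap = [
    Stress("voltage_V", applied=12.0, rated=25.0, max_ratio=0.5),
    Stress("ripple_current_A", applied=0.9, rated=1.0, max_ratio=0.7),
]

print(derating_violations(cap))  # ['ripple_current_A']
```

In this made-up case the familiar "we derated by 50%" holds for voltage, yet the part is still overstressed on a parameter nobody looked at.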
There are also problems with electromagnetic compatibility, vibration, maintainability, testing, and so on. To change this situation, the training course ‘Reliability of Circuit and Selection of Components’ was developed to give front-line product development engineers, quality engineers, technical managers and test engineers a concrete way of thinking along with practical knowledge and skills.
How do this way of thinking and this knowledge relate to our actual work? Let me explain point by point.
Principles of electronic reliability design include: the definition of RAMS and its evaluation indicators, reliability models of electronic equipment, factors affecting the system failure rate, reliability indicators for electronic products, determination of working environment conditions, system design and detailed design, process review and test, and design specifications and technical standards.
Generally speaking, design principles are hard to connect directly with actual design work. This part is mainly about the methodology behind the technology. You all know Qian Xuesen, right? What was his strength: electronics, mechanics, software, testing, or management? None of these! It was systematic methodology and engineering calculation. When we need to select a circuit component, if a basic formula tells us directly which specifications to pay attention to, will component selection and circuit design still be difficult?
For example, suppose a socket cable must carry 10 A. Is it better to split the current across two 8 A wires in parallel, or to use a single cable rated for 14 A? A reliability model gives the answer easily. And when driving an LED, is it better to use a transistor or an operational amplifier?
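One way to frame the cable question is with the basic series and parallel reliability formulas. The sketch below is an illustration under stated assumptions, not the author's own model: every conductor gets the same constant failure rate (the value `1e-6` per hour and the 10,000-hour mission time are made up), connector failures are ignored, and the effect of load ratio on failure rate (5 A on an 8 A wire versus 10 A on a 14 A cable) is left out, which could change the comparison.

```python
import math


def reliability(failure_rate_per_hour, hours):
    """Survival probability under a constant failure rate (exponential model)."""
    return math.exp(-failure_rate_per_hour * hours)


def series(*parts):
    """All parts must survive for the system to work."""
    result = 1.0
    for r in parts:
        result *= r
    return result


def parallel(*parts):
    """The system works as long as at least one part survives."""
    all_fail = 1.0
    for r in parts:
        all_fail *= 1.0 - r
    return 1.0 - all_fail


# Hypothetical numbers: same failure rate for every conductor.
r = reliability(1e-6, 10_000)

# Neither 8 A wire can carry 10 A alone, so losing one wire overloads the
# other: for reliability purposes the pair behaves like a series system.
two_8a_wires = series(r, r)
one_14a_cable = r

# True redundancy would need each wire rated for the full 10 A.
two_full_rated_wires = parallel(r, r)

print(two_8a_wires, one_14a_cable, two_full_rated_wires)
```

Under these assumptions the two 8 A wires are less reliable than the single 14 A cable, because wires that are electrically in parallel are only redundant when each can carry the full load by itself.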
I went to Qingdao some time ago and visited the Tsingtao Beer Museum. I found German motors and Japanese fans there that were made a century ago and are still working. This is amazing. The factors affecting the system failure rate explain why. Nowadays neither Germany, nor Japan, nor we can make motors and fans that run for a century.
If you want to improve electronic reliability, which specific issues should you start with? These are problems that can be solved with theoretical methods and engineering calculations. Mr. Qian has passed away, but his wisdom and thinking must be passed on. What I can do is spread his ideas, in the hope that more people take them up and that they become more widely understood and applied.
The circuit reliability design specifications cover derating design (derating parameters and derating factors), thermal design (thermal calculation, thermal testing and selection of cooling components), circuit safety design, EMC design, PCB design (layout and routing, grounding, impedance matching, and fabrication), usability design (usability factors, user operation analysis, and design guidelines), and maintainability design (maintainability rating, evaluation and design methods).
The core idea of circuit reliability design specifications is to monitor the process rather than the result. The most familiar analogy is prenatal care: a design specification looks after the pregnancy in order to guarantee a healthy baby. The specifications are conclusions drawn from past experience; if you follow these design methods, the reliability risks are ruled out.
Take thermal design as an example. If the cooling method is chosen by calculating thermal power density and heat flux density, you need not worry that the cooling will be inadequate. And if you select the fan and heat sink from calculations of thermal resistance and junction temperature, then as long as there is sufficient margin you need not worry about being ‘a blind man riding a blind horse toward a deep pool at midnight’.
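The thermal resistance and junction temperature calculation mentioned above can be sketched with the standard steady-state model, in which junction temperature equals ambient temperature plus dissipated power times the sum of the thermal resistances from junction to air. All the numbers below (10 W, 50 °C ambient, the resistance values, and the 100 °C derated junction limit) are hypothetical, and the function names are chosen only for this illustration.

```python
import math


def junction_temp(t_ambient_c, power_w, r_jc, r_cs, r_sa):
    """Steady-state junction temperature in deg C.

    r_jc, r_cs, r_sa are the junction-to-case, case-to-sink and
    sink-to-ambient thermal resistances in deg C per watt.
    """
    return t_ambient_c + power_w * (r_jc + r_cs + r_sa)


def max_sink_resistance(t_junction_limit_c, t_ambient_c, power_w, r_jc, r_cs):
    """Largest heat sink thermal resistance that keeps the junction at or below the limit."""
    return (t_junction_limit_c - t_ambient_c) / power_w - r_jc - r_cs


# Hypothetical power device: 10 W dissipation, 50 deg C worst-case ambient,
# junction target derated to 100 deg C.
limit = max_sink_resistance(100.0, 50.0, 10.0, r_jc=1.5, r_cs=0.5)
tj = junction_temp(50.0, 10.0, r_jc=1.5, r_cs=0.5, r_sa=2.5)

print(limit)  # 3.0 deg C/W: any heat sink at or below this meets the target
print(tj)     # 95.0 deg C with a 2.5 deg C/W heat sink
```

Choosing a 2.5 °C/W heat sink against a 3.0 °C/W limit leaves a margin that is known rather than guessed, which is the point of doing the calculation before picking the part.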
PCB grounding seems to be at once the simplest and the most complicated problem. Is there a widely accepted approach to grounding that lets us stop worrying? The answer is “Yes”.
Usability seems to matter little to us. It is like going to an interview: the keys to success seem to be academic certificates, work experience and so on, so could a piece of leek leaf stuck on a front tooth lead to failure? The color, size, feel, required pressing force, shape and layout of the buttons, and the content, method, angle and size of the display: how are these different from that piece of leek leaf? For users, as the common saying goes, “the interface is the system”. Users do not know the advanced theory or the internal structure; they only care that what is inside works well, and beyond that they care only about appearance. For new users in particular, appearance is the primary factor in deciding whether to buy. You know, in university, beautiful girls always have more pursuers.
Maintainability directly determines cost. It can be divided into three levels: site level, office level, and headquarters level. The value of maintenance tools, the quantity of supporting tools, the skill level and number of maintenance personnel, and the stock of spare parts all differ from level to level. Consider a unit whose maintenance is defined as site level but whose cover can only be moved by three people: how many maintenance staff have to go on that trip? Or a unit whose maintenance is defined as office level but whose repair requires spectrum analyzers, logic analyzers, oscilloscopes and other high-end instruments: what will that cost? And that is before counting the other facilities and equipment.
Reliability testing includes standards compliance tests, boundary and extreme condition tests, fault-tolerance tests, HALT, destructive tests, hidden condition tests, and interface condition tests. The many technicians I have talked with all want to do reliability design well, but they generally report two problems: first, a lack of experience, and second, that no problems can be found in-house yet problems appear on site. The lack of experience can be addressed with the methods of the second part, and the methods for the testing problem are in this section. The core of testing is test case design, which focuses on two things: reproducing the worst conditions at user sites as closely as possible, and targeting likely failure mechanisms by deliberately adding stress factors to provoke problems, so that weak points are found and improved. Note, however, that many tests are destructive to some degree, which has to be assessed, and a machine that has undergone destructive testing must never leave the factory.
Component selection covers basic principles, the classification of component families, characteristics, specifications, reliability and safety notes for application, and so on. The components include: capacitors, resistors, diodes and transistors, connectors, crystal oscillators, optoelectronic devices (optocouplers, LEDs), AD/DA converters and op-amps, electromechanical devices and energy conversion devices (switching power supplies, power conversion ICs, transformers), digital ICs, protective devices (fuses, ferrite rings and ferrite beads, varistors, TVS diodes), power modules, and so on.
A saying is popular among girls: “marrying well is more practical than working hard”. Although both sides argue fiercely online, there is one truth that no one can deny: in the end, the share of girls who marry well is much higher than the share who do well at work. Doing well at work corresponds to designing a good circuit, and marrying well corresponds to selecting components well. Take capacitors: what is the difference between tantalum and aluminum electrolytics, and between electrolytic and ceramic chip capacitors? What is the difference between wire-wound and film resistors? Which specifications of a digital IC deserve the most attention, and what criteria apply when selecting protective devices? As we all know, it is big trouble when the bodyguards and security themselves go bad.
We need to understand the manufacturing process and the characteristics that result from it, and work around them in applications. For example, a wire-wound resistor has a large inductance; a paper capacitor has a large leakage current; a ceramic capacitor has poor tolerance of rapid temperature change and of vibration; a TVS diode withstands only a small surge current but responds quickly; and the effect of a ferrite ring depends on the material and the equipment, on vibration resistance, and so on.
Component failure mechanisms and analysis methods cover common failure mechanisms, analysis methods and tools. Everything above is about preventing circuits from malfunctioning and components from breaking, but no one is wise at all times. When something does break, we should treat it as a valuable find rather than steer clear of it. Every driver knows where the best place to practice driving is. The highway? No, it is the city, where the traffic is bad. Society develops by finding problems and solving them. Problems are not frightening; what is awful is letting the same kind of problem occur over and over again.
Component failure mechanism analysis rests on one basic improvement method: prevention based on the failure mechanism. Once a problem is detected, identify the factors that cause it and how to avoid them, then write this into a specification that everyone follows when designing, so that the same problem does not recur.
Take ESD protection, which many companies practice. One of the methods is humidification, which can cause moisture sensitive device (MSD) problems: if, after wave or reflow soldering, an I/V curve test shows that some pins of some components are open-circuit to VCC and GND, an MSD problem should be suspected. The fix is to bake the parts for several hours before soldering to drive the moisture out. Or, when a device burns out, we need to check and test which pin has failed and how the failure presents. Using a multimeter, an I/V curve tracer, an oscilloscope and X-ray, we find the failure mechanism, then follow the clues back to the circuit connected to that pin, and analyze that circuit and the in-factory manufacturing process, until we find the point that triggers the failure mechanism and can make the improvement.
The micro-management methods for improving reliability design are quite simple and have three parts: software, AAR, and checklists. Logically, technical content should not be mixed with management, but management can contribute to technology. For example, if someone in the company has mastered a piece of knowledge but no one else knows it, management measures can encourage him to share it and help others put it into practice, which amounts to a non-technical solution to a technical problem.
This part aims to remove the obstacles to carrying out reliability work. Obstacle one is that people easily become lazy. They find that the effort of consulting the guidance documents is high, so they plunge into design work on blind faith. One role of design software is to lower this barrier to technical communication. Obstacle two is a lack of professional experience and expertise. We should hold After Action Review (AAR) activities, summarize the causes, symptoms, solutions and so on, and share this information through the software. As long as we keep working hard and improving, there is no need to fear a lack of experience; growing fast is itself a means of solving technical problems. Obstacle three is that any one person’s view is limited, so it is reasonable to have design work cross-checked so that what was missed is exposed. A checklist is relatively systematic: it serves designers as a self-check tool and gives other experts a reference for finding omissions. It is also good learning material for beginners and a useful reference for advanced designers.
Translated by NexPCB from an Internet blog post
Posted by NexPCB United