MOSFETs are well known to be prone to inexplicable failures [MOSFET is sometimes thought to mean ‘Magically Obliterated, Smoke and Fire Emitting Transistor’]. The truth is that MOSFETs are incredibly robust, but they fail very fast indeed if any of their ratings are exceeded. A few of these ratings are very difficult to get sensible information on, and these can cause problems. This page is a start at trying to explain some of the main MOSFET failure mechanisms and how to prevent them.
If you just want the short version of some practical steps to reduce electrical noise see here.
MOSFET failure modes
It is difficult to be certain of exactly what caused any one failure: failures are difficult to provoke in any well designed controller, and customers are not usually aware of exactly what happened to cause the failure. Furthermore, once a MOSFET fails it is dud and will not work properly, so it promptly goes into another failure mode, obliterating the original evidence. The examples here should be treated as helpful examples only: don’t assume that, because your MOSFET looks just like a particular example, that is what caused the failure.
Here are some of the failure modes or causes that we know of:
- Avalanche failure
- dV/dt failure (Motor brush noise)
- Excess power dissipation
- Excess Current
- ‘Foreign’ objects.
- Jammed (or blocked) motor
- Rapid acceleration/deceleration
- Short-circuited load
- Dud battery
Avalanche failure
If the maximum operating voltage of a MOSFET is exceeded, it goes into avalanche breakdown. This is not necessarily destructive. The MOSFET specifications will state a maximum energy the MOSFET can take in avalanche mode. The energy is E = ½LI², where L is the inductance and I is the current. Fortunately, in most circuits, the energy the MOSFET may have to clamp is that contained in the rather small (lumped) inductance of the battery and its leads. See the article on PWM controllers in the 4QD TEC archives.
If the energy contained in the transient over-voltage is above the rated Avalanche energy level, then the MOSFET will fail. The device fails short circuit, initially, with no externally visible signs.
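As a rough illustration of the energy check described above, the sketch below computes E = ½LI² for a lumped lead inductance and compares it with a single-pulse avalanche rating. The inductance, current and rating used here are made-up illustrative values, not figures from any data sheet or from 4QD; take the real avalanche rating from your MOSFET's data sheet.

```python
def stored_energy_joules(inductance_h, current_a):
    """Energy stored in an inductance L carrying current I: E = 0.5 * L * I^2."""
    return 0.5 * inductance_h * current_a ** 2


def within_avalanche_rating(inductance_h, current_a, rated_energy_j):
    """True if the energy the MOSFET must clamp does not exceed its rating."""
    return stored_energy_joules(inductance_h, current_a) <= rated_energy_j


# Hypothetical values, for illustration only.
LEAD_INDUCTANCE_H = 2e-6   # assumed 2 uH of battery and lead inductance
LOAD_CURRENT_A = 100.0     # assumed current flowing when the path is interrupted
RATED_AVALANCHE_J = 0.2    # assumed single-pulse avalanche rating (200 mJ)

energy_j = stored_energy_joules(LEAD_INDUCTANCE_H, LOAD_CURRENT_A)
print(f"Energy to clamp: {energy_j * 1000:.1f} mJ")
print("Within rating:",
      within_avalanche_rating(LEAD_INDUCTANCE_H, LOAD_CURRENT_A, RATED_AVALANCHE_J))
```

Because the energy goes with the square of the current, doubling the current quadruples the energy that has to be clamped, while doubling the inductance only doubles it.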
The problem with this failure mode is that, once it occurs, there is likely to be a chain reaction which will probably disintegrate the MOSFET, obliterating the evidence and probably blowing other devices to boot. So it’s of vital importance to report exactly what events occurred at the point of failure.
The controllers in normal use are generally incapable of generating spikes of enough energy to blow them. So the necessary high energy spikes are usually generated by external events. These can be things such as:
- Contactors or relays switching
- Fuses blowing
To prevent such failure, you need to understand not only how transients are generated, but also how they may travel from generation point to the controller. This is a complex subject, but our page on Machine wiring: Good and Bad practises should give you a head start.
dV/dt failure (motor brush noise)
This effect is probably the least understood and most mysterious of all MOSFET failures. It is also probably the biggest cause of all those otherwise inexplicable failures that let out the “Magic Smoke and Fire”! It is also one of the hardest failures to study: it is an extremely high-speed failure, so it requires very expensive transient capture equipment. The good news is that, as MOSFET technology improves, it seems to be getting rarer than it once was!
It is also a failure mode which is probably more common on industrial control systems. These tend to be wired for neatness and appearance, so wires tend to be longer than is ideal and routing tends to be bad. There are also sources of noise other than the motor, such as relays and contactors. See the page on machine wiring.
The cause of this failure is a very high voltage, very fast transient spike (which may be positive or negative going). If such a spike gets onto the drain of a MOSFET, it gets coupled through the MOSFET’s internal capacitance to the gate. If enough energy gets coupled, the voltage on the gate rises above the maximum allowable level and the MOSFET dies instantaneously. The process takes less than a nanosecond! The initial spike destroys the gate-body insulation, so that the gate is connected to the body. Once that has happened, the MOSFET explodes in a cloud of flame and black smoke. We have one documented case where the battery wire worked loose, causing a spark. It must have been this that caused the gate breakdown, for the explosion of flame and smoke did not happen until the battery wire was re-connected some time later! This demonstrates how very difficult cause and effect can be to connect!
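To get a feel for how a drain spike can reach the gate, the sketch below treats the MOSFET's drain-gate and gate-source capacitances as a simple capacitive divider. This is a deliberately crude, worst-case model: it assumes the gate is effectively floating at the spike's frequency (real gate drivers and gate zeners will pull the figure down), and the capacitances, spike amplitude and gate rating are hypothetical values, not data for any particular device.

```python
def coupled_gate_voltage(spike_v, c_gd_f, c_gs_f):
    """Gate voltage from a drain spike through an ideal capacitive divider.

    Assumes a floating gate: V_gate = V_spike * Cgd / (Cgd + Cgs).
    """
    return spike_v * c_gd_f / (c_gd_f + c_gs_f)


# Hypothetical values, for illustration only.
C_GD_F = 100e-12        # assumed drain-gate capacitance (100 pF)
C_GS_F = 1000e-12       # assumed gate-source capacitance (1000 pF)
MAX_GATE_V = 20.0       # assumed maximum gate-source voltage rating

for spike_v in (50.0, 150.0, 300.0):
    gate_v = coupled_gate_voltage(spike_v, C_GD_F, C_GS_F)
    verdict = "exceeds" if gate_v > MAX_GATE_V else "is within"
    print(f"{spike_v:.0f} V spike -> {gate_v:.1f} V on the gate, "
          f"which {verdict} the assumed rating")
```

With these assumed numbers, roughly one eleventh of the spike appears on the gate, so only a spike of a couple of hundred volts or more would exceed the assumed rating. That is consistent with the point above that it takes a very high voltage, very fast transient to do the damage.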
So where can such a spike come from? Noise. Noise is generated by an arc – Marconi used an arc and a tuned circuit to first transmit a radio signal across the Atlantic ocean. Arcs are very good generators of wide-band (random) noise. Random noise from an arc has a statistical probability of containing an energy spike of just the right parameters to blow a MOSFET. Whatever you do – there is still a statistical probability, but you can reduce it to a low level.
Motor commutators and brush gear are arc generators: look at the brush gear on any motor and you will almost certainly see it arcing.
Motors are probably at their noisiest when regenerating.
However much noise the motor actually generates, for it to cause damage the wiring has to be such that it can transmit a very fast (i.e. very high frequency) transient. Properly designed wiring will not do this but bad wiring can – if you are unlucky – act as a transmission line delivering the whole energy of the transient back to the controller.
Do not over-react here. Statistics are such that a properly designed motor controller can go on working continuously for years without such a transient occurring. With production machines, the maker is using new, good condition motors. Noise is much less likely here.
But maybe it’s your controller that’s next? Especially if (like many of our retail customers) you are using a second hand motor, which is much more likely to be worn and noisy. Knowledge of what happens can help you reduce the probability.
Causes and prevention of motor noise
Since dV/dt failure is generally caused by noise generated by the motor brush gear, what faults and effects in the motor cause noise, and how can you reduce the problem?
The following problems cause or make motor noise worse:
- Worn brushes and commutator
  - If brush pressure is low, arcing will be greatly increased. Make sure the motor commutator and brushes are in good condition.
- Dirt, especially metallic dirt
  - Dirt can get between commutator and brush, causing arcing. Metallic particles and swarf are especially harmful as they can short out segments of the commutator, causing very bad arcing.
- Blocking the motor
  - A blocked motor bounces and causes the brushes to behave unpredictably. This is a common problem in fighting robots.
- Over-revving the motor
  - Commutators are mechanical switching devices. Depending on the motor’s design, there is a maximum frequency at which the design can switch. If you over-rev the motor (by applying an excess voltage to it) you can exceed the maximum switching frequency and a plasma field can be set up, short-circuiting the armature by way of an arc. This will not do the motor armature much good and can easily destroy the controller.
Prevention – good things to do are:
- Take care during assembly. Motors are magnetic and can attract swarf and metal dust. If this, or any abrasive dirt (e.g. fibreglass dust), gets into the motor, it will cause wear and arcing. Seal any motor vents while assembling the machine.
- Take care of the motor and general maintenance. A well cared for and maintained system is always going to be more reliable. In particular, keep the motor and electrics clean and dry and make sure the motor brush gear and armature are not worn. Un-lubricated metal gears can wear and generate metallic dust; take steps to keep this out of the motor.
- Make sure your motor has a suppression capacitor fitted. A small ceramic capacitor, 10nF / 100V, should be fitted internally across the motor brushes. If your motor does not already have one, fit one externally across the motor connections as near to the motor as possible. There’s more information on this subject in the page Radio Controlled Machines: General wiring hints.
- Twist the motor leads together if possible [this stops them acting like a loop aerial].
- Twist the battery leads together if possible.
- Fit ferrite rings to the motor leads.
- If your motor is subject to shock loads or fast acceleration / deceleration [such as in Robot Wars] consider fitting a fast acting varistor transient suppressor across the motor terminals.
- If your system is part of an automotive installation that includes relays, fans, or other motors that can produce noise spikes, then consider fitting a transient suppressor across the B+ / B- terminals. This Littelfuse page gives more information on these.
- Keep the motor clean: dust and metallic particles around the brushes will increase wear, sparking, and electrical noise.
- If you are operating in a really electrically hostile environment then consider painting the inside of the case with nickel paint such as this from Farnell.
- See this article on good wiring practice.
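To see why the small 10nF suppression capacitor recommended in the list above deals with brush noise without loading the motor drive, the sketch below evaluates the reactance of an ideal capacitor, Xc = 1/(2πfC), at a few frequencies. The frequencies are illustrative assumptions (a PWM-like rate and two noise frequencies), and the ideal-capacitor model ignores lead inductance and series resistance, which is one reason for fitting the capacitor as close to the brushes as possible.

```python
import math


def capacitive_reactance_ohms(frequency_hz, capacitance_f):
    """Reactance of an ideal capacitor: Xc = 1 / (2 * pi * f * C)."""
    return 1.0 / (2.0 * math.pi * frequency_hz * capacitance_f)


SUPPRESSION_CAP_F = 10e-9  # the 10nF value suggested in the list above

# Illustrative frequencies (assumptions, not measurements).
for frequency_hz in (20e3, 1e6, 100e6):
    x_c = capacitive_reactance_ohms(frequency_hz, SUPPRESSION_CAP_F)
    print(f"{frequency_hz / 1e6:g} MHz: {x_c:.2f} ohms")
```

Under these assumptions the capacitor looks like several hundred ohms at 20 kHz, so it barely affects the drive, but well under one ohm at 100 MHz, so it shunts very fast transients at the motor before they reach the wiring.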
Here’s a picture of the protection fitted to one of our test rigs.
A typical dV/dt failure
A typical dV/dt failure is shown above. Note the black sooty deposit where the MOSFET has ‘flamed out’ in a flash of flame and sooty smoke. You can see the erupted epoxy of the MOSFET. This controller was returned to us with the statement “I had an overload on the motor”. However, it looks exactly like arc damage and it was probably caused when the motor lead was pulled off the motor terminal. There is clearly visible melting of the motor terminal at bottom right, which can only have been done by the arc as the terminal was pulled off, presumably in response to the motor getting jammed. It was this disconnection that caused the failure rather than the stalled motor.
Excess power dissipation
Exactly what happens depends on how excessive the power is. It may be a sustained cooking. In this case, the MOSFET gets hot enough to literally unsolder itself. Much of the MOSFET heating at high currents is in the leads, which can quite easily unsolder themselves without the MOSFET failing! If the heat is generated in the chip, then it will get hot, but its maximum temperature is usually restricted not by the silicon but by the fabrication. The silicon chip is bonded to the substrate by soft solder and it is quite easy to melt this and have it ooze between the epoxy and the metal insert of the body, forming solder droplets. The MOSFET can easily be working after this, but of course its thermal performance is shot as the soft solder bond is damaged.
Excess current
Yes, if you put too much current through a MOSFET, it will fail. Exactly how it fails will depend on how high the excess current is, for how long it flows, and on the exact circumstances at the time.
All controllers made by 4QD have a fast-acting current limit: this turns the speed down (or up if it’s excess regen braking current) so that the MOSFET currents are always well within the devices’ safe handling ability.
Power dissipation due to current is I²R: the current times the current times the resistance. The heat dissipated is the power times the time, so I²R·t, where t is the time.
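The short sketch below works through the I²R and I²R·t relationships for a single MOSFET. The on-resistance and currents are hypothetical round numbers chosen for easy arithmetic, not figures for any particular device; a real on-resistance also rises as the device heats up, which this sketch ignores.

```python
def conduction_power_watts(current_a, on_resistance_ohm):
    """Power dissipated in a resistance R carrying current I: P = I^2 * R."""
    return current_a ** 2 * on_resistance_ohm


def heat_energy_joules(current_a, on_resistance_ohm, seconds):
    """Heat dissipated over a time t at constant current: E = I^2 * R * t."""
    return conduction_power_watts(current_a, on_resistance_ohm) * seconds


# Hypothetical values, for illustration only.
ON_RESISTANCE_OHM = 0.01   # assumed 10 milliohm on-resistance

for current_a in (30.0, 60.0):
    power_w = conduction_power_watts(current_a, ON_RESISTANCE_OHM)
    energy_j = heat_energy_joules(current_a, ON_RESISTANCE_OHM, 10.0)
    print(f"{current_a:.0f} A: {power_w:.0f} W, {energy_j:.0f} J of heat in 10 s")
```

The point to take away is the square law: doubling the current from 30 A to 60 A quadruples both the power and the heat that has to be removed.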
If you slightly overload the MOSFET, it will get very hot. If you don’t remove the heat, the MOSFET will, quite literally, melt. At 60 amps, the leads on a TO220 (the commonest MOSFET housing) will literally unsolder themselves, though the current needed for this depends on how long the leads are and how big an area of track they are soldered to. 4QD boards all have extra thick copper to act as a heatsink for the MOSFET leads.
Then if you really put too much current through, the internal bond wires (which carry current from the external leads to the chip) fuse in a flash and explode – probably forcing a chunk of epoxy into space at high speed. Cratered MOSFETs are not uncommon, but it’s difficult to tell if this is from bond wire explosion or the chip has exploded – both seem to occur pretty much in unison.
Foreign object failure
The circuit of a controller cannot take account of the effects of water, dirt, metal filings, stray nuts and bolts etc. Any such extraneous material is likely to cause malfunction and/or death of the controller.
You should therefore house the controller to prevent such occurrences. Damage caused by this is not covered under warranty.
Jammed (or blocked) motor
Blocking a motor is suddenly jamming it by means of a mechanical seizure or failure such that a rotating motor is very suddenly stopped. Robot Wars contestants will be very familiar with this.
Of course, you cannot bring a mechanical load, such as a rotating motor, to a sudden halt. Even if you throw a crowbar suddenly in the gears, much more happens than a sudden stop! There will be bounce in the system and the armature will certainly bounce. Probably the brushes will rock in their holders – there is always some clearance!
A sudden increase in the electrical load, as would be caused by a straightforward, non-bouncy seizure, will simply engage the controller’s current limit. Yes, the controller will quickly get hot, but you should have time to turn down the speed.
Any failure caused by blocking is likely to come because of the armature bounce: this will (of course) be at high current and will be accompanied by arcing at the commutator, so it will generate lots of electrical noise. See dV/dt failure. Because this noise occurs at full current limit, it will likely be of high energy, so dangerous. Much depends on the motor, brush and commutator and the mechanics as well as the wiring.
Rotating weapons in Robot Wars will be particularly susceptible to this. The main aim is to have the rotor spinning as fast as possible [at full power] and then to transfer that energy instantly to the victim. To protect against the sort of transient that will be generated here it is probably worth fitting a fast acting varistor transient absorber across the motor terminals as well as the capacitors and ferrites mentioned elsewhere. Littelfuse are a typical supplier of these.
If you’ve read the typical dV/dt failure, above, you will also realise that the worst thing you can do in the event of a sudden jam is to pull off a motor lead! Turn down the speed, turn off the ignition, or if you must, disconnect the battery lead. Never disconnect the motor lead!
If you are making a machine which has mechanical travel limits – you have, of course, got electrical ‘end stops’ which slow the motor and stop it before it hits any mechanical limit…
Rapid acceleration/deceleration
If failure from blocking the motor occurs because of armature bounce, it must also be dangerous to apply too fast an acceleration to a motor. Any mechanical system has a response time. If you try to accelerate the system faster than this response time, you are ‘shocking’ it into a state where there may well be a ‘bounce’. This is one of the reasons why a controller always has an acceleration and deceleration ramp: for smooth take-up the power must always be applied to a mechanical system slightly slower than the system can respond. Apply power faster than the system’s response time and you are, in effect, shock exciting it! However, in most applications, the controller’s current limit will engage if the acceleration is too fast, and this will apply an effective ramp. So we’ve never seen a failure that we would care to attribute to this fault!
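The idea of an acceleration and deceleration ramp can be sketched as a simple rate limiter: on each control tick the output moves towards the demanded speed by no more than a fixed step. This is only an illustration of the principle, not how any 4QD controller implements its ramps; the function names, the 0 to 1 demand scale and the step size are all assumptions made for the example.

```python
def ramp_step(current_output, demand, max_step):
    """Move the output towards the demand by at most max_step per tick."""
    delta = demand - current_output
    if delta > max_step:
        delta = max_step        # limit acceleration
    elif delta < -max_step:
        delta = -max_step       # limit deceleration
    return current_output + delta


def ramp_profile(demand, start=0.0, max_step=0.25, ticks=6):
    """Output values over a number of ticks after a step change in demand."""
    output = start
    profile = []
    for _ in range(ticks):
        output = ramp_step(output, demand, max_step)
        profile.append(output)
    return profile


# A sudden full-speed demand is turned into a gradual rise.
print(ramp_profile(1.0))
```

A smaller `max_step` gives a gentler ramp. In practice the step would be chosen so that the output changes slightly slower than the mechanical system can respond, and acceleration and deceleration would often have separate settings.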
Dud battery
If the battery voltage ever falls too low, the controller’s internal supply voltage may fail and the switching may get confused. Of course controllers are designed not to do this under conceivable and testable low voltage conditions.
However, batteries can sometimes fail in unpredictable ways. We have seen ones with faulty cells that go open circuit above a particular current. Of course, the current then falls to zero (as the cell is open-circuit), the cell recovers and the current rises again, so the cell starts to oscillate.
This sort of unpredictable battery fault is – unpredictable. So how to predict and test that it won’t damage the controller?
So if you have problems, always get your battery properly tested at high discharge current. It should be able to supply more current than the controller’s motor current limit, without showing distress.
Short-circuited load
If the load is short-circuited, the current will rise and the current limit will engage, so immediate failure will be prevented. However, we do not guarantee the controllers are safe against short circuits, for if the short is sustained and is ‘too short’, failure can eventually occur.
The current limit engages after about 2 µs. During these two microseconds, the MOSFET is switching on and ‘feeling’ the load. It is a period of extreme dissipation for the MOSFET. The MOSFET can survive this stress quite happily, but it gets extremely hot. If the load resistance is too low, the MOSFET’s insides will get hot enough that the heat cannot get out quickly enough, and the soft solder used inside the package to bond it together will melt and ooze out between the base of the MOSFET and the insulator (you can usually see it on the insulator afterwards). The MOSFET will then fail.
The time to failure is entirely dependent on the severity of the short-circuit, but is quite long enough for a human to react (30 seconds to several minutes). However, the current and voltage conditions in the MOSFET are entirely dependent on the wiring (both motor and battery) as the motor is shorted out, so the time is completely unpredictable.
Other components that will be affected
MOSFETs rarely fail alone. Other components that should be checked are (in decreasing order of frequency):
- Loside 10R gate resistors. If they have blown then check other loside components:
  - Gate zener
  - PNP driver transistor
  - NPN driver transistor
- Hiside 10R gate resistors. If they have blown then check other hiside components:
  - Gate zener
  - PNP driver transistor
  - NPN driver transistor
This is beyond most people’s abilities, but further details are on the page NCC series MOSFET gate drive waveforms and testing.
If you’ve read all this you’ll be tempted to examine your next MOSFET failure and state, “that’s failed from excess dV/dt”. But it really isn’t that simple. Even if you were to return the damaged MOSFETs to the manufacturer, they could not examine them and state the cause of failure. When I quote a reason for failure, I’m combining several sets of information:
- What the customer has told me about the exact circumstances at and just prior to the failure.
- The physical state of the returned MOSFETs.
- What other components have gone as a result of the failure (or were they the cause?).
- Our experience of having seen many years’ worth of failures.
As 4QD have learned more about failures, we’ve altered our circuits appropriately. As we’ve done that, the failure rate has reduced. It would be nice to think this was simple cause and effect. However it’s not at all simple to see a correlation between our changes and the failure rates. In part that’s because the customers are getting more knowledgeable about controllers, but in part also it has to be that MOSFETs themselves are getting more reliable with time, as you would expect from any high-tech product. But we are always loath to blame a dud MOSFET as the cause of any failure!