Why Do MOSFETs Fail?

Why do MOSFETs fail? Even with the best design, the best components, and a new motor this can occur – often for seemingly inexplicable reasons. Indeed the term MOSFET came to stand for ‘Magically Obliterated, Smoke and Fire Emitting Transistor’. The truth is that MOSFETs are incredibly robust – but they can fail very fast indeed if their ratings are exceeded. This page is a start at trying to explain some of the main MOSFET failure mechanisms and how to prevent them.

But the bottom line is that to give your controller a long and happy life you need to take electrical noise suppression seriously. Fortunately it’s not rocket science, and there are some simple practical steps you can take to reduce electrical noise and protect your controller.

MOSFET failure modes

It can be difficult to be certain of exactly what caused any one failure: the problem is that failures are difficult to provoke in any well designed controller, and users are not often aware of exactly what happened to cause the failure. Furthermore, once a MOSFET fails it is a dud and will not work properly, so it promptly goes into another failure mode, obliterating the original evidence. The examples here should be treated as helpful examples only – don’t assume that, because your MOSFET looks just like a particular example, that is what caused the failure.

Here are some of the failure modes or causes that we know of:

  • Avalanche failure
  • dV/dt failure (Motor brush noise)
  • Excess power dissipation
  • Excess Current
  • ‘Foreign’ objects.
  • Jammed (or blocked) motor
  • Rapid acceleration/deceleration
  • Short-circuited load
  • Defective battery

Avalanche failure

If the maximum operating voltage of a MOSFET is exceeded, it goes into Avalanche breakdown. This is not necessarily destructive. The MOSFET specifications will state a maximum energy the MOSFET can take in avalanche mode. Energy is E = ½LI², where L is the inductance and I is the current. Fortunately, in most circuits, the energy the MOSFET may have to clamp is that contained in the rather small (lumped) inductance of the battery and its leads. See the article on PWM controllers in the 4QD TEC archives.
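As a rough illustration of the energy formula above, the sketch below works out the energy stored in the lead inductance and compares it with an avalanche rating. The inductance, current and rated energy are made-up example values, not figures from any data sheet or from a 4QD controller.

```python
def inductive_energy_joules(inductance_h, current_a):
    """Energy stored in an inductance: E = 1/2 * L * I^2."""
    return 0.5 * inductance_h * current_a ** 2


def within_avalanche_rating(inductance_h, current_a, rated_energy_j):
    """True if the stored energy is below the rated avalanche energy."""
    return inductive_energy_joules(inductance_h, current_a) < rated_energy_j


# Illustrative, assumed values only:
lead_inductance_h = 2e-6   # 2 uH of battery and lead inductance (assumption)
current_a = 100.0          # current flowing when the MOSFET turns off (assumption)
rated_energy_j = 0.2       # avalanche energy rating of 200 mJ (assumption)

energy_j = inductive_energy_joules(lead_inductance_h, current_a)
print(f"Stored energy: {energy_j * 1000:.1f} mJ")
print("Within rating:", within_avalanche_rating(lead_inductance_h, current_a, rated_energy_j))
```

Because the energy rises with the square of the current, doubling the current quadruples the energy the MOSFET has to clamp.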

If the energy contained in the transient over-voltage is above the rated Avalanche energy level, then the MOSFET will fail. The device fails short circuit, initially, with no externally visible signs. The problem with this failure mode is that, once it occurs, there is likely to be a chain reaction which will probably disintegrate the MOSFET, obliterating the evidence and probably blowing other devices to boot. So it’s of vital importance to report exactly what events occurred at the point of failure.

The controllers in normal use are generally incapable of generating spikes of enough energy to blow them. So the necessary high energy spikes are usually generated by external events such as:

  • Contactors or relays switching. If your system is part of an automotive installation that includes relays, fans, or other motors that can produce noise spikes, then consider fitting a transient suppressor across the B+ / B- terminals. This Littelfuse page gives more information on these.
  • Fuses blowing
  • Inductive car horns [Model loco or car builders – be sure to fit catching diodes!]

To prevent such failure, you need to understand not only how transients are generated, but also how they may travel from generation point to the controller. This is a complex subject, but our page on Machine wiring: Good and Bad practises should give you a head start.

One other cause of avalanche failure can be excessive regen braking. If a controller is being used near the top of its voltage range, with a fully charged battery, with the deceleration ramp set to minimum, and with a heavy load, it is possible for the regen voltage to exceed the point at which avalanche breakdown occurs.


dV/dt failure

This effect is probably the least understood and most mysterious of all MOSFET failures. It is also probably the biggest cause of all those otherwise inexplicable failures. It is one of the hardest failures to study as it is a high-speed failure, so requires expensive transient capture equipment. The good news is that, as MOSFET technology improves, it seems to be getting more rare.

It is also a failure mode which is more common on industrial control systems. These tend to be wired for neatness and appearance, so wires tend to be longer than is ideal and routing tends to be bad. There are also sources of noise other than the motor, such as relays and contactors. See the page on machine wiring.

The cause of this failure is a very high voltage, very fast transient spike (positive or negative). If such a spike gets onto the drain of a MOSFET, it gets coupled through the MOSFET’s internal capacitance to the gate. If enough energy gets coupled, the voltage on the gate rises above the maximum allowable level – and the MOSFET dies instantaneously. The process takes less than a nanosecond! The initial spike destroys the gate-body insulation, so that the gate is connected to the body. Once that has happened, the MOSFET explodes in a cloud of flame and black smoke. We have one documented case where the battery wire worked loose, causing a spark. It must have been this that caused the gate breakdown, for the explosion of flame and smoke did not happen until the battery wire was re-connected some time later! This demonstrates how very difficult cause and effect can be to connect!
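To get a feel for the numbers, here is a minimal sketch that treats the coupling as a simple capacitive divider between the drain-gate and gate-source capacitances. This is a worst-case simplification that assumes the spike is too fast for the gate drive to hold the gate down. The capacitances, the spike size and the ±20 V gate limit are assumed, typical-looking values rather than figures for any particular MOSFET.

```python
def coupled_gate_voltage(spike_v, c_gd_f, c_gs_f):
    """Gate voltage from a drain spike via a capacitive divider.

    Assumes the gate is effectively floating for the duration of the
    spike, so the drain-gate capacitance (c_gd_f) and the gate-source
    capacitance (c_gs_f) divide the spike between them.
    """
    return spike_v * c_gd_f / (c_gd_f + c_gs_f)


# Illustrative, assumed values only:
spike_v = 300.0       # fast transient on the drain, volts (assumption)
c_gd_f = 200e-12      # drain-gate capacitance, 200 pF (assumption)
c_gs_f = 2e-9         # gate-source capacitance, 2 nF (assumption)
gate_limit_v = 20.0   # assumed maximum allowable gate voltage

gate_v = coupled_gate_voltage(spike_v, c_gd_f, c_gs_f)
print(f"Coupled gate voltage: {gate_v:.1f} V")
print("Exceeds gate limit:", abs(gate_v) > gate_limit_v)
```

The sketch works for a negative spike too, which is why the comparison uses the magnitude of the gate voltage.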

So where can such a spike come from? Noise. Noise is generated by an arc – Marconi used an arc and a tuned circuit to transmit a radio signal across the Atlantic. Arcs are very good generators of wide-band electrical noise. The noise from an arc has a statistical probability of containing an energy spike of just the right parameters to blow a MOSFET. Whatever you do – there is still a statistical probability, but you can reduce it to a low level.

Motor commutators and brushes are arc generators: look at the brush gear on any motor and you will almost certainly see it arcing. Motors are at their noisiest when regenerating as the resulting voltage can be significantly higher than the supply.

For the noise to cause damage the wiring has to be such that a fast, high frequency transient can get coupled back into the controller. See the section below for steps on how to prevent this.

Do not over-react here. Statistics are such that a properly designed motor controller can go on working continuously for years without such a transient occurring. With production machines, the maker is using new, good condition motors, so noise is much less likely. But maybe it’s your controller that’s next? Especially if you are using a second-hand motor, which is much more likely to be worn and noisy. Knowledge of what happens can help you reduce the probability.


Causes and prevention of motor noise

Since dV/dt failure is generally caused by noise generated by the motor brush gear, we need to take a three-pronged approach:

  1. Stop the noise being produced in the first place.
  2. Suppress any noise that is generated.
  3. Stop anything that gets by 1 & 2 from getting in to the controller.

1. Stopping the noise

Take care of the motor and general maintenance:

  • Keep it clean and dry.
  • Make sure the motor brushes and armature are not worn [worn brushes have a lower spring pressure, which causes a higher resistance].
  • Keep metallic dust and swarf out.
  • Don’t over-rev the motor [you can exceed the maximum switching frequency and a plasma field can be set up, short-circuiting the armature by way of an arc].
  • Don’t suddenly stop the motor: the armature of a blocked motor can bounce and oscillate, which causes the brushes to behave unpredictably. This is a common problem in fighting robots.

2. Suppressing noise

  • Make sure your motor has a suppression capacitor fitted. A small ceramic capacitor 10nF / 100v should be fitted across the motor brushes. If your motor does not already have one, fit one externally across the motor connections as near to the motor as possible. There’s more information on this subject in the page Radio Controlled Machines: General wiring hints
  • If your motor is subject to shock loads or fast acceleration / deceleration [such as in Robot Wars] consider fitting a fast acting varistor transient suppressor across the motor terminals.
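To see why such a small suppression capacitor helps, the sketch below uses the standard formula Z = 1/(2πfC) to show how the impedance of a 10nF capacitor falls as frequency rises. The frequencies are arbitrary illustrative choices, and the model is an ideal capacitor: a real one has lead inductance that limits how well it works at the very top end.

```python
import math


def capacitor_impedance_ohms(capacitance_f, frequency_hz):
    """Magnitude of an ideal capacitor's impedance: Z = 1 / (2*pi*f*C)."""
    return 1.0 / (2.0 * math.pi * frequency_hz * capacitance_f)


suppression_cap_f = 10e-9  # the 10nF capacitor suggested above

# Illustrative frequencies (assumptions): low frequencies up to brush noise
for frequency_hz in (1e3, 100e3, 10e6):
    z = capacitor_impedance_ohms(suppression_cap_f, frequency_hz)
    print(f"{frequency_hz:>12,.0f} Hz: {z:,.2f} ohms")
```

At low frequencies the capacitor is nearly invisible to the motor circuit, but to fast brush noise it looks like a very low impedance across the brushes.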

3. Stopping the noise getting in to the controller

Here’s a picture of the protection fitted to one of our test rigs.

[Image: motor noise suppression]

A typical dV/dt failure

[Image: MOSFET after a typical dV/dt failure]

A typical dV/dt failure is shown above. Note the black sooty deposit where the MOSFET has ‘flamed out’ in a flash of flame and sooty smoke. You can see the erupted epoxy of the MOSFET. This controller was returned to us with the statement “I had an overload on the motor”. However – it looks exactly like arc damage and it was probably caused when the motor lead was pulled off the motor terminal. There is clearly visible melting of the motor terminal at bottom right, which can only have been done by the arc as the terminal was pulled off, presumably in response to the motor getting jammed. It was this disconnection that caused the failure rather than the stalled motor.


Excess power dissipation

Exactly what happens depends on how excessive the power is. It may be a sustained cooking. In this case, the MOSFET gets hot enough to literally unsolder itself. Much of the MOSFET heating at high currents is in the leads – which can quite easily unsolder themselves without the MOSFET failing! If the heat is generated in the chip, then it will get hot – but its maximum temperature is usually not silicon-restricted, but restricted by the fabrication. The silicon chip is bonded to the substrate by soft solder and it is quite easy to melt this and have it ooze between the epoxy and the metal insert of the body, forming solder droplets. The MOSFET can easily be working after this – but of course its thermal performance is shot as the soft solder bond is damaged.


Excess Current

Yes – if you put too much current through a MOSFET – it will fail. Exactly how it fails will depend on how high the excess current is and for how long it flows and on the exact circumstances at the time.

All controllers made by 4QD have a fast-acting current limit: this turns the speed down (or up if it’s excess regen braking current) so that the MOSFET current is always well within their safe handling ability.
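The idea of a current limit that turns the speed down for excess drive current, and up for excess regen braking current, can be sketched in software as below. This is only an illustration of the principle and is not a description of the actual 4QD circuit. The function name, the sign convention (positive current for driving, negative for regen braking), the 0 to 1 demand range and the step size are all assumptions.

```python
def apply_current_limit(demand, motor_current_a, limit_a, step=0.05):
    """Nudge the speed demand so the current magnitude falls back within limit_a.

    demand          -- speed demand, 0.0 (stopped) to 1.0 (full speed)
    motor_current_a -- positive when driving, negative when regen braking
    limit_a         -- maximum allowed current magnitude
    step            -- how far to move the demand on each call (assumption)
    """
    if motor_current_a > limit_a:
        demand -= step   # too much drive current: turn the speed down
    elif motor_current_a < -limit_a:
        demand += step   # too much regen braking current: turn the speed up
    return min(1.0, max(0.0, demand))


# Illustrative values only:
print(apply_current_limit(0.5, 120.0, 100.0))    # driving too hard
print(apply_current_limit(0.5, -120.0, 100.0))   # braking too hard
print(apply_current_limit(0.5, 50.0, 100.0))     # within the limit
```

Called repeatedly, a routine like this keeps backing the demand off until the current is within the limit again.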

Power dissipation due to current is I²R – the current squared times the resistance. But the heat energy dissipated is the power times the time, so I²R·t, where t is the time.
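A short worked example of the power and heat formulas above is sketched here. The 10 milliohm resistance and the 10 second duration are assumed round numbers chosen for illustration, not the ratings of any real MOSFET.

```python
def dissipated_power_w(current_a, resistance_ohm):
    """Power dissipated in a resistance: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


def dissipated_energy_j(current_a, resistance_ohm, time_s):
    """Heat energy dissipated over time: E = I^2 * R * t."""
    return dissipated_power_w(current_a, resistance_ohm) * time_s


resistance_ohm = 0.01   # assumed on-resistance of 10 milliohms
time_s = 10.0           # assumed duration of the overload

for current_a in (30.0, 60.0):
    power = dissipated_power_w(current_a, resistance_ohm)
    energy = dissipated_energy_j(current_a, resistance_ohm, time_s)
    print(f"{current_a:.0f} A: {power:.0f} W, {energy:.0f} J over {time_s:.0f} s")
```

Doubling the current from 30 amps to 60 amps quadruples the power, which is why a modest overload can produce a lot of extra heat.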

If you slightly overload the MOSFET – it will get very hot. If you don’t remove the heat – the MOSFET will, quite literally, melt. At 60 amps, the leads on a TO220 (the commonest MOSFET housing) will literally unsolder themselves, though the current needed for this depends on how long the leads are and how big an area of track they are soldered to. 4QD boards all have extra thick copper to act as a heatsink for the MOSFET leads.

At really high currents the internal bond wires (which carry current from the external leads to the chip) fuse in a flash and explode – probably forcing a chunk of epoxy into space at high speed. Cratered MOSFETs are not uncommon, but it’s difficult to tell if this is from bond wire explosion or the chip has exploded – both seem to occur pretty much in unison.


Foreign object damage

Unfortunately FOD isn’t just limited to jet engines. Nuts, bolts, washers, swarf, and even spanners have all been known to contribute to the death of a controller.


Jammed (or blocked) motor

Blocking a motor is suddenly jamming it by means of a mechanical seizure or failure such that a rotating motor is very suddenly stopped. Robot Wars contestants will be very familiar with this. If you try to bring a mechanical load such as a rotating motor to a sudden halt much more happens than a sudden stop! There will be bounce in the system and the armature will probably oscillate, and the brushes may rock in their holders.

A sudden increase in the electrical load, as would be caused by a straightforward, non-bouncy seizure, will simply engage the controller’s current limit. Yes – the controller will quickly get hot, but you should have time to turn down the speed.

Any failure caused by blocking is likely to come because of the armature bounce: if this is at high current it will be accompanied by arcing at the commutator, and lots of electrical noise. See dV/dt failure. Because this noise occurs at full current limit, it will likely be of high energy, so dangerous. Much depends on the motor, brush and commutator and the mechanics as well as the wiring.

Rotating weapons in Robot Wars will be particularly susceptible to this. The main aim is to have the rotor spinning as fast as possible and then to transfer that energy instantly to the victim. To protect against the sort of transient that will be generated here it is probably worth fitting a fast acting varistor transient absorber across the motor terminals as well as the capacitors and ferrites mentioned elsewhere. Littelfuse are a typical supplier of these.

If you’ve read the typical dV/dt failure, above, you will also realise that the worst thing you can do in the event of a sudden jam is to pull off a motor lead! Turn down the speed, turn off the ignition, or if you must, disconnect the battery lead. Never disconnect the motor lead!

If you are making a machine which has mechanical travel limits – you should fit limit switches which slow the motor and stop it before it hits any mechanical limit.

 

Rapid acceleration/deceleration

If failure from blocking the motor occurs because of armature bounce, it must also be dangerous to apply too fast an acceleration to a motor. Any mechanical system has a response time. If you try to accelerate the system faster than this response time, you are ‘shocking’ it into a state where there may well be a ‘bounce’. This is one of the reasons why a controller always has an acceleration and deceleration ramp: for smooth take-up the power must always be applied to a mechanical system slightly slower than the system can respond. Apply power faster than the system’s response time and you are, in effect, shock exciting it! However – in most applications, the controller’s current limit will engage if the acceleration is too fast, and this will apply an effective ramp. So we’ve never seen a failure that we would care to attribute to this fault!
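The principle of an acceleration and deceleration ramp can be sketched as a simple rate limiter: however fast the demanded speed changes, the output is only allowed to move by a fixed amount per time step. This is a generic illustration and not 4QD’s implementation. The function names, the 0 to 1 demand range and the step size are assumptions.

```python
def ramp_step(current_output, target, max_step):
    """Move current_output towards target by at most max_step."""
    delta = target - current_output
    if delta > max_step:
        delta = max_step      # limit the acceleration
    elif delta < -max_step:
        delta = -max_step     # limit the deceleration
    return current_output + delta


def ramp_profile(start, target, max_step, n_steps):
    """Output values over n_steps after the target changes suddenly."""
    outputs = []
    output = start
    for _ in range(n_steps):
        output = ramp_step(output, target, max_step)
        outputs.append(output)
    return outputs


# The throttle is slammed from stopped (0.0) to full speed (1.0):
print(ramp_profile(0.0, 1.0, 0.25, 6))
```

A smaller step gives a gentler ramp. The aim described above is to choose it so that power is applied slightly slower than the mechanical system can respond.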

 

Defective battery

If the battery voltage falls too low, the controller’s internal power supply may fail and the switching can get confused. Of course controllers are designed not to do this under conceivable and testable low voltage conditions.

However – batteries can sometimes fail in unpredictable ways. We have seen ones with faulty cells that go open circuit above a particular current. The current then falls to zero (as the cell is open-circuit) so the cell starts to oscillate.

This sort of unpredictable battery fault is – unpredictable. So how to predict and test that it won’t damage the controller?

So if you have problems, always get your battery properly tested at high discharge current. It should be able to supply more current than the controller’s motor current limit, without showing distress.
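As a very simplified picture of what a high discharge current test looks for, the sketch below models the battery as an ideal voltage source with a series internal resistance and checks the terminal voltage at a test current. The model, the voltages, the resistances and the minimum acceptable voltage are all assumptions for illustration. A real battery tester measures this rather than calculating it, and an intermittent open-circuit cell of the kind described above will not follow such a simple model.

```python
def terminal_voltage(open_circuit_v, internal_resistance_ohm, load_current_a):
    """Terminal voltage under load: V = Voc - I * R_internal."""
    return open_circuit_v - load_current_a * internal_resistance_ohm


def battery_ok(open_circuit_v, internal_resistance_ohm, test_current_a, min_voltage_v):
    """True if the battery holds at least min_voltage_v at the test current."""
    loaded_v = terminal_voltage(open_circuit_v, internal_resistance_ohm, test_current_a)
    return loaded_v >= min_voltage_v


# Illustrative, assumed values only:
test_current_a = 100.0   # chosen to be above the controller's current limit
min_voltage_v = 18.0     # assumed lowest acceptable voltage for a 24 V system

print("Healthy battery:", battery_ok(24.0, 0.02, test_current_a, min_voltage_v))
print("Tired battery:  ", battery_ok(24.0, 0.15, test_current_a, min_voltage_v))
```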


Short-circuited load

If the load is short-circuited, the current will rise and the current limit will engage, so immediate failure will be prevented. However – we do not guarantee the controllers are safe against short circuits, for if the short is sustained and is ‘too short’ – failure can eventually occur.

The current limit engages after about 2 microseconds. During these two microseconds, the MOSFET is switching on and ‘feeling’ the load. It is a period of extreme dissipation for the MOSFET. The MOSFET can survive this stress quite happily – but it gets extremely hot. If the load resistance is too low, the MOSFET’s insides will get hot enough that the heat cannot get out quickly enough and the soft solder used inside the package to bond it together will melt and ooze out between the base of the MOSFET and the insulator (you can usually see it on the insulator afterwards). The MOSFET will then fail.
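To show why a couple of microseconds can add up to serious heating, here is a rough estimate of the dissipation. It makes the pessimistic assumptions that the MOSFET sees the full supply voltage and the full short-circuit current for the whole 2 microseconds, and that this happens once every switching cycle. The voltage, current and switching frequency are illustrative guesses, not 4QD specifications.

```python
def pulse_energy_j(voltage_v, current_a, duration_s):
    """Energy dissipated in one pulse, assuming constant voltage and current."""
    return voltage_v * current_a * duration_s


def average_power_w(energy_per_pulse_j, switching_frequency_hz):
    """Average power when the same pulse repeats every switching cycle."""
    return energy_per_pulse_j * switching_frequency_hz


# Illustrative, assumed values only:
supply_v = 24.0                    # battery voltage (assumption)
short_circuit_a = 300.0            # current reached before the limit acts (assumption)
limit_delay_s = 2e-6               # the 2 microseconds before the current limit engages
switching_frequency_hz = 20_000.0  # assumed switching frequency

energy = pulse_energy_j(supply_v, short_circuit_a, limit_delay_s)
power = average_power_w(energy, switching_frequency_hz)
print(f"Energy per pulse: {energy * 1000:.1f} mJ")
print(f"Average dissipation: {power:.0f} W")
```

Each pulse is tiny, but repeated thousands of times a second it becomes a heating load that the package may not be able to shed.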

The time to failure is entirely dependent on the severity of the short-circuit, but is quite long enough for a human to react (30 seconds to several minutes). However the current and voltage conditions in the MOSFET are entirely dependent on the wiring (both motor and battery) as the motor is shorted out, so the time is completely unpredictable.


 

Other components that will be affected

MOSFETs rarely fail alone. Other components that should be checked are (in decreasing order of frequency):

  • Loside 10R gate resistors. If they have blown then check other loside components:
    • Gate zener
    • PNP driver transistor
    • NPN driver transistor
  • Hiside 10R gate resistors. If they have blown then check other hiside components:
    • Gate zener
    • PNP driver transistor
    • NPN driver transistor

This is beyond most people’s abilities but further details are on the page NCC series MOSFET gate drive waveforms and testing.


Afterword

If you’ve read all this you’ll be tempted to examine your next MOSFET failure and state, “that’s failed from excess dV/dt”. But it really isn’t that simple. Even if you were to return the damaged MOSFETs to the manufacturer, they could not examine them and state the cause of failure. When I quote a reason for failure, I’m combining several sets of information:

  • What the customer has told me about the exact circumstances at and just prior to the failure.
  • The physical state of the returned MOSFETs.
  • What other components have gone as a result of the failure (or are they the cause?).
  • Our experience of having seen many years’ worth of failures.

As 4QD have learned more about failures, we’ve altered our circuits appropriately. As we’ve done that, the failure rate has reduced. It would be nice to think this was simple cause and effect. However it’s not at all simple to see a correlation between our changes and the failure rates. In part that’s because the customers are getting more knowledgeable about controllers, but in part also it has to be that MOSFETs themselves are getting more reliable with time, as you would expect from any high-tech product. But we are always loath to blame a dud MOSFET as the cause of any failure!
