
Avoiding failures
There's more to it than just design
When OSHA was formed in the early 1970s, one of its first targets was
metal-forming machines. Prior to OSHA-mandated design changes, these machines
had a tendency to cycle without operator input. As a result there were
numerous injuries. The solution was a redesign of the control system that
would prevent inadvertent cycling of press brakes, shears, and other
metalworking machinery.
At the time, I was a consulting engineer specializing in designing process
machinery and systems, and I benefited from the new regulation. We designed
a control logic that absolutely prevented the problem. It consisted of
a special pneumatic control valve that would not cycle without operator
input.
Each time the operator triggered the machine, it would complete one
cycle and could not be recycled until another input was received. To further
protect the operator, we incorporated a two-hand control input system that
prevented the operator from circumventing the control logic.
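The single-cycle, two-hand logic described here can be sketched in software terms. The original was a pneumatic valve circuit, not a program, so the Python below is only an illustration; the class name `TwoHandControl` and its `update` method are hypothetical and are not details of the installed design.

```python
class TwoHandControl:
    """Single-cycle, two-hand trigger logic (illustrative sketch only)."""

    def __init__(self):
        # Armed means a new cycle may be started. The control re-arms
        # only after both palm buttons have been released.
        self.armed = True

    def update(self, left_pressed: bool, right_pressed: bool) -> bool:
        """Return True exactly when a new machine cycle should start."""
        if not left_pressed and not right_pressed:
            # Both buttons are released: allow the next cycle.
            self.armed = True
            return False
        if left_pressed and right_pressed and self.armed:
            # Both buttons held and the control is armed: fire once.
            self.armed = False
            return True
        # One button only, or buttons still held after a cycle.
        return False


control = TwoHandControl()
print(control.update(True, False))  # one button only: no cycle
print(control.update(True, True))   # both buttons: cycle starts
print(control.update(True, True))   # still held: no repeat cycle
```

Logic of this kind sees only the state of the two buttons. It has no way to confirm that two separate hands are pressing them.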
The two palm-button, 2-way valves were enclosed in guards and located
shoulder-width apart. There was no way that the operator could have his
or her hands near the work area when triggering the machine. We designed
a fail-safe solution to the problem. Or did we?
After installing a number of these systems, we visited the site to inspect
our handiwork. To our amazement, the operators had found a way to circumvent
our design. They had attached drink cans to a wooden rod and were using
this device to operate the machine with one hand--one-handing the
machine. As a result, they could increase production speed, but they always
had one hand in the work area of the machine. This mode of operation nullified
a major part of the operator protection that the design was to ensure.
There are very few things that are impossible, but designing a fail-safe
machine or process system is close. As an engineer who has spent most of
his thirty-plus-year career designing machinery, I am convinced that operators
and maintenance craftsmen can circumvent any fail-safe device or design.
As an example, we were asked to investigate a catastrophic fan failure
in an integrated steel mill. The fan, installed in an electrostatic precipitator
system, had a built-in supervisory system that monitored the vibration
level generated by its rotating element. The supervisory system had the
ability to trip the fan's motor whenever the vibration level exceeded pre-set
limits.
When we arrived on-site, the impact of the fan failure was evident.
All fourteen blades, each seven feet long, had broken off of the fan's
hub. One fan blade flew through the sidewall and roof of an adjacent building.
Others were found several hundred feet from the fan location. The surprising
part was that the fan's motor was still turning.
Why did the supervisory system fail? Our analysis found that the fan
had a history of trips generated by excessive vibration. To eliminate these
trips, the maintenance crew turned off the supervisory system.
It is not possible to build a totally fail-safe machine or process system,
but we can design one that minimizes the probability of catastrophic failure.
However, there are several factors that must exist before a design can
be considered near fail-safe.
The first requirement of a fail-safe system is proper installation. Each
machine has specific installation requirements that must be followed completely
to maintain its fail-safe qualities. Simple deviations, such as pipe strain
on a pump casing, eliminate the machine's reliability and fail-safe qualities.
The second requirement is that the machine must always be operated within
its designed operating envelope. Every machine or process system is designed
to provide a specific range of functions. A part of this design is an operating
envelope that defines the specific methods that must be used to operate
the machine or system.
Normally, the operating envelope is bounded by a specific range of incoming
product variables, the range of work performed within the machine or system,
and its final output. When this operating envelope is violated, fail-safe
elements designed into the machine are voided.
A few years ago we were asked to solve a chronic boiler-tube rupture
problem in a 1,000 megawatt, supercritical electric power generating plant.
During our initial meeting with the plant's manager we found the solution
to their problem. Early in the meeting, the manager bragged about their
ability to take the plant from cold shutdown to full on-line three times
faster than the vendor's recommended ramp rate. The thermal shock caused
by the excessive ramp rate was the sole reason for the chronic tube problem.
If the plant had adhered to the vendor's operating envelope, the design
would have provided fail-safe operation over the entire design life of the boiler.
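One way to make an operating envelope explicit is to compare the actual startup ramp rate with the vendor's recommended limit. The sketch below is an illustration under stated assumptions: the function name `ramp_rate_ratio`, the choice of megawatts per hour as the unit, and the example numbers are hypothetical and are not taken from the plant or its boiler vendor.

```python
def ramp_rate_ratio(start_load_mw: float, end_load_mw: float,
                    elapsed_hours: float,
                    vendor_limit_mw_per_hour: float) -> float:
    """Return the actual ramp rate divided by the vendor's limit.

    A result above 1.0 means the startup violated the envelope.
    """
    if elapsed_hours <= 0 or vendor_limit_mw_per_hour <= 0:
        raise ValueError("elapsed time and limit must be positive")
    actual_rate = (end_load_mw - start_load_mw) / elapsed_hours
    return actual_rate / vendor_limit_mw_per_hour


# Hypothetical numbers: a 1,000 MW unit with an assumed vendor limit
# of 100 MW per hour, brought from cold to full load in 10/3 hours.
ratio = ramp_rate_ratio(0.0, 1000.0, 10.0 / 3.0, 100.0)
print(f"Ramp rate is {ratio:.1f} times the vendor limit")
```

A check like this only reports the violation. Whether anyone acts on it is an operating decision, not a design feature.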
The third requirement for fail-safe systems is proper maintenance. Machines
must be maintained to assure safe operation. The machine design cannot
compensate for components that are permitted to degrade, for out-of-adjustment
conditions, or for improper repairs. No matter how good the initial design
or the type of protection system, a machine that is in less than optimum operating
condition is prone to failure.
The use of supervisory systems is a good adjunct to fail-safe machine
design. These systems incorporate sensors and control logic designed to
detect deviations from normal operating dynamics. Should these deviations
exceed pre-selected limits, the supervisory system has the ability to shut down
the system before failure can occur. Design logic for these supervisory
systems assumes that the machine will be operated within its normal operating
envelope. Should the operator violate this envelope, reset the trip limit,
or turn off the system, the machine is no longer fail-safe.
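The trip logic of a supervisory system, and the way a disabled or reset system quietly removes the protection, can be illustrated with a short sketch. Everything in it is hypothetical: the `VibrationMonitor` class, its field names, and the limit values are assumptions chosen for illustration, not any vendor's actual system.

```python
from dataclasses import dataclass


@dataclass
class VibrationMonitor:
    design_trip_limit: float  # limit set by the machine designer
    trip_limit: float         # limit currently in effect
    enabled: bool = True      # False if someone turned the system off

    def should_trip(self, vibration: float) -> bool:
        """Trip only if the monitor is enabled and the limit is exceeded."""
        return self.enabled and vibration > self.trip_limit

    def is_protecting(self) -> bool:
        """True only while the monitor is on and the limit is not raised."""
        return self.enabled and self.trip_limit <= self.design_trip_limit


# Assumed values in arbitrary vibration units.
monitor = VibrationMonitor(design_trip_limit=0.3, trip_limit=0.3)
print(monitor.should_trip(0.5))   # monitor on: trip requested

monitor.enabled = False           # crew switches the system off
print(monitor.should_trip(0.5))   # same vibration, no trip
print(monitor.is_protecting())    # protection is gone
```

The point of the `is_protecting` check is that a bypass or a raised limit should be visible as a condition in its own right, rather than showing up only as the absence of trips.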
Few supervisory systems have the ability to prevent instantaneous failures.
Should the operator create a fast transient by exceeding the design ramp rate
or by another practice that causes a radical, instantaneous load or speed
change in the machine or system, the supervisory system cannot react fast
enough to prevent catastrophic failure. Fortunately, most failures result
from long-term changes in machine condition and can be prevented by this
type of system.
Some supervisory systems are more effective than others. For example,
the supervisory systems that are commonly installed on bullgear compressors
do not monitor all the potential failure modes. These systems have proximity
probes installed to monitor the vibration level of the pinion shafts of
the compressor's impellers.
Since these impellers are turning at speeds ranging from 12,000 to 60,000
rpm and tend to have excessive axial movement, the use of proximity probes
is questionable.
Both the speed and end play of these pinion shafts may distort the vibration
levels recorded by the supervisory system. In addition, these systems rarely
monitor the bullgear shaft or the lubrication system. Therefore, failure
of either of these two systems will not be detected in time to prevent
catastrophic failure.
With the advancements in design and machine technology, it is possible
to design a machine that is capable of providing long-term, fail-safe
operation, but these advancements are worthless unless we install, operate,
and maintain these machines properly.
Copyright May 1998 Plant Services on the WEB