Nigel's Eyes

20211227 The other 9/11: lessons in compliance and risk management

This article was first published by Complinet / Thomson Reuters on 29 April 2012.

Shortly after the beginning of the working day on September 11 an aircraft crashed, killing all on board. But this was not September 11, 2001; it was 10 years earlier and arguably should have had a much greater effect on compliance and risk management than the events of 2001. This and other cases demonstrate the often under-emphasised difference between external (i.e., regulatory or legal) compliance and internal (i.e., policies and procedures) compliance.

Continental Express Flight 2574

A Britt Airways/Continental Express flight (operating as Jetlink 2574) crashed when the Embraer EMB 120 Brasilia, registered N33701, broke up over Eagle Lake, Texas, en route from Laredo to Houston, both in Texas. The aircraft suffered an “in-flight structural breakup” on its second flight that day. For the short flight, all was normal: all radio communication with the various air traffic controls (ATC) was clear and precise. The aircraft went exactly where ATC told it to go. Then, suddenly, it was no longer on the radar and radio contact was lost.

While setting up for approach, the pilots had joked that they were able to arrive early by adopting a steeper and more rapid approach than usual. They would, the co-pilot said, be “pushing this descent, making like the space shuttle”. The official report found, however, that the pilots did not get close to the aircraft’s operating envelope (a term taken from early test-pilots who talked of “pushing the envelope”).

One minute later, the cockpit voice recorder picked up a “sound similar to objects flying about the cockpit” and during the last two minutes of flight, the recorder picked up no vocal sounds except one which, according to the official report into the crash, was “comparable to a human grunt”. All other sounds were automated systems warnings. Black box data showed a rapid and steep descent with negative G (i.e., less than standard gravity) of 3.5 and after the initial fall, an average of 3.375. The aircraft pitched, rolled and yawed violently, had sudden changes in direction and the engines behaved beyond expectations.

The U.S. National Transportation Safety Board (NTSB) concluded that the reason there were no other vocal sounds was that whatever had caused the event was so severe and sudden that everyone on board had probably blacked out. In short, for those last two minutes, the aircraft was flying itself, or not. Both flight crew were experienced. The captain had more than 2,400 hours on the EMB120. The first officer, despite being significantly older, had over 1,000 hours on the EMB120 and more than 11,000 hours in total. He was, in fact, a qualified captain. Nothing in the flight recorders suggested pilot error. Everything pointed to a catastrophic event entirely unrelated to anything that happened in the cockpit.

Eyewitnesses said that the aircraft was flying normally until it was “suddenly consumed by a fireball”, from which the wing-tips and part of the tail protruded. They described the engines revving and sputtering “followed by three pops”, a flat spin to the left, the right wing missing, the left wing dangling and more. The impact was severe and the NTSB classified the event as “not survivable”.

Initial questions arose as to whether there may have been an explosion, perhaps terrorist-related. This was, after all, less than three years after the bombing of Pan Am flight 103 over Scotland, so inevitably this was considered. The truth, once it was discovered, was much more prosaic and much more disturbing in many ways. The loss of the aircraft and 14 lives was due to simple failures in compliance and risk management systems and controls and, ironically, was the result of one person deciding to help his colleagues out because they were under pressure and he had some time on his hands.

The EMB120 has a tailplane shaped like a letter T with the rear wings (called horizontal stabilisers) at the top of the T. That part of the aircraft was found some 650 feet from the main wreckage. To protect the leading edge of the wing and to provide de-icing airflow during flight, there is a “boot”, a strip of protective material, on the front edge of the stabilisers. These are low-tech components that must be replaced as part of scheduled maintenance due to wear. They are held on by screws. When the tail was found, investigators noticed that the left side boot was missing and that the locating holes under the stabiliser into which the boot was screwed were “stressed”. They did not find similar stresses in the locating holes on the top of the stabiliser.

The story was astonishingly simple. During routine maintenance the evening shift was slightly behind with its work. Waiting for them to complete their work was an inspector. As the maintenance crew began to unscrew the boot from the right stabiliser, he hopped onto the people lift and climbed onto the top of the right stabiliser, where he unscrewed the top of the boot, carefully putting all the screws into a plastic bag and placing them on the lift. Then he walked over to the left stabiliser and took out the top screws from there, too. Once again he put the screws into a clear plastic bag and placed them on the lift.

As the evening shift removed the boot from the right side, the night shift arrived. There was a handover during which the incoming shift supervisor asked the outgoing shift supervisor what work had been done and what remained to be done. No one asked the inspector to confirm the status at handover. He did complete a form, but with insufficient clarity: he simply wrote “helped the mechanics remove the de-ice boots”.

The new shift decided, however, that there was insufficient time to replace both boots, so they would return the left boot to the stores and bring the aircraft back to fit the boot on another occasion. That, of itself, was not a cause for criticism.

The causes of the accident were:

An inspector performed the work of a mechanic, which he was not supposed to do, even to help. This raises questions of conflict of interest: how can an inspector inspect if he, himself, has been a party to the work?

Relevance to compliance/risk management: This goes to the heart of division of responsibility, and even to the question of the involvement of internal audit and whether it should have an advisory/consultancy role or be entirely independent.

The inspector’s report was unclear, although with the benefit of hindsight the use of “boots” (plural) may be an indicator.

Relevance to compliance/risk management: Precision of language is important in all reports, including but not limited to suspicious activity reports (SARs), both internal and external.

There were screws left over after the right boot had been refitted. No one questioned their origin. The boot is held on with almost 50 screws, so a bag containing a large number of screws should have been a significant warning sign.

Relevance to compliance/risk management: Anomalies are important. They are the essence of suspicion and therefore the basis of SARs, and of protecting the organisation. Information on anomalies must be collected immediately, and analysed to gain a proper picture which is as contemporaneous with events as is possible.

Documentation was not completed in a timely fashion. Although it was eventually completed, that happened only after the night shift had taken over and received its verbal handover instructions.

Continental Express designated certain critical components as required inspection items (RII), meaning that they must be subject to a higher level of inspection after maintenance because of the potentially serious effects in the event of a failure. The worksheets were marked to show that the boots were an RII item and Embraer manuals defined the whole tail assembly as an RII item. The night shift inspector (not the one who had unscrewed the top of the left boot assembly), however, said that he “knew” that the boot per se was not an RII item and that he, therefore, did a standard, not an enhanced, inspection.

Indeed, the NTSB report said that he “therefore conducted only a cursory walk around the tail without inspecting the final installation of the leading edge/de-ice boot”. The NTSB does not, however, make the legitimate point that, by that time, all those on duty were under the impression that no work had been done to the left side and therefore the inspector may well not have been criticised if he had done a full inspection of the (properly fitted) right side only.

Relevance to compliance/risk management: First, people will often go with their own judgement rather than a note if they think that the note is wrong. They will rarely go back to the source material and double check. The more experienced and the more senior may be especially prone to taking such an attitude. Secondly, people will often try to justify doing less work than the systems appear to demand, hence the attempt to differentiate between the boot and the assembly as a whole. Although it was never described in these terms, the night shift inspector appears to have decided that there were changes that did not affect enough of the tail assembly to justify a full inspection.

The evening shift supervisor did not “elicit an end-of-shift verbal report from the two mechanics he assigned to remove both horizontal stabiliser de-ice boots. Moreover he failed to give a turnover to the [in]coming third shift supervisor and to complete the maintenance/inspection shift turnover form. He also failed to give the work cards to the mechanics so they could record the work that had been started. The Safety Board believes that the accident would most likely not have occurred if this supervisor had solicited the verbal shift turnover from the two mechanics he had assigned to remove the de-icer boots, had passed that information to the [night] shift supervisor, had completed the maintenance shift turnover form and had ensured that the mechanics who worked on the de-icer boots had filled out the work cards so that the [night] shift supervisor could have reviewed them.”

Relevance to compliance/risk management: Where line managers who exercise a compliance function allow slapdash work or facilitate the bypassing of even small parts of systems, things go wrong, sometimes in a catastrophic manner.

The supervisor referred to in the previous point was a second supervisor. He and his team were working on a different aeroplane. He had more mechanics than he needed for that job and so assigned two of them to the EMB120. It is not clear why it was that second supervisor, who had not worked on the EMB120, who gave the verbal handover report to the incoming supervisor. It was the evening shift supervisor who had no indication that work had been performed on the left stabiliser. Nor did the second evening shift supervisor instruct his mechanics who had been assigned to the EMB120 to give their report direct to the night shift supervisor. Indeed, it transpired that the second shift supervisor did not have a turnover report until after he had told the night shift supervisor the position as he understood it.

Relevance to compliance/risk management: The NTSB concluded that this involvement had the effect of putting the second supervisor in the evening shift in the position of having “assumed responsibility” for the aeroplane.

The second shift supervisor had “reportedly demonstrated substandard performance in the recent past for which he had been disciplined”. The NTSB said: “These examples and his actions the night before the accident suggest a pattern of substandard performance on the part of this employee.”

Relevance to compliance/risk management: While good employees sometimes go off the rails, bad employees rarely make significant improvements in performance. This is why compliance failures become a risk management issue and why strong measures are required for compliance failings.

One remaining question was why the aircraft did not suffer any ill effects on the first flight of the day. The answer was simple: on its first flight, its descent speed was 216 knots. On the second, fatal, flight it was 260 knots, significantly faster but still well within the aircraft’s design limits. It transpired that the boot did not dislodge at the lower speed, although it did “wobble” slightly. At the higher speed it dislodged, folded first down and then under the rear wing and was then torn off by air pressure, creating the stressed locating holes.

Relevance to compliance/risk management: Often, things do not go wrong immediately. They lie hidden, waiting to go wrong in the ordinary course of business. Effective risk management therefore demands that there should be no delay in identifying anomalies and analysing them, if necessary issuing stop notices until the position is known.

The NTSB concluded that the compliance and risk management systems were “adequate” but that at an operational level mechanics, supervisors and inspectors had not properly implemented them. The report stated: “This evidence indicates that the management personnel of Continental Express failed to ensure the adherence to FAA [Federal Aviation Administration]-approved procedures in the maintenance department, a situation that resulted in the aeroplane being despatched in an un-airworthy condition. Accordingly, the Safety Board believes that there was inadequate [supervision] by the airline’s supervisors and management as well as insufficient surveillance by the FAA.”

Relevance to compliance/risk management: The buck does not stop with the branch manager or immediate supervisor. Senior management, and the company itself, carries the responsibility for the design of risk management systems and for compliance with them. There are two aspects to compliance: internal and external. The airline’s policies and procedures were FAA-approved, so that external compliance had been achieved. Internal compliance failed on multiple levels. The Continental Express case demonstrates the fallacy of concentrating on external compliance.

Hawker Siddeley HS748

The UK’s equivalent of the NTSB is the Air Accidents Investigation Branch (AAIB). It is part of the Department for Transport and as such is a government-supervised regulator which issues both regulations and guidance in a way similar to financial regulators.

One of its earliest available reports relates to two incidents involving Hawker Siddeley HS748 aircraft at Portsmouth Aerodrome on August 15, 1968. The first aircraft, carrying 19 passengers and four crew, crashed at 11:48 and the second, with 62 passengers and four crew, crashed less than two hours later. In each case, the aircraft failed to stop on the grass runway in very wet conditions and ran into an embankment. The second aircraft continued beyond the embankment, through the boundary fence and onto the main road outside the airfield.

The findings were that there was “an extremely low coefficient of friction provided by the very wet grass surface over the hard, dry and almost impermeable sub-soil of the aerodrome”. That might be met with a “duh!” response but the AAIB said “this condition of the grass surface was both unusual and unexpected”. The result of the investigation was that “the Portsmouth HS748 operation” was “unsafe at the permitted maximum landing weight whenever the grass surface of the aerodrome was wet; on such occasions, the available landing distances were inadequate”.

The investigation report continued: “This situation appears to have resulted from a misconception which arose during the planning stage and led to the omission of necessary landing distance increments based on HS748 grass surface performance data available at the relevant time.”

Indeed, it seems that it was not entirely true to say that the conditions could not have been predicted. There had been recommendations in New Zealand that landing distances for the same aircraft should be extended by 30 percent on wet grass. That was noted in the AAIB report just one page after it noted that the conditions leading to the accidents were unexpected.

It seems inconceivable today that water on grass on clay would not be regarded as a hazard. We all know that to try to drive a car on wet grass on clay results in wheel spin. How, then, was it not patently obvious that to try to land an aircraft in the same conditions could not possibly result in anything like a normal stop? In fact, the AAIB report also included details of tests made on grass airfields at Lympne Aerodrome in 1961 and at Boscombe Down in 1968, but incredibly no one appears to have correlated the studies and produced a risk assessment.

Relevance to compliance/risk management: Awareness is not just about case studies in one’s own industry, and especially not only one’s own narrow business sector. Risks, particularly money laundering and criminal financing risks, are common across many businesses. They are rarely country-, industry- or market sector-specific. Too often, those in the field take a narrow view of what is relevant. The application of common sense in this situation would have told the operators of the airfield and of the aircraft that a heavy downpour on a clay surface covered with grass would result in a slippery surface.

This is, of course, exactly why the Wimbledon tennis championships are suspended during periods of rain: not to keep the spectators dry, but because the players would slide around too much, with the risk of injury. It is also why motor racing circuits have “kitty litter” instead of grass on the outside of high-speed corners, because a car hitting wet grass at speed has close to zero retardation until it hits something that does not move.

This case therefore demonstrates that compliance and risk managers should be looking at a much broader perspective than they normally do and should be applying that broader knowledge to their work. An interesting footnote is that when, as a result of these crashes, the landing distance was increased, the HS748 operations were discontinued at Portsmouth, being described as “uneconomical”.

The cost of risk management may sometimes be extreme, but where the risk is known, businesses have to decide whether to take it, and what the consequences might be if they take the risk but a third party suffers (see the Ford Pinto case below). Even though no blame was attached to the airline or crew, the report did say that the airline “may have been well advised to have made their own examination of the results of the 1961 trials at Lympne”. It also said that the airline was “remiss in not passing on to their pilots the information on HS748 grass surface landing performance which the manufacturer has provided”.

Relevance to compliance/risk management: Where information is made available, even from external sources, it must be passed to operational staff.

The Ford Pinto case

A parallel can be drawn with consumer protection. Ford designed a car called the Pinto for sale in the U.S. Shortly before the car went into production, it was discovered that a defect in the design of the fuel system could, in certain collisions, result in a fire and possible explosion. Ford had an alternative design available but the cost of the alternative was an additional $11 per car. Ford conducted a study which said that, over the expected life of the car, the total cost of using the alternative design would be $137 million. The statistical likelihood of the car being involved in exactly the type of accident which would cause the fire, however, was low, and the total cost of compensation for injury, death and other damage would be $49.5 million.
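The arithmetic behind that comparison can be laid out in a few lines. The sketch below uses only the three figures quoted above; the implied number of cars, the difference and the ratio are derived here for illustration and are not stated in the study as reported in this article. The function and variable names are hypothetical.

```python
# Figures quoted in the article (US dollars).
EXTRA_COST_PER_CAR = 11
TOTAL_COST_OF_FIX = 137_000_000
EXPECTED_DAMAGES = 49_500_000


def implied_cars(total_cost, cost_per_car):
    """Number of cars implied by dividing the total cost by the per-car cost."""
    return total_cost / cost_per_car


def saving_from_not_fixing(total_cost, expected_damages):
    """Amount by which the cost of the fix exceeds the expected damages."""
    return total_cost - expected_damages


cars = implied_cars(TOTAL_COST_OF_FIX, EXTRA_COST_PER_CAR)
saving = saving_from_not_fixing(TOTAL_COST_OF_FIX, EXPECTED_DAMAGES)
ratio = TOTAL_COST_OF_FIX / EXPECTED_DAMAGES

print(f"Implied production run: about {cars / 1e6:.1f} million cars")
print(f"Fix costs ${saving / 1e6:.1f} million more than expected damages")
print(f"Fix costs {ratio:.2f} times the expected damages")
```

On these figures alone, the purely financial comparison favours not fixing the defect, which is precisely why a calculation of this kind cannot be the whole of a risk decision.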

Ford went with the cheaper option, deciding that the wellbeing and lives of customers were less important than profits. In civil litigation, juries repeatedly punished Ford for this decision, although on appeal their punitive damages awards were often sharply reduced by judges sitting without a jury: in one case from $125 million to just $3.5 million. A detailed analysis of the case is provided here.

The case demonstrated several things:

Even though Ford knew people would probably be injured, maimed or even killed, and willingly put those people at risk, it suffered remarkably little reputational damage.

Information was available, but was kept from the public (indeed, for some time even during the trial the information was hidden as much as was possible).

The risk management cost was actually quite low. It would not have made the car unprofitable, nor turned Ford into a failure.

There is a considerable divergence between the tolerance levels of the public and those of regulators. Where a financial services company is aware that its investment policies are contrary to the interests of its customers, it can generally expect a serious penalty. One example of this was the fine issued by the UK Financial Services Authority (FSA) to JP Morgan Securities in 2010 for its failure properly to segregate client and office funds. This had a ripple effect in that JP Morgan’s auditor, PwC, was fined in January 2012 by its regulator, the Accountancy and Actuarial Discipline Board (AADB), for poor-quality work. A similar failure at Barclays Capital Securities resulted in an FSA fine in 2011 and PwC is awaiting the outcome of an AADB investigation into its conduct in that case also.

Health and safety at work

Nevertheless, many companies do seem to adopt a cost-averse approach to some compliance issues. At the lower end of the scale of complexity for compliance (or so it seems) is health and safety at work legislation, although in fact this is often a minefield of overlapping and contradictory regulations. Where there are multiple layers of legislation (e.g., federal, national or state and local as well as industry-specific rules laid down by membership bodies) the situation is ridiculous. Often, however, compliance in this area is not given the same level of attention as that with financial services laws. Much of the reason for this is that, unless there is an accident for which the consequences are not insurable, the penalties for failure are a misdemeanour conviction and a small fine. Such a case barely makes the general media.

For example, in 2008, a worker at a Nestlé factory in the UK was killed when he was inside a machine which was started by a colleague. The Health and Safety Executive (HSE) said that a device was available to protect against such incidents but that Nestlé had “failed to ensure its employees were aware of its purpose and how to use it correctly”. The HSE said that the company’s safety breaches were compounded by the fact that, back in 2002, Nestlé had received written advice about improving guarding on a machine of the type in which the worker was killed but had not applied that advice to the machine operated by the employee, Mr Hussain. In 2012, the company was fined £180,000 in respect of the incident.

The question of how far risk management needs to go is emphasised by cross-referencing to another HSE case concluded in April 2012. A self-employed carpenter working on contract on a project in London provided his own power tools and workbench. The carpenter used his own equipment without guards and cut off one finger and severely damaged another. Notwithstanding that he was operating within his own declared skill level and using his own equipment, the company in charge of the project was found liable and fined £7,500 in respect of failures. The company was held liable for failing to ensure a safe system of work despite the declared expertise and independence of the carpenter.

It follows, then, that in financial services businesses where freelancers are engaged, the level of supervision required can be much greater than previously thought. In the UK, such independent contractors have been regarded as an integral part of the financial services compliance regime since the mid-1980s. Many other jurisdictions are still catching up, however.

Medical negligence

Medicine is a rich seam of failed compliance measures, but just how rich? Rich enough that a Google search for “medical negligence instruments left in patient” produces panels of advertising by lawyers, and the first page of the search contains, exclusively, links to the websites of law firms claiming to have expertise in handling such cases. There is not a single link to an actual case report until half-way down the second page, where a copycat website reproduces a news report from a newspaper website.

A properly run operating theatre has simple systems and controls. Indeed, it might be observed that the more complex the actual task being performed, the simpler the systems and controls are. There will be many systems, which gives the impression of complexity, but each component part of the system is simple. Often, therefore, these amount to counting the swabs before an operation and counting them again afterwards, and similarly other pieces of equipment, including staples. Yet in dozens of cases each year in the U.S. alone, and many more worldwide, these simple processes are not properly followed.

The risk management approach says count things. There is, simply, insufficient compliance with the system that implements that approach. Of course, genuine mistakes are made, but mostly carelessness or recklessness (or even just plain arrogance or don’t-care-a-bit-ness) is to blame in such cases.
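The count-before, count-after control described above is simple enough to sketch in code. The following is a minimal illustration only, not a description of any real theatre system: the function name, the item names and the quantities are all hypothetical. A negative difference means an item is unaccounted for; a positive one means a surplus, which is the same kind of anomaly as the unexplained bag of screws in the Continental Express case.

```python
from collections import Counter


def reconcile(counted_in, counted_out):
    """Compare opening and closing counts and return only the discrepancies.

    Both arguments map an item name to a quantity. The result maps each
    item whose counts differ to (closing count - opening count).
    """
    before = Counter(counted_in)
    after = Counter(counted_out)
    discrepancies = {}
    for item in before.keys() | after.keys():
        difference = after[item] - before[item]
        if difference != 0:
            discrepancies[item] = difference
    return discrepancies


# Hypothetical counts for illustration only.
opening = {"swab": 20, "clamp": 6}
closing = {"swab": 19, "clamp": 6}

problems = reconcile(opening, closing)
if problems:
    print("Stop: counts do not reconcile:", problems)
else:
    print("Counts reconcile")
```

The point of the design is that the check returns nothing when all is well and names the item when it is not, so that any discrepancy, shortfall or surplus, has to be explained before the job is signed off.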

Understanding how and why systems fail

It is all too easy for compliance and risk management professionals in financial services to focus on the narrow issues presented by their regulators. Experience from other industries and countries, however, can evidently provide excellent examples of how things go wrong when compliance and risk management systems fail. Examples of this nature are valuable in helping to identify how and why systems fail. They also provide memorable, indeed graphic, examples for staff of exactly why they should make compliance a core part of their daily work, and why risk management systems are not intended to make their lives difficult but, rather, are there to protect them and the organisation.

The blunt truth is that, while staff in a financial institution may not cause an airliner to crash, a burning car to kill people, or harm by leaving things inside a patient, they can cause considerable harm to the company and its customers, to their colleagues and to themselves if they are cavalier or worse about the policies and procedures that companies put in place.