Anyone can program for normal

Throughout my career as an Automation Professional, I have noticed two distinct types of programmers: those who take what the spec says and program for exactly that, and those who use the spec as a guide to develop a control system.  Unfortunately, there are no courses for automation folks on the difference between the two.  For some reason, automation programmers feel their systems don’t need the kind of rigor that traditional software development professionals follow.  In fact, I was told by one quality manager for a major systems integrator that they did not develop software and therefore would not receive value from software development tools and methodologies.

The classic battle lines between Engineering and IT would tell us there is not much Automation Professionals can gain from our IT brethren.  I disagree.  Software development life cycles, and more specifically software testing, are areas where the Automation Profession has much to gain.

I was reminded of these principles when I heard about one of the Toyota sticky gas pedal fixes.  Some brilliant person decided to add a “feature” that would disengage the throttle whenever the brake pedal is depressed.  Really??  Is that not what every single cruise control device already does?  How could something so simple be missed during design, development and testing of the software controlling that process?

Think back to the man-made disaster at the Taum Sauk generating station in Missouri (http://en.wikipedia.org/wiki/Taum_Sauk_Hydroelectric_Power_Station).  The concept is simple enough: a pumped storage electrical generating system.  During the day, a reservoir of water at the top of a mountain is allowed to flow through two generators, creating electricity.  At night, the generators are turned into motors and the reservoir is refilled with water from the lake.  Over time and through cutbacks, the once fully manned station became remotely monitored and controlled, not that it would have mattered in this instance.  One night, during the pumping cycle, the level measurement system failed and got stuck at a single low reading.  The pumps never stopped, and the reservoir overflowed.  The overflow quickly eroded a poorly constructed wall, and the full reservoir emptied in a matter of minutes.  The environmental damage was significant: the water cut a trench down the side of the mountain and caused a lot of flooding.  It nearly killed the Johnson’s Shut-Ins park ranger and his family, who woke up to their house floating down the river.

A lot of emphasis was placed on the poorly constructed walls, but what about the poorly developed control scheme?  This “accident” should simply never have happened!

We can assume the level control was simple enough.  When the level gets low, stop the generators and run the pumps until the level is high, then stop the pumps.  Now, here is a test for all of you readers: what else could we have interlocked the pumps on?  Well, how about stopping the pumps if the level does not change for a defined period of time, which would indicate something is wrong?  Or, since this system had been running for years, someone should have had some idea how long it normally takes to fill the reservoir.  So, let’s stop the pumps after a certain total amount of time has elapsed.  That is the simple stuff and would have cost a few extra keystrokes in a program.  Even if the system was still relay based, a PLC could have been implemented just for this piece at a cost of less than $10,000.  It would be very interesting to learn what really happened, and how many engineers warned that this could happen only to have some bean counter decide the additional safety measures could not be justified.
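To make those two extra interlocks concrete, here is a minimal sketch in Python rather than ladder logic.  It is not how the Taum Sauk controls were written (I do not know that); the class name, parameter names and limit values are all made up for illustration.  It assumes the logic is called once per scan with the current time and level reading.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FillInterlock:
    """Hypothetical pump permissive for one fill cycle; all limits are illustrative."""
    high_level: float        # normal stop level
    min_rise: float          # smallest level gain that counts as progress
    stall_timeout_s: float   # longest time allowed without that much gain
    max_fill_time_s: float   # longest total pumping time for one cycle

    def __post_init__(self) -> None:
        self.trip_reason: Optional[str] = None
        self._start_time: Optional[float] = None
        self._ref_level = 0.0
        self._ref_time = 0.0

    def start(self, now_s: float, level: float) -> None:
        """Begin a fill cycle and clear any latched trip."""
        self.trip_reason = None
        self._start_time = now_s
        self._ref_level = level
        self._ref_time = now_s

    def pumps_permitted(self, now_s: float, level: float) -> bool:
        """Call every scan. Returns False once any stop condition has latched."""
        if self._start_time is None or self.trip_reason is not None:
            return False
        if level >= self.high_level:
            self.trip_reason = "high level reached"
        elif now_s - self._start_time >= self.max_fill_time_s:
            self.trip_reason = "maximum fill time exceeded"
        elif level - self._ref_level >= self.min_rise:
            # Level is rising: move the reference point for the stall check.
            self._ref_level, self._ref_time = level, now_s
        elif now_s - self._ref_time >= self.stall_timeout_s:
            self.trip_reason = "level not rising"
        return self.trip_reason is None


if __name__ == "__main__":
    interlock = FillInterlock(high_level=100.0, min_rise=0.5,
                              stall_timeout_s=600.0, max_fill_time_s=6 * 3600.0)
    interlock.start(now_s=0.0, level=20.0)
    # Simulate a transmitter stuck at one low reading, sampled once a minute.
    for t in range(0, 1200, 60):
        if not interlock.pumps_permitted(float(t), 20.0):
            print(f"pumps stopped at t={t}s: {interlock.trip_reason}")
            break
```

With a stuck transmitter, the normal high-level stop never fires, but the stall check stops the pumps once the stall timeout passes without a rise, and the total fill timer backs it up if the reading drifts just enough to fool the stall check.  The trip is latched, so the pumps stay off until someone deliberately starts a new cycle.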

What other failures do you know of that could have been easily avoided had someone thought through the problem from a failure standpoint instead of assuming everything is normal?

How many of these types of failures go on every day in our plants?  How many times does a knowledgeable operator keep a control system from making a dumb mistake because of an arrogant programmer?  I believe the numbers are much higher than we would like to assume.

So, the next time you start a programming assignment, take a few minutes and think about what could go wrong and how your design will react to it.  It really is not that difficult to take care of a majority of the issues any system will face.

Is the next foodborne illness waiting in the wings?

I am sure you have all heard about the various foodborne illness outbreaks that have sickened and killed people.  In every case, it takes the FDA several weeks to identify the source of the contamination and the chain of custody.  To recap, we have had these issues in peanuts, tomatoes, spinach and hamburger just in the last 2 to 3 years.

As an American citizen, this is infuriating.  The law addressing this issue is clear and concise, although it was not originally developed for this specific problem.  It was drafted to address the intentional contamination of our food supply, but the unintended benefit is that it also covers unintentional contamination.  The Bioterrorism Act of 2002 clearly outlined the types of records to retain, the length of retention and, more importantly, the response time to an incident.  Essentially, the rule reads that from the time a manufacturer is alerted to a potential issue, it must respond with the product’s genealogical records within 24 hours.  So, when the FDA calls me and tells me they think I have a problem, I have to turn over all of the information I have related to that specific lot number, which must include point of origin, raw material point of origin, manufacturing process data, etc.  So, why does it take so long to identify the source and contain the problem?
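To show how little is involved once the records are linked by lot number, here is a rough sketch in Python.  The record layout, field names and function name are my own invention for illustration; this is not the FDA’s required format or any vendor’s product.  It assumes each lot record lists the lots that were consumed to make it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class LotRecord:
    """Hypothetical genealogy record for one lot."""
    lot_id: str
    origin: str                                        # supplier or plant that produced the lot
    inputs: List[str] = field(default_factory=list)    # lot ids consumed to make this lot
    process_data: Dict[str, str] = field(default_factory=dict)


def trace_back(lot_id: str, records: Dict[str, LotRecord]) -> List[LotRecord]:
    """Return the requested lot plus every upstream lot that went into it."""
    seen: Set[str] = set()
    found: List[LotRecord] = []
    stack = [lot_id]
    while stack:
        current = stack.pop()
        # Skip lots already visited and lot ids with no record on file.
        if current in seen or current not in records:
            continue
        seen.add(current)
        found.append(records[current])
        stack.extend(records[current].inputs)
    return found


if __name__ == "__main__":
    records = {
        "FG-100": LotRecord("FG-100", "Plant 1", ["RM-1", "RM-2"], {"roast_temp_f": "320"}),
        "RM-1": LotRecord("RM-1", "Farm A"),
        "RM-2": LotRecord("RM-2", "Farm B"),
    }
    for record in trace_back("FG-100", records):
        print(record.lot_id, record.origin)
```

If the lot links are captured at the time of production, answering the FDA’s call is a lookup, not a multi-week investigation.  A real system would also need the forward trace (which finished lots a given raw material went into) and should flag, not silently skip, any input lot with no record.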

As an Automation Professional, it is equally infuriating.  All of the major software vendors modified their products to support this important initiative and built large marketing campaigns to advertise the features.  Unfortunately, the industry response was essentially zero.  I personally cannot point to a single project that originated because of this regulation.  I can also tell you that most producers I have consulted with since this act has been on the books have not followed the spirit of a mock recall.  They all maintain a separate file of information for any mock audits that come up.  In fact, one client uses the same information for each audit.  I guess that FDA inspector is really on top of his game.

As in all tragic situations, there are opportunities for improvement.  In this case, the solution is already clearly defined.  There is simply not enough “justification” for producers to take the time to install this functionality in their systems.  I am somewhat surprised some enterprising young lawyer has not jumped on this, since these producers are consciously not taking the steps required by law to protect the public.  The outbreaks are becoming more frequent, and it is taking longer to identify the source.  That tells me we are still climbing the hill and will probably continue to do so until the “justification” for installing these systems works.  That will probably take a major legal battle, but I predict it is coming.  I have to believe that there is a large group of producers that have implemented the systems to follow the intent of the law.

The Automation Profession can easily solve this problem if the accountants and MBAs would get out of our way and let us do our jobs.  We have an obligation as professionals to protect the public.  I wish other professions followed a similar ethical practice.  The odd thing is that implementing systems like this would increase overall production effectiveness and drive bottom line results.  MBAs and accountants don’t have that sort of long term view, as they cannot seem to make it past their view of the world from SAP.  The current state of the American and global economy is a clear picture of that.

I hope recent consumer trends continue and that we all get a clear view of our food supply chain.  I want to know where my food is coming from and make sure it is not coming from China or other countries with relaxed consumer protection regulations.  It is a shame that people must get sick and die in order for decision makers in companies to do the right thing.