Last week showed very clearly what can happen when event management and change control go wrong in an IT department. The National Australia Bank (NAB) had a ‘glitch’ when a batch processing cycle went horribly wrong.
First of all, I am intrigued by the way the media refers to it as a glitch, considering that thousands of clients have been negatively affected (and still are, after more than a week). The bank’s PR department must be working overtime to manage the message.
NAB conducts batch processing on behalf of other banks each day. When processing completes, NAB generates a file containing a detailed transaction history and sends it to those banks at the end of the day.
Early on Thursday morning, IT departments at financial institutions such as Commonwealth Bank, Westpac, ANZ, HSBC, Citibank and Bank of Queensland went on high alert when they did not receive the files.
NAB told them that “technical issues” had hampered the delivery of the files. The widespread ramifications were immediately clear to all stakeholders: the inability to reconcile accounts would be a disaster.
Since the news broke, NAB has blamed a “corrupted file in the processing batch” as the cause of its nightmares.
However, apparently the problem was not a “file” itself. Instead, it appears that someone from NAB’s IT department who had access to the system inadvertently uploaded a file that “corrupted” the system.
NAB spokesman George Wright described this as a “fair” statement as he tried to explain exactly what went wrong.
HOWEVER, you can also look at it from the opposite point of view. After reading the IBM mainframe discussion forum and debating this with my husband (very technically savvy, an ex-mainframe programmer), I am AMAZED and in awe that these types of glitches don’t happen more often.
How can you run a batch processing schedule every day and night of the week, one that requires multiple OPC/ESA or CA-7 jobs linked together with thousands and thousands of JCL statements based on 30-year-old legacy code, without daily glitches that end up in the mainstream media??
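To give a feel for why one bad step matters so much, here is a toy sketch in Python. It is nothing like real JCL or an OPC/ESA schedule, and every job name in it is made up for illustration. It only shows the general idea of jobs linked together: when one job in the chain fails, everything that depends on it never runs, which is roughly how a partner file could end up not being sent.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_batch(jobs, deps):
    """Run jobs in dependency order and report a status for each one.

    jobs: job name -> function returning True on success (hypothetical jobs)
    deps: job name -> set of job names that must succeed first
    """
    graph = {name: deps.get(name, set()) for name in jobs}
    status = {}
    for name in TopologicalSorter(graph).static_order():
        # A job only runs if every job it depends on finished "ok".
        if any(status.get(parent) != "ok" for parent in graph[name]):
            status[name] = "skipped"
            continue
        try:
            succeeded = jobs[name]()
        except Exception:
            succeeded = False
        status[name] = "ok" if succeeded else "failed"
    return status


# Made-up four-step nightly chain; the second step simulates the failure.
JOBS = {
    "load_transactions": lambda: True,
    "post_ledger": lambda: False,
    "build_partner_file": lambda: True,
    "send_partner_file": lambda: True,
}
DEPS = {
    "post_ledger": {"load_transactions"},
    "build_partner_file": {"post_ledger"},
    "send_partner_file": {"build_partner_file"},
}

status = run_batch(JOBS, DEPS)
for job, result in status.items():
    print(f"{job}: {result}")
```

In this toy run the last two jobs are skipped because `post_ledger` failed. A real scheduler deals with thousands of such links, plus restarts and reruns, which is exactly why the change and event management around it matters.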
JCL statements literally manage, access and change thousands of files and databases. So with this mind-boggling complexity I can only say that NAB must be running very strong event management, availability management, testing and change management processes. This has to be the secret to their success!