The IT Infrastructure Library (ITIL) originated in the UK with the goal of creating a set of universal standards for delivering high-quality IT services. Twenty years later, growing application complexity, increasing demands to improve service levels, and regulatory pressures are converging to finally drive broad adoption of ITIL on this side of the pond.
The IT Service Management Forum (itSMF) and other IT membership organizations are growing rapidly. Analyst groups such as Forrester Research estimate that ITIL adoption among billion-dollar companies will increase to 40 percent in 2006 and reach 80 percent by 2008. As companies rush to deploy ITIL, newly minted ITIL managers often feel overwhelmed by the vast process framework, unsure how to get started and make ITIL a reality in their organizations. Even after the initial adoption, ITIL managers lack effective means to report on ITIL success and reinforce newly implemented process standards. The key to addressing both challenges lies in two tightly related ITIL components: incident management and problem management.

ITIL Incident and Problem Management

In ITIL terms, incident management is focused on restoring normal service levels as quickly as possible with minimal disruption to the business. It's a highly visible part of the ITIL process and, if done well, can result in fewer service interruptions, increased efficiency and improved user satisfaction. A Forrester survey of billion-dollar companies showed that incident management is the number one ITIL priority among IT executives. In contrast, problem management takes a longer-term view and is tasked with reducing the effect of incidents and errors, and with proactively preventing them. A well-thought-out problem management system will reduce recurring incidents and create permanent solutions instead of just one-time fixes. Because these two processes are so critical, incident management and problem management are usually selected as the logical starting points for any ITIL implementation.

In IT as well as in life, a good starting point is important, but the key to the success of a project is sustained execution. Well-trained staff and good process design, though crucial, are not enough to achieve ITIL success.
The all-important yet often overlooked third component is a comprehensive tool set to support the process roll-out.

System Management Products Stop at Incident Detection

Most large enterprise IT organizations have deployed system management products such as HP OpenView, IBM Tivoli, CA Unicenter, BMC Patrol, Mercury Business Availability Center and Microsoft Operations Manager to help monitor their infrastructure and applications around the clock. While this allows for proactive detection of incidents and contributes significantly to effective incident and problem management, it's far from enough. Monitoring products stop at detection and provide minimal utility when it comes to triaging, diagnosing and resolving incidents. This critical resolution process still takes place outside of the monitoring tools, and still requires a human to perform a series of tasks and procedures to determine the root cause and the correct fix.

Service Desk Solutions Add to the Puzzle with Incident Tracking

Another class of software tools that is prevalent in the ITIL tool-box is service desk software. Common solutions include BMC Remedy, FrontRange Heat, HP Service Desk and HP Peregrine Service Center. These products track incidents, problems and change processes, but they also fall short when it comes to providing resolution capabilities. In addition, despite the built-in reporting capabilities offered by these solutions, many large IT organizations still struggle to gather factual data to aid their incident and problem management efforts. This is because service desk solutions still rely on human operators and manual data entry. Names, descriptions and categories are often inconsistent, and the resulting data suffer from the garbage-in, garbage-out syndrome.

Automated Incident Resolution Helps Bridge the Gap

A major gap remains between the incident detection capabilities provided by system monitoring tools and the incident tracking capabilities provided by ticketing solutions: the triage, diagnosis and resolution process is still repetitive, manual, and prone to error. Alerts are often escalated wholesale, alert floods remain unchecked, and troubleshooting and resolution knowledge tends to be poorly documented. IT professionals are forced to operate in fire-fighting mode rather than proactively addressing the root causes of incidents and problems. As a result, the ITIL goals of minimizing service impact and preventing recurring incidents remain elusive for many IT organizations.

After years of relying solely on home-grown scripts, enterprise IT organizations finally have a viable alternative: software solutions that automate the critical step of incident resolution. Gartner, a leading industry analyst firm, has dubbed this emerging category Run Book Automation and has published several reports in this area since June 2006. A few early movers have brought sophisticated solutions to market that promise to fill this gap and round out the ITIL tool set.

Run Book Automation solutions automate an array of IT operational procedures to help with the traditionally time-consuming process of triage, diagnosis and repair for business-critical applications. These range from simple tasks such as checking network connectivity, stopping and restarting services, and changing system configuration to complicated, nested procedures like running a full set of diagnostic tests against a clustered J2EE application environment, gracefully removing a server from a load-balanced cluster, or interrogating backed-up MQ queues and automatically routing the problem messages to the correct location.

Early adopters of Run Book Automation include companies such as Alaska Airlines, Halliburton and NYK Logistics.
Combined with existing tool sets, each has realized significant ROI in its ITIL implementation efforts. Dean DuVall, Alaska Airlines' Managing Director of Customer Services, said, "An automated, repeatable problem and incident management solution that scales with our business has allowed us to empower our first level resources and offload work from our second and third level support teams." These teams can now focus more of their energy on non-routine issues and strategic tasks to improve overall service levels.

Reinforce End-to-End ITIL Processes with Run Book Automation

Together with system monitoring and help desk tools, Run Book Automation products speed resolution and form a fully automated incident management loop. Monitoring solutions proactively detect any service outage and issue an alert. The alert automatically triggers automation procedures to triage, diagnose and fix the issue. The Run Book Automation solution captures the inputs and outputs of each step in the ticketing system to provide a detailed audit trail. Upon resolution, the ticket is automatically closed with full resolution information.

In addition to significantly reducing MTTR and service impact, the increased visibility helps tier two and tier three support teams with successful problem management. With automated data capture and enhanced reporting features like MTTR Trending and Incident Analysis drilling down to Configuration Items, ITIL project teams finally have the data they need for detailed incident and problem analysis at their fingertips, not to mention metrics that capture the complete before and after picture for key performance indicators (KPIs) like MTTR, service level and operational cost savings.

To summarize, new Run Book Automation solutions maximize uptime, lower support costs, and increase operational efficiency.
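
As a rough illustration of the incident management loop described above, the following Python sketch walks one alert through triage, diagnosis and repair, records each step's input and output on a ticket, closes the ticket, and derives a resolution-time metric from the closed tickets. Everything in it is an assumption made for illustration: the `Ticket` class, the `handle_alert` and `mean_time_to_resolution` functions, and the three stand-in runbook steps are hypothetical and do not represent any vendor's product or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, List, Optional, Tuple

# A runbook step: a name plus a callable that takes the alert text and
# returns the step's output. The steps used below are stand-ins only.
Step = Tuple[str, Callable[[str], str]]


@dataclass
class Ticket:
    """Hypothetical service desk ticket that carries the audit trail."""
    alert: str
    opened_at: datetime
    audit_trail: List[Tuple[str, str, str]] = field(default_factory=list)
    resolution: Optional[str] = None
    closed_at: Optional[datetime] = None


def handle_alert(
    alert: str,
    runbook: List[Step],
    opened_at: datetime,
    now: Callable[[], datetime] = datetime.now,
) -> Ticket:
    """Run every runbook step for one alert and close the ticket."""
    ticket = Ticket(alert=alert, opened_at=opened_at)
    output = ""
    for name, action in runbook:
        output = action(alert)
        # Record (step name, input, output) so the ticket has an audit trail.
        ticket.audit_trail.append((name, alert, output))
    # The output of the last step is stored as the resolution note.
    ticket.resolution = output
    ticket.closed_at = now()
    return ticket


def mean_time_to_resolution(tickets: List[Ticket]) -> timedelta:
    """Average time from ticket open to ticket close, over closed tickets."""
    durations = [t.closed_at - t.opened_at for t in tickets if t.closed_at]
    if not durations:
        raise ValueError("no closed tickets to measure")
    return sum(durations, timedelta()) / len(durations)


# Example run with made-up data and a fixed clock for repeatability.
detected = datetime(2006, 6, 1, 9, 0)
demo_runbook: List[Step] = [
    ("triage", lambda alert: "severity=high"),
    ("diagnose", lambda alert: "service not responding"),
    ("repair", lambda alert: "service restarted"),
]
ticket = handle_alert(
    "web-01: HTTP check failed",
    demo_runbook,
    opened_at=detected,
    now=lambda: detected + timedelta(minutes=4),
)
for entry in ticket.audit_trail:
    print(entry)
print(mean_time_to_resolution([ticket]))
```

Because every step is logged by the automation rather than typed in by an operator, the resolution-time figure is computed from consistent data, which is the reporting benefit the article attributes to this approach.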
These gains come from empowering frontline support to perform more troubleshooting and repair tasks, freeing up senior support staff for proactive management. Last but not least, the IT organization will also benefit from better overall process control and a well-documented audit trail for Sarbanes-Oxley compliance.

Going Beyond Incidents and Problems

Once you've paved the way for ITIL success with effective incident and problem management projects, you can branch out to change and configuration management, as well as the other ITIL service management and service delivery modules. There you'll find that the automation solution you've adopted for incident and problem management can also help in other ITIL process areas and contribute to your overall ITIL success.

Source: line56.com