Cognitive Operations: Three Guiding Principles that Will Transform IT
By Denny O’Brien, Program Director, IBM Operations Analytics Product Management
As an emerging market, IT Operations Analytics (ITOA) has evolved significantly in recent years, extending its focus beyond log management by providing deeper insights into all IT operational data. Those data types include, but are not limited to, application and infrastructure performance metrics, events, trouble tickets, and configuration data.
The focus on multiple data types has ushered in a new era of innovation within ITOA, one in which we've moved beyond efficient means of data ingestion and into deep analysis and correlation.
Consider the increased adoption of machine learning techniques that are providing organizations with major benefits, such as cost savings associated with improved operational efficiency. The more work we are able to offload to our intelligent algorithms, the more we are able to reallocate resources to focus on innovation across the data center and away from the constant firefighting that has long dictated how we work within IT.
A greater shift away from reactive problem diagnosis and towards proactive early detection and resolution will be delivered with increased investment in cognitive capabilities. The journey into cognitive operations will be driven by three guiding principles:
- Ability to continuously learn
- Ability to anticipate and adjust
- Ability to recommend action
Continuously learn: Setting, maintaining, and reacting to performance thresholds is costly and inefficient. Operations staffs must become proactive to ensure mission-critical applications and services are not disrupted.
In order to become proactive, we must first have a true baseline for understanding how our IT environment normally behaves. Manual model building is not a reliable solution and can lead to increased event workload and greater inefficiency across IT operations. Thus, the need for more intelligent systems that can cognitively understand normal application and infrastructure behavior is critical to organizational transformation.
The key here is that machine learning capabilities must be intelligent enough to be configuration-free. Most enterprises either have adopted or are interested in adopting these capabilities, but also want the ability to adjust the algorithms based on their own knowledge of the environment. As machine learning capabilities mature with greater abilities to understand seasonality within the environment, organizations will trust a more hands-free approach.
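To make the idea of a learned, seasonality-aware baseline concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular ITOA product works: it assumes a metric sampled as (hour_of_day, value) pairs and uses a simple per-hour mean and standard deviation. The function names and the sensitivity parameter are hypothetical; sensitivity stands in for the kind of adjustment an operations team might still want to make.

```python
from collections import defaultdict
from statistics import mean, pstdev


def learn_hourly_baseline(samples):
    """Learn a per-hour baseline from (hour_of_day, value) pairs.

    Returns a dict mapping each hour to (mean, standard deviation).
    Hypothetical helper for illustration; real systems model far more.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {hour: (mean(vals), pstdev(vals)) for hour, vals in by_hour.items()}


def is_unusual(baseline, hour, value, sensitivity=3.0):
    """Return True when value is more than `sensitivity` standard
    deviations away from the learned mean for that hour."""
    mu, sigma = baseline[hour]
    if sigma == 0:
        return value != mu
    return abs(value - mu) > sensitivity * sigma


# Illustrative data: low load at 03:00, high load at 14:00.
history = [(3, 10), (3, 12), (3, 11), (3, 9),
           (14, 80), (14, 85), (14, 78), (14, 82)]
baseline = learn_hourly_baseline(history)
```

With this sample history, a reading of 80 is ordinary at 14:00 but unusual at 03:00, which is the distinction a single static threshold cannot express.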
Anticipate and adjust: Customer environments are constantly changing. Greater visibility is needed to understand what is happening, the potential for what could happen, and the overall impact.
Anomaly detection solutions have been around for a few years now, but they are far from foolproof. In fact, the common perception across organizations that evaluate this type of technology is that anomaly detection is the equivalent of predicting an outage. False. The true value here is the cognitive ability to anticipate anomalous behavior based on the system's normal interpreted behavior. As previous anomalous behavior is interpreted to be the new normal within the environment, threshold settings can auto-adjust.
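The auto-adjusting threshold described above can be sketched as follows. This is a deliberately simple stand-in, assuming a rolling window of recent values and an upper limit of mean plus a multiple of the standard deviation; the class name AdaptiveThreshold and its parameters are hypothetical and not taken from any product. Because every observation joins the window, a sustained shift is first flagged and then absorbed as the new normal.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveThreshold:
    """Rolling upper threshold that adapts as behavior changes.

    Hypothetical sketch: the limit is mean + sensitivity * std over the
    most recent `window` observations.
    """

    def __init__(self, window=20, sensitivity=3.0, min_samples=5):
        self.history = deque(maxlen=window)
        self.sensitivity = sensitivity
        self.min_samples = min_samples

    def upper_limit(self):
        """Current threshold, or None until enough samples are seen."""
        if len(self.history) < self.min_samples:
            return None
        return mean(self.history) + self.sensitivity * pstdev(self.history)

    def observe(self, value):
        """Flag value if it exceeds the current limit, then record it.

        Recording every value (anomalous or not) is what lets a
        sustained shift become the new baseline.
        """
        limit = self.upper_limit()
        anomalous = limit is not None and value > limit
        self.history.append(value)
        return anomalous


detector = AdaptiveThreshold(window=5, sensitivity=3.0, min_samples=5)
warmup_flags = [detector.observe(v) for v in [50, 52, 48, 51, 49]]
first_spike = detector.observe(90)                         # flagged against the old baseline
sustained = [detector.observe(90) for _ in range(4)]       # window fills with 90s
after_shift = detector.observe(90)                         # 90 is now normal
```

Note that flagging a value here says only that it departs from learned behavior; it does not predict an outage. A production system would also need to guard against a single extreme spike inflating the window's variance.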
As this space evolves to incorporate more cognitive capabilities, new insights could be gained through the correlation of business and operational data. Imagine the control an organization can gain by systems anticipating anomalous behavior and notifying on the business impact if the emerging issue is left unaddressed.
Recommend action: Searching across terabytes of structured and unstructured data is time-consuming and costly. Intelligent recommendations are needed to further improve Mean Time to Repair (MTTR).
The initial value proposition that guided ITOA was reducing MTTR by isolating the root cause of a problem buried deep inside terabytes of unstructured data: a promise ITOA has definitely delivered on. Prior to log management solutions, this task was even more inefficient and often meant sending your log files over to Level 3 (L3) support for analysis. If you were lucky, L3 was able to identify the problem in a few days and recommend action.
Log collection, correlation, and search have largely been able to circumvent that process and, as a result, drive improvements to MTTR. Organizations are now much more capable of diagnosing IT operational issues.
The next transformational step will occur when systems cognitively offer recommendations based on contextual analysis of operational data: log files, events, configuration data, or performance metrics. Instead of searching for resolutions, IT staffs will be able to rely on the cognitive recommendations from the solutions managing their environments.
The holy grail then becomes the cognitive ability to automate a fix within the environment. Considering how far ITOA has come in a short period of time, that day is quickly approaching.
About Denny O’Brien
Denny O’Brien is the Program Director of product management for the IBM Operations Analytics portfolio. In this role, he is responsible for driving the business direction and strategy for IBM's IT Operations Analytics offerings. Denny has been with IBM for 16 years, and has a background in both user experience and development management.