Are You Having an ‘Odd’ Day?
By Guy Warren, CEO, ITRS Group
A client who monitors more than 5,000 servers supporting their commercial banking systems was telling me about the analytics they run and the insight they have into those systems.
Using a real-time monitoring tool, they have set up thresholds on performance metrics, such as CPU usage staying above 90% for a given period of time. This is a normal 'detect' use case that you would expect to find in any monitoring solution.
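To make the 'detect' case concrete, here is a minimal sketch of a sustained-threshold check. The function name, the 90% default and the choice of three consecutive samples are illustrative assumptions, not the client's actual configuration.

```python
def sustained_breach(samples, threshold=90.0, min_consecutive=3):
    """Return True if `min_consecutive` samples in a row exceed `threshold`.

    `samples` is a sequence of CPU percentages taken at a fixed interval.
    The defaults are hypothetical, chosen only for illustration.
    """
    run = 0
    for value in samples:
        # Extend the current run of breaches, or reset it on a normal sample.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

A single spike above 90% does not fire this check; only a run of consecutive high samples does, which is what "for a given period of time" implies.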
What was unusual was that they were logging the CPU values to a database every 5 minutes and building a curve of the average CPU load through the working day, based on 3-month, 6-month and 9-month averages. They then compared the curves against each other to see whether the pattern was changing over time and, even more impressively, compared the current CPU values against the curves to see whether the current load was statistically different from the 'norm'. If it was, an alert was sent to the monitoring solution.
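One way to picture the approach described above is a per-time-slot baseline with a deviation test. The sketch below is an assumption about how such a check could look, not the client's implementation: the function names are hypothetical, and the use of a z-score with a limit of 3 standard deviations is one possible meaning of "statistically different".

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baseline(history):
    """Build {slot: (mean, stdev)} from (slot, cpu_percent) pairs.

    `slot` is the index of the 5-minute interval within the day (0..287).
    Slots with fewer than two readings are skipped, because a standard
    deviation cannot be computed for them.
    """
    by_slot = defaultdict(list)
    for slot, cpu in history:
        by_slot[slot].append(cpu)
    return {
        slot: (mean(values), stdev(values))
        for slot, values in by_slot.items()
        if len(values) >= 2
    }


def is_odd(baseline, slot, cpu, z_limit=3.0):
    """Return True if `cpu` is more than `z_limit` standard deviations
    away from the historical mean for this time slot (assumed test)."""
    if slot not in baseline:
        return False  # no history for this slot, so nothing to compare
    mu, sigma = baseline[slot]
    if sigma == 0:
        return cpu != mu  # history was perfectly flat
    return abs(cpu - mu) / sigma > z_limit
```

Running `build_baseline` separately over 3, 6 and 9 months of history would give three curves that can be compared slot by slot, while `is_odd` covers the comparison of the current value against one of those curves.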
This is a great example of the kind of insight that IT operations analytics (ITOA) is now bringing to IT support professionals: it allows them to determine whether they are having an odd day. The causes may be many and varied, from unusually high activity in the marketplace (or low activity due to a failure!), to changes caused by a recent release of new software, to the progressive failure of part of the infrastructure.
Clearly, this statistical approach is one form of anomaly detection, but it is not the only one. It is also possible to use pattern-based comparisons, where the shape of a signal is compared with its expected shape to look for deviations from the norm.
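As a rough illustration of a pattern-based comparison, the sketch below scores how closely the shape of an observed series follows an expected one using Pearson correlation. Correlation is only one assumed way of comparing shapes, and the function name is hypothetical.

```python
from math import sqrt


def shape_similarity(observed, expected):
    """Pearson correlation between two equal-length series.

    Values near 1.0 mean the two curves rise and fall together, even if
    one is shifted or scaled; values near 0 or below mean the shape of
    the observed signal differs from the expected one.
    """
    n = len(observed)
    if n != len(expected) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_o = sum(observed) / n
    mean_e = sum(expected) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, expected))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in expected)
    if var_o == 0 or var_e == 0:
        return 0.0  # a flat series has no shape to compare
    return cov / sqrt(var_o * var_e)
```

Because correlation ignores absolute level, a day that is uniformly busier than usual still scores close to 1.0 here. This complements the statistical check on values, which catches exactly that case.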
Increasingly, as IT systems become more critical to the running of a business, and as the reputation and value of a company are affected by IT problems, this level of insight will be expected, to help ensure that IT is performing 'normally' – whatever that is!
About Guy Warren
Guy Warren is the CEO of ITRS. He brings more than 25 years' experience in financial services and technology businesses. Guy was previously a customer of ITRS, using Geneos to improve the availability of real-time systems when he was COO of FTSE Group. He has also managed a financial services product business as the EVP and General Manager of Misys Banking, a leader in the core banking solutions market. Guy has a strong background in large blue-chip organisations and, more recently, in working with private equity-backed companies. Earlier in his career, he was the CEO of Logica UK, a large professional services company.