IT as a Service (ITaaS) is changing nearly every day. In the past, it was mainly about automating deployment through the contents of a service catalog. Today, it has grown to include IT operations analytics (ITOA). What matters isn’t whether we can select an application from a service catalog, but rather how we monitor and react to issues during the lifetime of the application. With containers, which are all about automation, ITaaS has to change not only to include ITOA, but also to react to the results of the analytics.
What do we mean by “react” in this context?
- Define the application: Determine the application’s first, second, and third-order dependencies automatically.
- Self-heal: ITaaS should fix any problems that it can automatically, such as by bursting to the cloud or increasing capacity.
- Open tickets: Based on policy, ITaaS should automatically open problem tickets and either (a) perform self-healing or (b) raise awareness of an ongoing problem.
- Set priorities: Based on policy, threat feeds, and business-related data, ITaaS should set priorities on self-healing, tickets, and other automated functions.
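As a rough illustration of how the ticketing and priority-setting steps above could fit together, here is a minimal Python sketch. The `Issue` fields, the `set_priority` rules, and the `react` function are all hypothetical names and policies invented for this example; a real ITaaS tool would take its policy from configuration and its inputs from live threat and business feeds.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    component: str
    app_affected: bool       # assumption: analytics can tell whether the application is affected
    threat_match: bool       # assumption: a threat feed flagged this component
    business_critical: bool  # assumption: business data marks the application as critical


def set_priority(issue: Issue) -> str:
    """Return 'high', 'medium', or 'low' under a hypothetical example policy."""
    if issue.threat_match:
        return "high"
    if issue.app_affected and issue.business_critical:
        return "high"
    if issue.app_affected:
        return "medium"
    return "low"


def react(issue: Issue, can_self_heal: bool) -> dict:
    """Always open a ticket, then either self-heal or raise awareness."""
    priority = set_priority(issue)
    action = "self-heal" if can_self_heal else "raise-awareness"
    return {"ticket": f"[{priority}] {issue.component}", "action": action}
```

Under this example policy, an issue in a component that does not affect the application is logged as low priority even when the application is business critical, because the policy looks at application impact before component health.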
Good ITaaS involves all aspects of ITOA to make better decisions, to respond to problems, and to fix those problems as required. ITaaS with ITOA embedded can become the systems engineer many organizations are missing. This will require most ITaaS tools to be smarter than they are today. It also means ITaaS with ITOA must draw on knowledge beyond just logs, performance data, and capacity data.
We need to include such things as comparisons to development, inspection of virtual and physical hardware (such as chipset features), and understanding of error states and messages of our hardware. We need to look at logs not as single lines of data but as groupings of lines interleaved with other groupings. This lets us start to understand the real aspects of the systems that make up our virtual and cloud environments. The hybrid cloud needs better analytics, but those analytics have to be based on knowledge, not just performed because the data is there.
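To make the idea of reading logs as interleaved groupings more concrete, here is a small Python sketch. It assumes each log line begins with a correlation key (such as a request ID) followed by a space. That format and the `group_log_lines` name are assumptions made for illustration, as real logs often need more involved parsing to find a grouping key.

```python
from collections import defaultdict


def group_log_lines(lines):
    """Group interleaved log lines by their leading correlation key.

    Assumes the hypothetical format '<key> <message>' for every line.
    """
    groups = defaultdict(list)
    for line in lines:
        key, _, message = line.partition(" ")
        groups[key].append(message)
    return dict(groups)


# Two requests whose log lines are interleaved in arrival order.
lines = [
    "req-1 connect db",
    "req-2 connect db",
    "req-1 query timeout",
    "req-2 query ok",
]
grouped = group_log_lines(lines)
```

Read line by line, the timeout is one entry among many. Read as a grouping, `req-1` tells a complete story: it connected and then timed out, while `req-2` followed the same path and succeeded.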
Most importantly, the knowledge needs to be based on the application. In order to determine whether there is an issue with the application, we need to see the entire picture and not just one small part of the whole. For example, if we look at an application’s database, we may see an apparent issue, yet the application itself may not be affected. Without knowing the application, how can we judge the impact? If the application is not affected, business rules come into play that say to log a low-priority issue. Without that context, the database team considers it a high priority and works on it, and other items important to the business slide.
This is why ITaaS must include business logic as well as all other aspects of the application. Without a good definition of the application and its dependencies, judging that impact is difficult.
Define the Application
When we look at the current spate of products, we see a few showing us the definition of the application as a tree that can change views to include or exclude hardware. Some graphs do not show dependencies, just traffic, and leave us to guess the real dependencies. Others look at specific components of the application, such as the virtualization layers. Still others require input to define the application: input from a human or another tool.
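For a sense of what a dependency definition provides, the sketch below computes first, second, and third-order dependencies with a breadth-first walk over a dependency graph. The graph contents and the `dependency_orders` name are made up for the example; it is not how any particular product works, and discovering the graph automatically is the hard part that the tools address.

```python
from collections import deque


def dependency_orders(graph, app):
    """Map each dependency of `app` to its order (1 = direct dependency)."""
    orders = {}
    seen = {app}
    queue = deque([(app, 0)])
    while queue:
        node, depth = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:  # keep the shortest path to each dependency
                seen.add(dep)
                orders[dep] = depth + 1
                queue.append((dep, depth + 1))
    return orders


# Hypothetical application: component -> components it depends on.
graph = {
    "webshop": ["api", "cache"],
    "api": ["database"],
    "database": ["storage-array"],
    "cache": [],
}
orders = dependency_orders(graph, "webshop")
```

With a map like this, an issue on `storage-array` can be traced back to `webshop` as a third-order dependency rather than being judged in isolation.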
While there are many products in this space, I would like to call out Zenoss as being able to show dependencies at various levels all the way down to the hardware. Another contender is VMware vRealize Infrastructure Navigator, which, while incredibly powerful, seems to get little attention among VMware’s products. Several security tools, such as Illumio, can also provide this sort of data.
Self-Healing/Open Tickets
Many products claim to do root-cause analysis but do not necessarily do self-healing. This occurs mainly because self-healing is automation, and fully automating fixes worries quite a few people. This is why it is important for self-healing approaches not only to open tickets but also to provide a list of fixes that can be verified and then pushed out. Once the organization is comfortable with the results, the human intervention within any self-healing workflow can be reduced and eventually removed. Yet without threat feeds and other feeds to set proper priorities, this kind of automation may not be something most people would like to see. I know at least one application that requires a misconfiguration in order to work. Self-healing would break this application, which is where priority within issue tracking comes in very handy.
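The verify-then-push workflow described above might be sketched as follows. Everything here is hypothetical: `ProposedFix`, `plan_fixes`, `apply_fixes`, and the `intentional` exception set are illustrative names, and the "apply" step only records which targets would be changed rather than calling a real automation tool.

```python
from dataclasses import dataclass


@dataclass
class ProposedFix:
    target: str
    description: str
    approved: bool = False  # set by a human reviewer until trust is established


def plan_fixes(findings, intentional):
    """Propose a fix per finding, skipping known intentional deviations."""
    return [
        ProposedFix(target, description)
        for target, description in findings
        if target not in intentional
    ]


def apply_fixes(fixes, auto_approve=False):
    """Return the targets that would be fixed; unapproved fixes are held back."""
    return [fix.target for fix in fixes if fix.approved or auto_approve]


findings = [("legacy-app", "reset configuration"), ("web", "restart service")]
fixes = plan_fixes(findings, intentional={"legacy-app"})
held_back = apply_fixes(fixes)   # nothing is approved yet
fixes[0].approved = True         # a human verifies the proposed fix
pushed = apply_fixes(fixes)
```

The `auto_approve` flag stands in for the point at which an organization trusts the results enough to remove the human step, while the `intentional` set protects an application that depends on a deliberate misconfiguration.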
Nearly every tool I know about can call out to an automation tool to fix some problems, whether by using its own orchestration tool or by using Puppet, Chef, or others. However, very few automatically open tickets, which should be part of any priority-setting workflow for self-healing.
Final Thoughts
ITaaS has grown up. Today, it includes not only ITOA but also capacity, performance, and other management structures. The future of ITaaS requires ITOA as well as business, threat, and other feeds to properly drive a self-healing and self-documenting environment.
As workforces change, a self-documenting environment is a step in the proper direction. It is crucial to avoid losing the definition of the application. Do you know the definition of the application most important to your business, as well as all its dependencies, microservices, and APIs in use?
If you do not, it is time to find the proper tool to acquire this information as quickly as possible!