Everyone wants visibility into all the resources and subsystems of their hybrid cloud. We have expounded on this need over the years, as well as on how to gain some level of visibility. The tools exist, as do the methodologies. What we need now is better observability. Visibility is inherent in many tools today, but observability is not. Every tool ties its visible data to a single observed basis; we need to go past that basis to gain better insights.
This is the power of big data analytics: the ability to ask one question, the basis, and extend from there, asking more of our data than has been asked before. This is observability. Those of us who use data need to recast ourselves not just as data scientists but as observers of interactions. We need to observe something in the world, then inquire of the data related to that observation.
The ultimate goal of big data is for a business to ask anything of its data and get a valid response. For example, you may wish to know the relationship between the lengths of beards and skirts and the price of gold (a question posed in Heinlein's novel Friday). At the moment, though, only a human can correlate that information without a set of rules; computers still need rules by which to correlate data. A human can look at disparate concepts, data, and more, and come up with an answer. And the more we observe, the more questions we ask.
The strength of the human mind is insight, and many tools are starting to let you apply your own insights to your data. New Relic, Splunk, Prelert, Dynatrace, and many others provide the means to make inquiries of your own. At the same time, though, there is a lack of data outside the basis. The key is to bring in even seemingly odd, raw, or bizarre data and see whether a question can be asked that includes it.
For security purposes, we want to minimize the time to detect a breach. Some odd ways to do this would be:
- Include the organization's calendar and ensure locations are in it. If you see abnormal usage from a user who is on vacation in Brazil but whose traffic originates from Toronto, there is something to investigate.
- Include event data such as the deployment of a new release. The release is now the new norm, and the old norms may no longer apply. Without this knowledge, we get quite a few false positives until the norms settle.
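The first check above can be sketched as a simple correlation between a login event and calendar data. This is a minimal illustration, not the schema of any particular tool: the calendar feed, its fields, and the `is_suspicious` helper are all hypothetical.

```python
from datetime import date

# Hypothetical calendar feed: user -> (vacation location, start, end).
calendar = {
    "alice": ("Brazil", date(2024, 3, 1), date(2024, 3, 14)),
}

def is_suspicious(user, login_location, login_date, cal=calendar):
    """Flag a login whose origin contradicts the user's calendar entry."""
    entry = cal.get(user)
    if entry is None:
        return False  # no calendar data, nothing to correlate against
    place, start, end = entry
    on_vacation = start <= login_date <= end
    return on_vacation and login_location != place

# A login from Toronto while Alice's calendar says Brazil is worth a look.
print(is_suspicious("alice", "Toronto", date(2024, 3, 5)))  # True
print(is_suspicious("alice", "Brazil", date(2024, 3, 5)))   # False
```

The same shape works for the deployment-window idea: a release-event feed would simply widen or suppress the "normal" band for a while instead of comparing locations.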
There are many other aspects of an environment, and of its people, that we could include. Perhaps we want to include the weather in our decisions to scale our environments up or down. Even the act of scaling changes our environment, so that the old norms are no longer normal.
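The weather idea can be illustrated with a toy scaling policy that mixes a conventional signal (CPU load) with an unconventional external one. Everything here is an assumption for illustration: the heat-wave flag, the thresholds, and the step sizes are not from any real autoscaler.

```python
def desired_replicas(current, cpu_load, heatwave_forecast):
    """Toy policy: scale out earlier when an external signal (a heat-wave
    forecast that tends to precede a traffic spike) is present."""
    if cpu_load > 0.8 or (heatwave_forecast and cpu_load > 0.6):
        return current + 2          # scale out early on the external signal
    if cpu_load < 0.3 and not heatwave_forecast:
        return max(1, current - 1)  # safe to scale in
    return current                  # hold steady

print(desired_replicas(4, 0.65, heatwave_forecast=True))   # 6
print(desired_replicas(4, 0.65, heatwave_forecast=False))  # 4
print(desired_replicas(4, 0.2, heatwave_forecast=False))   # 3
```

The point is not the thresholds but the shape: an odd data source changes the answer to a question the basis alone would answer differently.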
We need to add observability to our mix: not just visibility, but observable items. The human needs to become part of the equation once more, not merely a consumer of the data. We need to drive the data in some fashion, and we need to collect even more data to remove false positives.
We observe the world around us; we make millions of small decisions a day. We process huge amounts of data, and this is what we are trying to duplicate with big data: the ability of the human mind to map what it observes into a repeatable result, with the end goal of removing false positives.
Now think further, to the world of IoT, where millions of devices talk to each other and their data is stored somewhere for analysis. Think of the questions you could ask of that data, and of the raw and bizarre data that could become part of the correlations. Perhaps someone will make an observation that ties everything together: the fabled unified field theory, perhaps.
Using the tools available, take your data for a spin to the edges of its capabilities. Find out if it contains what you have observed. If not, find a way to add that data to your mix and query again. Move past the basis of your data and be creative: ask more of it. We have the tools; we now need the observations!