Do We Control Our Data?

Data management is a must as we move up the stack. Data management includes data locality, integrity, confidentiality, availability, and protection. In other words, the old concepts of data security, protection, and classification still apply. However, with the advent of virtualization, our data sources changed. As we moved into the cloud, they changed once more. Now that we are entering the realm of distributed systems, data is changing yet again, not just in location, but in velocity and quantity. Data is literally changing our lives. Whether that is for the better overall is a discussion for another time. With all these changes, do we control our data? Is the data we place somewhere within our control at all times? Can we regain control if we perceive it to be lost?

The crux of digital transformation is not the realization that we need to use containers, but the realization that we have data everywhere and no idea how to control it. That control requires us to change fundamental viewpoints, processes, and procedures. It requires us to change our mindset. One fundamental change is accepting that, at some point in time, data will be outside our control. How we apply our tools to gain back some semblance of control is crucial. This mindset shift is necessary because of where we started. Figure 1 shows a typical data center approach, which applies even where virtualization or a private cloud exists. In most cases, the client device is outside the enterprise’s control: this is true of devices used by customers and of many, though not all, devices used by employees.

Figure 1: Traditional Data Control

In the traditional data control approach, each step of the path your data takes from the client device to the data storage location (database) has its own particular flavor of security, control, governance, and risk assessment. The list of tools for these approaches is a mile long. Data is basically either in flight or at rest. Sometimes, data in flight is decrypted and then re-encrypted after some set of actions takes place (such as deep packet inspection). Regardless of what happens in the data path, the data is manipulated. Each element of the path has an associated risk. However, as illustrated in Figure 1, every aspect of the path is within your control, or the control of your IT team. In many cases, this also includes the client.

When we move to the cloud, we start to lose some control. In fact, we tend to lose a lot of control, as we see in Figure 2. Our area of control shrinks considerably, in effect to just the application itself. While the services we require are still around, they are now controlled by the cloud service provider and not by ourselves. This loss of direct control leads to the need for better auditing. The user of the cloud and its services needs to gain back the feeling of control by being able to audit exactly what is happening within the areas they do not control. We do that by hooking into the log stream from the cloud, such as via AWS CloudWatch. We also want to know more about who did what, when, where, and how. This introduces the use of Cloud Access Security Broker (CASB) appliances and tools. In effect, we layer on more auditing to get a semblance of control. This mindset shift leads to Figure 3.
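
As a rough illustration of the "who did what, when, where, and how" style of auditing described above, the following Python sketch summarizes a handful of audit records. The record layout, the field names (`who`, `what`, `when`, `where`, `how`), and the sample values are assumptions made purely for illustration; they are not the schema of AWS CloudWatch, of any CASB product, or of any other real log stream, and a real export would need its own parsing.

```python
import json
from collections import Counter

# Hypothetical audit records, one JSON object per line. The field names and
# values are invented for this sketch and do not match any provider's schema.
SAMPLE_LOG = """\
{"who": "alice", "what": "GetObject", "when": "2024-01-01T10:00:00Z", "where": "10.0.0.5", "how": "console"}
{"who": "alice", "what": "GetObject", "when": "2024-01-01T10:05:00Z", "where": "10.0.0.5", "how": "api"}
{"who": "bob", "what": "DeleteObject", "when": "2024-01-01T11:00:00Z", "where": "203.0.113.9", "how": "api"}
"""


def parse_events(text):
    """Turn newline-delimited JSON into a list of event dictionaries."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def actions_by_actor(events):
    """Count how often each (who, what) pair appears in the audit trail."""
    counts = Counter()
    for event in events:
        counts[(event["who"], event["what"])] += 1
    return counts


def unexpected_sources(events, known_sources):
    """Return events whose source address is not in the set we expect."""
    return [event for event in events if event["where"] not in known_sources]


EVENTS = parse_events(SAMPLE_LOG)

if __name__ == "__main__":
    for (who, what), count in sorted(actions_by_actor(EVENTS).items()):
        print(f"{who} performed {what} {count} time(s)")
    for event in unexpected_sources(EVENTS, {"10.0.0.5"}):
        print(f"review: {event['who']} acted from {event['where']} at {event['when']}")
```

The point of the sketch is only that auditing becomes a matter of consuming someone else's record of events and asking questions of it, which is the semblance of control the cloud leaves us.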

In Figure 3, we are moving most of our controls into the application space. We utilize the cloud for its tools, but with a more simplistic approach: simplistic in thought, but not in reality. We now make decisions about where to place security and control based on what we have access to. For example, we might want the cloud to do DDoS protection but limit its firewall capability to a coarse level of control, i.e., block all but what we require. This approach simplifies how we use external resources, but not their impact. However, we now introduce micro-segmentation within our application, as well as other in-app controls such as data protection (erasure coding), firewalls, service availability (and caching), and data transformation and protection rules for data going into our databases.
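
To make two of these in-app controls a little more concrete, here is a minimal Python sketch of a default-deny rule ("block all but what we require") and of protection rules applied to a record before it is written to a database. Everything named here is hypothetical: the allowlist entries, the field names, the masking rule, and the hard-coded key are assumptions for illustration only, and a real system would load its key from a secrets manager and choose rules to match its own data classification.

```python
import hashlib
import hmac

# Hypothetical allowlist of (protocol, port) pairs the application requires.
ALLOWED = {("tcp", 443), ("tcp", 5432)}


def is_allowed(protocol, port, allowed=ALLOWED):
    """Default deny: permit only traffic that is explicitly listed."""
    return (protocol, port) in allowed


# Hypothetical key for illustration only; never hard-code a real key.
DEMO_KEY = b"demo-key-not-for-production"


def mask_card(value):
    """Keep the last four characters and mask everything before them."""
    return "*" * (len(value) - 4) + value[-4:]


def keyed_hash(value, key=DEMO_KEY):
    """Replace a value with a keyed SHA-256 digest (HMAC) of it."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical mapping from field name to the rule applied before storage.
RULES = {"card_number": mask_card, "email": keyed_hash}


def protect_record(record, rules=RULES):
    """Apply each field's rule; fields without a rule pass through unchanged."""
    return {
        field: rules[field](value) if field in rules else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    print(is_allowed("tcp", 443), is_allowed("udp", 53))
    print(protect_record({
        "name": "Ann",
        "card_number": "4111111111111111",
        "email": "ann@example.com",
    }))
```

Placing rules like these inside the application means they travel with it, whichever provider happens to run the infrastructure underneath.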

In essence, as we move up the stack, we also need to change how we do security, processes, and procedures to gain back the same level of control. Some applications lend themselves to learning these lessons before all others. These are systems that already handle billions of queries per day. At that scale, you need to move things around your stack just to gain performance. However, those lessons are also applicable to transforming your business and applications.

Do we control our data today? I would say most organizations do not. Yet they are starting down the path of discovering just how much control they actually have. Some are approaching this from a legal perspective, others via technology. In either case, to gain back control, you must first change the mindset of how you manage data. Have you?