Security at Scale: User Behavioral Analytics

Recently I was invited to participate as a delegate at Tech Field Day 16 in Austin, Texas, where we visited with Forcepoint. Forcepoint is a company with a combined portfolio that includes user and entity behavioral analytics (UEBA). UEBA’s primary focus is determining what is normal for a user and then deciding if a given behavior is risky. Most of security at scale is about separating the things that may be an issue from the things that are an issue. The how of UEBA is as interesting as the why, as in “why do we need this today?” Once more, we get back to scale: not only the velocity and volume of data but also the number of users and entities involved. Let us look at UEBA in some detail.

UEBA is at its core an analytics program, one geared to understand security elements within the data stream, including elements such as login and logout, source and destination locations, protocols used, and ports accessed: all of those things you would expect to see within any good security tool. The key is that Forcepoint does not just put out a behavioral analytics product without some thought. Each customer has some uniqueness with respect to security, so Forcepoint’s UEBA is data-model driven. That model needs to be seeded with an organization’s approach to security, data streams, etc. Then, the analysis engine needs to learn what is normal.
Have you ever examined the queries handled by something like a DNS, directory, or web server? Busy systems handle many requests per second. Each request is part of a pattern outlining a user or entity’s behavior within the realm of an organization. Even in small shops, such behavior generates millions to billions of data points per day. As an organization grows, hundreds of billions, if not trillions, of data points are generated per day. Behavioral analysis is a way of taking that massive amount of data and deriving a score indicating how risky a user’s behavior is to an organization. No one data point defines the risk; it takes multiple data points together.
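To make the idea of a risk score built from many data points concrete, here is a minimal sketch in Python. It is not Forcepoint's method. The feature names (logins, mb_downloaded, destinations), the sample numbers, and the scoring rule (an average of per-feature z-scores against a learned baseline) are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def learn_baseline(history):
    """Learn a per-feature (mean, standard deviation) pair from past observations."""
    features = history[0].keys()
    return {
        f: (mean([row[f] for row in history]), pstdev([row[f] for row in history]))
        for f in features
    }

def risk_score(observation, baseline):
    """Average absolute z-score across features; higher means further from normal."""
    deviations = []
    for feature, (mu, sigma) in baseline.items():
        # Simplification: treat a zero standard deviation as 1 to avoid dividing by zero.
        deviations.append(abs(observation[feature] - mu) / (sigma or 1.0))
    return mean(deviations)

# Hypothetical daily activity for one user (made-up numbers).
history = [
    {"logins": 4, "mb_downloaded": 100, "destinations": 5},
    {"logins": 6, "mb_downloaded": 120, "destinations": 7},
    {"logins": 5, "mb_downloaded": 110, "destinations": 6},
]
baseline = learn_baseline(history)

typical_day = {"logins": 5, "mb_downloaded": 115, "destinations": 6}
unusual_day = {"logins": 6, "mb_downloaded": 900, "destinations": 40}

typical_score = risk_score(typical_day, baseline)
unusual_score = risk_score(unusual_day, baseline)
```

On the unusual day the login count alone looks only mildly off. The score climbs because the download volume and the number of destinations are far outside the baseline at the same time.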
Forcepoint integrates not just standard user behavior but also feedback from its data loss prevention (DLP) tool and eventually from its cloud access security broker (CASB) to get a score related to the riskiness of a given user or entity. This implies that data has once more grown, perhaps to ten times its previous size. The velocity and volume of data can be daunting, and no human can keep up. This is why we need behavioral analysis tools. However, the important bit of UEBA is the word “entity” within the acronym.
The entity refers not to the human user but to the machine user or even device. Everything has a normal footprint of behavior, from a server to a network device. When that behavior changes from the norm, operations, development, security, and the business need to know! It could be a lateral movement, or it could be normal behavior that was undocumented. As we move faster and faster, sometimes the nuts and bolts of what talks to what get lost in translation between teams. There are even some who no longer document things as other teams would like, all in the name of speed.
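A rough sketch of tracking "what talks to what" for an entity might look like the following. It assumes flow records are simple (source, destination, port) tuples, and the host names are made up. A real product would learn a far richer footprint than a set of peers.

```python
from collections import defaultdict

def learn_peers(flows):
    """Record which (destination, port) pairs each source normally talks to."""
    peers = defaultdict(set)
    for src, dst, port in flows:
        peers[src].add((dst, port))
    return peers

def new_connections(peers, flows):
    """Return flows whose (destination, port) was never seen for that source."""
    return [
        (src, dst, port)
        for src, dst, port in flows
        if (dst, port) not in peers.get(src, set())
    ]

# Hypothetical baseline traffic for one web server.
baseline_flows = [
    ("web01", "db01", 5432),
    ("web01", "cache01", 6379),
]
peers = learn_peers(baseline_flows)

# Later traffic: one known flow, one never seen before.
observed_flows = [
    ("web01", "db01", 5432),
    ("web01", "hr-fileserver", 445),
]
unexpected = new_connections(peers, observed_flows)
```

The unexpected flow is not automatically malicious. It could be lateral movement, or it could be a legitimate dependency nobody wrote down, which is exactly the ambiguity described above.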
Regardless of the reason for the suddenly abnormal behavior, we need to understand the “why” of the behavior. The “why” could be something like a change control request. In no way should a change or drift in a configuration be a snowflake: a single item unrelated to other items. Change control requests help us by ensuring we know the whys of a potential event. In reality, UEBA aids us in understanding those unknown events.
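One way to picture tying an abnormal event back to its "why" is to check it against approved change windows. The sketch below is illustrative only: the ChangeRequest record, its fields, and the ticket and host names are hypothetical and do not reflect any real change-management API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class ChangeRequest:
    """Hypothetical change control record: which entity changes, and when."""
    ticket: str
    entity: str
    start: datetime
    end: datetime

def explaining_change(entity, when, changes) -> Optional[ChangeRequest]:
    """Return the first change request covering this entity at this time, if any."""
    for change in changes:
        if change.entity == entity and change.start <= when <= change.end:
            return change
    return None

# Made-up approved change window for one server.
changes = [
    ChangeRequest("CHG-1001", "web01", datetime(2018, 2, 1, 22, 0), datetime(2018, 2, 2, 2, 0)),
]

# An anomaly inside the window has a documented "why"; one outside it does not.
documented = explaining_change("web01", datetime(2018, 2, 1, 23, 30), changes)
unexplained = explaining_change("web01", datetime(2018, 2, 3, 9, 0), changes)
```

Anomalies with no matching change request are the unknown events that need a closer look.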
How can UEBA detect what is different between an administrator and a really good electronic copy of an administrator? In most cases, the electronic copy is trying to do something abnormal—sort of like taking a left turn where the normal administrator takes a right or downloading more data than normal. But what if it was a server that was infected and copying what an administrator normally does, or an API, or some other protocol? While slightly different, it still ends up being a behavior that can be tracked.
The key to user and entity behavioral analytics now is the data model in use, but eventually it will be the ability to detect at finer granularities and with better machine learning mechanisms. We are still in the first-order analysis phase of this type of behavioral analytics. Getting to second order, where we can track unknown events over time against time, is an important step. At the moment for many security companies, that is on the horizon, not the here and now. At the same time, the engine needs to learn what is good vs. what is bad, and that takes some up-front threat intelligence and knowledge.