
When even the best won't cut it: the case for unsupervised learning

Anomaly detection systems have quickly become a standard network cyber security layer. This is in large part because anomaly detection and management systems offer proactive and preventative security via real-time detection and alerts, as well as quick mitigation.

Anomaly detection can be performed using a variety of methodologies at every step of the process (learning, detection, investigation, mitigation, etc.). In our previous blog post we discussed "learning process" methodologies, specifically a method called "Supervised Learning." Today we're coming full circle, turning our focus to the method called "Unsupervised Learning."


To recap



In unsupervised learning-based detection, the system goes through a learning period without anyone verifying the data it learns from. The system assumes that whatever it sees during the learning period is legitimate. This is a considerably less expensive process, but there is a downside: because the data is never validated, mistakes in learning can occur, and these reduce detection accuracy.


So why unsupervised learning?

Although supervised learning is the preferred method when model accuracy is the priority, there are cases in which unsupervised learning is required.

Let’s look back at the example we gave in the previous blog: Say that during the learning process the system sees a message that has 2 mandatory fields and 3 optional fields. The system looks at these fields and consults the reference model to check that this combination of 2 mandatory and 3 optional fields is legitimate. It sees that it is in fact legitimate. So the system is now allowed to learn the pattern “for this network, message x uses 2 mandatory fields and these 3 optional fields.”

Now, during the detection phase the same message comes in, but now it has the 2 mandatory, the 3 optional and a 4th optional which it has not seen in use before. This is the anomaly that the supervised method has trained the system to catch.
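
The supervised example above can be sketched in a few lines of Python. This is only an illustration of the idea, not imVision's implementation: the class name FieldProfile, the field names, and the shape of the reference model are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FieldProfile:
    """Hypothetical per-message profile, validated against a reference model."""
    mandatory: frozenset          # fields the reference model requires
    optional_allowed: frozenset   # optional fields the reference model permits
    optional_seen: set = field(default_factory=set)  # learned for this network

    def learn(self, fields):
        """Learn a message only if the reference model says it is legitimate."""
        fields = set(fields)
        extras = fields - self.mandatory
        if not self.mandatory <= fields or not extras <= self.optional_allowed:
            return False  # rejected: never becomes part of the learned pattern
        self.optional_seen |= extras
        return True

    def is_anomalous(self, fields):
        """Flag missing mandatory fields or optional fields never seen in use."""
        fields = set(fields)
        unseen = fields - self.mandatory - self.optional_seen
        return not self.mandatory <= fields or bool(unseen)


profile = FieldProfile(
    mandatory=frozenset({"m1", "m2"}),
    optional_allowed=frozenset({"o1", "o2", "o3", "o4"}),
)
profile.learn(["m1", "m2", "o1", "o2", "o3"])
```

In this sketch, a later message that carries "o4" is flagged even though the reference model allows it, because this network has never used it.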

So what happens if the same message was observed at a rate of 1000/second during the learning process and we now see it at a different rate? Is this ok? This is where unsupervised learning comes into play.

No one can provide a predefined reference model for the volume of messages, but the system can learn the expected volume and alert on significant deviations from the learned pattern. Unsupervised learning is ideal for volumetric anomalies, which cannot be detected using supervised learning because they can’t be defined a priori.
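
As a rough illustration of learning an expected volume, here is a minimal Python sketch that builds a baseline from rates observed during the learning period and flags rates that deviate strongly from it. The mean and standard deviation model, the function names, and the parameters k and min_std are assumptions chosen for simplicity; a production system would likely use something more robust.

```python
import statistics


def learn_rate_baseline(samples):
    """Summarize message rates (messages/second) seen during the learning period.

    Nothing validates the samples: they are simply assumed to be legitimate.
    """
    return statistics.fmean(samples), statistics.pstdev(samples)


def is_rate_anomalous(rate, baseline, k=3.0, min_std=1.0):
    """Flag a rate more than k standard deviations away from the learned mean.

    min_std is a hypothetical floor so that a perfectly flat learning period
    does not turn every tiny fluctuation into an alert.
    """
    mean, std = baseline
    return abs(rate - mean) > k * max(std, min_std)


# Rates observed for message x while learning, around 1000/second.
baseline = learn_rate_baseline([990, 1000, 1010, 1005, 995])
print(is_rate_anomalous(1010, baseline))  # within the learned range
print(is_rate_anomalous(1500, baseline))  # far above the learned range
```

Note that the check is two-sided: a sudden drop in volume is treated as a deviation just like a surge.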


And monitoring volumetric events is critical to network security

Say an attacker would like to map the network's elements to understand which element is which and what valid interfaces and ports are defined for each element, and to capture any additional intelligence she can get from the returned messages.

An attacker can start generating valid requests (valid in terms of the protocols' format) to the network to see which element replies, on which protocol and port, and what information is returned in the messages, thus performing extensive scanning. An attacker can even send requests per IMSI and map out valid IMSIs vs. invalid ones.

Ultimately, with this collected information the attacker can gain an understanding of the network's architecture and design more advanced attacks, which can cause severe damage ranging from service degradation to private data leakage and espionage.


How does monitoring volumetric events solve this problem?

Since each network element has its own set of allowed interfaces, communication ports and valid message fields, it is likely that most of the generated requests will result in no response or even in error messages. imVision’s Anomaly Management Platform, in turn, will detect this scanning scenario, even in cases of low-volume attacks, because the ratio of invalid requests to responses is monitored on a regular basis and at fine granularity, and compared with what is seen in the field.
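
To make the ratio idea concrete, here is a small Python sketch that tracks, per traffic source, the share of requests that ended in no response or an error, and flags sources whose share is well above a learned baseline. It is not a description of how imVision's platform works: the class ScanMonitor, its parameters, and the source names are hypothetical.

```python
from collections import defaultdict


class ScanMonitor:
    """Hypothetical per-source monitor of failed requests within one time window."""

    def __init__(self, baseline_ratio, factor=3.0, min_requests=20):
        self.baseline_ratio = baseline_ratio  # failure ratio learned from normal traffic
        self.factor = factor                  # how far above baseline counts as suspicious
        self.min_requests = min_requests      # avoid judging a source on a handful of messages
        self.total = defaultdict(int)
        self.failed = defaultdict(int)

    def observe(self, source, succeeded):
        """Record one request; succeeded is False for an error or no response."""
        self.total[source] += 1
        if not succeeded:
            self.failed[source] += 1

    def suspicious_sources(self):
        """Return sources whose failure ratio exceeds factor times the baseline."""
        limit = self.baseline_ratio * self.factor
        return sorted(
            source
            for source, count in self.total.items()
            if count >= self.min_requests and self.failed[source] / count > limit
        )
```

Because the check is based on a ratio rather than on raw volume, a source that sends only a few dozen probing requests can still stand out, which matches the low-volume scanning case described above.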

Anomaly management is, by nature, highly dynamic, so a hard-coded or predefined threshold may not be appropriate for all situations, even when the knowledge and ability to set one exist. A threshold that does not fit can lead to many false alarms and missed detection opportunities.

This is the advantage of unsupervised learning, which can be quickly updated and re-adapted to the real data.
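
One simple way to picture a baseline that keeps re-adapting to real data is an exponentially weighted moving average. The Python sketch below is an assumption-laden illustration, not the method used by any particular product: the class AdaptiveBaseline and the parameters alpha, k, warmup and min_std are all made up for the example.

```python
class AdaptiveBaseline:
    """Hypothetical baseline that tracks a metric with an exponentially weighted average."""

    def __init__(self, alpha=0.1, k=3.0, warmup=10, min_std=1.0):
        self.alpha = alpha      # weight given to each new sample
        self.k = k              # alert when deviation exceeds k standard deviations
        self.warmup = warmup    # number of samples to absorb before alerting
        self.min_std = min_std  # floor on the deviation estimate
        self.mean = None
        self.var = 0.0
        self.count = 0

    def update(self, value):
        """Check one sample against the baseline, then fold it in if it looks normal."""
        if self.mean is None:
            self.mean = float(value)
            self.count = 1
            return False
        deviation = value - self.mean
        std = max(self.var ** 0.5, self.min_std)
        anomalous = self.count >= self.warmup and abs(deviation) > self.k * std
        if not anomalous:
            # Only samples judged normal move the baseline, so a burst of
            # attack traffic does not immediately become the new "normal".
            self.mean += self.alpha * deviation
            self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.count += 1
        return anomalous
```

Excluding flagged samples from the update is a design choice with a trade-off: it protects the baseline from being poisoned, but a genuine, abrupt change in traffic will keep alerting until someone resets or retrains the baseline.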


imVision has you covered

imVision's Anomaly Management Platform (AMP) for mobile networks employs both advanced supervised learning methods, to optimize detection and reduce the rate of false alarms, and unsupervised learning, to provide a holistic solution to network anomaly detection.