I recently read an article [https://www.tag-cyber.com/articles/how-to-improve-situational-awareness-for-enterprise-security] called “How to Improve Situational Awareness for Enterprise Security,” by Edward Amoroso. It contains a Q&A-style interview with Reggie Best, Chief Product Officer at Lumeta. In response to a question, Reggie said, “In large, physical, static networks, where one can still point to much actual infrastructure sitting in data centers or closets…it is customary for 20% or more of assets identified in a competent scan to be unknown to the IT team.” In a company with 10,000 devices with IP addresses, 20% would be 2,000 devices. That’s a lot of unknowns to deal with. Artificial Intelligence has a significant role to play in unknown device and service discovery.

He goes on to make the point that networks aren’t static. They are constantly in flux, containing transitory mobile endpoints, virtual applications, virtual network functions, private and public cloud, and the emergence of software-defined everything. Even the best scanning tools will miss some of that activity. Some academic research indicates “as much as 40-50% of endpoints are being missed by periodic vulnerability scanning with the proliferation of mobile.”

To me, these unknowns represent a potentially growing attack surface. If a known host on the network starts communicating with an unknown host, a security person has to take notice. If I can’t see it, I can’t control it; if I don’t control it, someone else does; and if it’s on the network, it can represent a risk to the organization. Discovering these “missing hosts” can take a great deal of coordination between IT operations, line-of-business owners, and the security team.

Detecting these unknown hosts is best done by a system that can use information about known hosts to infer their existence and, in some cases, identify the services they are running. A form of AI (artificial intelligence) called Machine Reasoning has the potential to do just that. When analyzing our log data to interpret what’s happening in our network, we mostly look at what’s in front of us. We go through mountains of data laid out as key-value pairs. Some of this data records source and destination as src=192.168.0.1 and dest=10.10.140.5. We assume the destination host exists because it is mentioned in our log data. Yet we have no log data from 10.10.140.5 itself. To someone who didn’t look closely, this might seem normal and they wouldn’t give it a second thought. Should we be concerned?
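To make the src/dest example concrete, here is a minimal sketch of the simplest form of that inference: find every destination that known hosts talk to but that never reports any log data of its own. It is only an illustration, not a description of any product. The src and dest fields come from the example above; the host field (naming the device that wrote the log line), the sample lines, and the function names are assumptions made for this sketch.

```python
import re
from collections import defaultdict

# Matches key=value pairs such as src=192.168.0.1
KV_PATTERN = re.compile(r"(\w+)=(\S+)")

# Hypothetical sample data. The "host" field is an assumption: it stands
# for whichever field in your logs identifies the reporting device.
LOG_LINES = [
    "host=192.168.0.1 src=192.168.0.1 dest=10.10.140.5",
    "host=192.168.0.1 src=192.168.0.1 dest=192.168.0.7",
    "host=192.168.0.7 src=192.168.0.7 dest=192.168.0.1",
]


def parse_line(line):
    """Turn one key=value log line into a dict of fields."""
    return dict(KV_PATTERN.findall(line))


def find_silent_hosts(lines):
    """Return destinations that are referenced in the logs but never
    report logs themselves, mapped to the sources that contacted them."""
    reporters = set()                  # hosts we have log data from
    referenced_by = defaultdict(set)   # dest -> set of sources seen talking to it
    for line in lines:
        fields = parse_line(line)
        if "host" in fields:
            reporters.add(fields["host"])
        if "src" in fields and "dest" in fields:
            referenced_by[fields["dest"]].add(fields["src"])
    return {
        dest: sorted(sources)
        for dest, sources in referenced_by.items()
        if dest not in reporters
    }


if __name__ == "__main__":
    for dest, sources in find_silent_hosts(LOG_LINES).items():
        print(f"{dest} is referenced by {sources} but sends no logs")
```

With the sample lines, only 10.10.140.5 is flagged: 192.168.0.7 is also a destination, but it reports its own logs, so it is a known host. A silent destination is not automatically a problem (it may simply sit outside log collection), which is why the result is a list to investigate rather than a verdict.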

An AI-based system would use inference to bring this gap to our attention. It would use data from 192.168.0.1 to infer the existence of the other host and point it out as possibly a webserver or a server using a REST-based API. The host could have been left over from a time when we had a temporary supply chain partner who needed access to our network. The point is that AI inference is an important tool for discovering apps, databases, hosts and whole network segments that may not show up in traditional network scans.
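As a rough illustration of how data from a known host can hint at what an unseen host is running, the sketch below guesses a service from the destination port that known hosts use most often when contacting it. This is a plain lookup table, far simpler than real machine reasoning, and it rests on assumptions not found in the example above: a dport field in the logs, the sample lines, and the function name. The port numbers themselves are the standard well-known assignments.

```python
import re
from collections import Counter

# Matches key=value pairs such as dport=443
KV_PATTERN = re.compile(r"(\w+)=(\S+)")

# Standard well-known ports mapped to a likely service description.
PORT_HINTS = {
    "80": "web server or REST API (HTTP)",
    "443": "web server or REST API (HTTPS)",
    "22": "SSH server",
    "3306": "MySQL database",
    "5432": "PostgreSQL database",
}

# Hypothetical sample data; "dport" is an assumed field name.
LOG_LINES = [
    "src=192.168.0.1 dest=10.10.140.5 dport=443",
    "src=192.168.0.1 dest=10.10.140.5 dport=443",
    "src=192.168.0.1 dest=10.10.140.5 dport=22",
]


def guess_service(lines, dest_ip):
    """Guess what dest_ip is running from the port most often used to reach it."""
    ports = Counter()
    for line in lines:
        fields = dict(KV_PATTERN.findall(line))
        if fields.get("dest") == dest_ip and "dport" in fields:
            ports[fields["dport"]] += 1
    if not ports:
        return "unknown"
    top_port, _count = ports.most_common(1)[0]
    return PORT_HINTS.get(top_port, f"unrecognized service on port {top_port}")


if __name__ == "__main__":
    print("10.10.140.5 looks like:", guess_service(LOG_LINES, "10.10.140.5"))
```

A port number is only a hint, since any service can listen on any port. A fuller system would weigh more evidence, such as request patterns or timing, before labeling a host.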

AI can look past the data we have and help us determine what that data means.

For more information about our use of AI, please download [http://info.geminidata.com/SituationalAwarenessWP] our latest white paper.