What’s common between staying in the best part of town and managing IT?

Correlation of Accommodation by Geo-location

Travel is fun if you know where you want to go. I usually don’t have much personal time to visit all the places I wish I could, so planning is important. Over the years I have amassed a collection of saved bars, restaurants, and other places of interest to visit should I happen to be nearby. Most of these I read about at some point and simply bookmarked for later.

For a recent stopover in France, I used the map view under bookmarks in Yelp to show the places I had saved for Paris. Then I did a map-based search in Kayak to find the best hotel within my price range and closest to all those Yelp bookmarks. It worked out really well. I was within blocks of where I wanted to be for the one night I was staying over, and I had a blast.
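The manual process above amounts to a filter on price followed by a nearest-to-bookmarks ranking. Here is a minimal sketch of that idea in Python. The `Place` and `Hotel` types and the `best_hotel` function are hypothetical names made up for illustration, "closest" is assumed to mean the lowest average great-circle distance to the bookmarks, and nothing here reflects how Yelp or Kayak actually store or expose their data.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import Optional, Sequence

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Place:
    """A bookmarked bar, restaurant, or sight (hypothetical type)."""
    name: str
    lat: float
    lon: float


@dataclass(frozen=True)
class Hotel:
    """A hotel offer with a nightly price (hypothetical type)."""
    name: str
    lat: float
    lon: float
    price: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometers."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def best_hotel(
    hotels: Sequence[Hotel], bookmarks: Sequence[Place], max_price: float
) -> Optional[Hotel]:
    """Return the affordable hotel with the lowest mean distance to the bookmarks."""
    affordable = [h for h in hotels if h.price <= max_price]
    if not affordable or not bookmarks:
        return None

    def mean_distance(hotel: Hotel) -> float:
        total = sum(haversine_km(hotel.lat, hotel.lon, b.lat, b.lon) for b in bookmarks)
        return total / len(bookmarks)

    return min(affordable, key=mean_distance)
```

The ranking itself is the easy part. The hard part, as the next paragraph argues, is getting bookmarks and hotel offers into one place in a shape that lets you compare them at all.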

When I discussed this “life hack” with Navin Ganeshan, our Chief of Product, he observed, “I’m surprised there is not one application that would let you do both.” That got me thinking: both Yelp and Kayak are purpose-built applications. Yelp helps track and rate consumer businesses and places of interest, and Kayak focuses on flight and hotel fares. Although they both care about geographical locations, the overlap is otherwise limited. Because the two applications are built for dissimilar use cases, their respective schemas are going to store the data differently. So solving the problem of booking a nice hotel surrounded by your favorite bars would require yet another purpose-built solution.

Universal Understanding of IT Data

Overwhelmed with data from point solutions and multiple ways of describing that data, the cybersecurity industry has given up on the idea of data-driven infosec management that we aspired to during the heyday of SIEMs. No single solution can understand all of this data, and those that try require extensive customization. Many lengthy and costly projects focus on singular use cases and frequently end up as shelfware. One of our customers had spent over $250k on software and services to integrate their issue tracking application with a change management tool. The system was built to issue a single type of alert and literally did nothing else. It was scrapped after one of those applications released an update later that year that provided the same functionality.

The number of data points relevant to a security practitioner is much larger than it is for an average consumer. Not only do we care about time-based and location-based aspects of our data (physical as well as logical, like IP address mapping), but also about literally hundreds of other relevant factors, like privileges, configurations, infrastructure, policies, business functions, and so forth. It is this variety that makes centrally tracking and making use of all of that data extremely difficult with current tools. That’s why your Compliance team’s favorite tool may still be the spreadsheet.

In IT Operations and in InfoSec, we painfully lack an effective Universal Schema that can cover all these data points and allow us to ask any questions we have, for any scenario. Forget identity and threat modeling on top of event data. What we need is something that is able to understand and map top-to-bottom ***any*** IT data, from binary memory dumps to business drivers and risks, and everything in between.

This is exactly why I’m fired up about the work we’re doing on the open-world data ontology for Gemini Enterprise (https://www.geminidata.com/products/gemini-enterprise/), delivered in no small part thanks to Conrad Constantine, our resident Security Data Strategist. It works by breaking technical information down to its principal meaning and, by understanding the relationships within the data, building a foundational knowledge layer for all of IT. The “open-world” part means it’s not built for any single purpose and is capable of growing in any direction, including areas we haven’t yet thought of, without impact on performance. Because of this, the data we collect can drive an incredible number of use cases and is useful to any IT professional, from a cybersecurity analyst to a technical leader.
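To make the “open-world” idea a little more concrete, here is a toy sketch of a generic subject-predicate-object store in Python. This is not how Gemini Enterprise is implemented; the `TripleStore` class, its methods, and every entity and relationship name in the example are hypothetical and chosen only to illustrate the point that new kinds of relationships can be added later without changing a fixed schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


class TripleStore:
    """Toy store of (subject, predicate, object) facts. Illustrative only."""

    def __init__(self) -> None:
        self._by_subject: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # Any predicate is accepted; there is no fixed list of columns or types.
        self._by_subject[subject].add((predicate, obj))

    def objects(self, subject: str, predicate: str) -> Set[str]:
        """All objects linked to `subject` by `predicate`."""
        facts = self._by_subject.get(subject, set())
        return {o for p, o in facts if p == predicate}

    def follow(self, start: str, predicates: Iterable[str]) -> Set[str]:
        """Walk a chain of predicates from `start` and return where it ends."""
        frontier = {start}
        for predicate in predicates:
            frontier = {o for s in frontier for o in self.objects(s, predicate)}
        return frontier


# Made-up facts that might come from different point tools.
store = TripleStore()
store.add("alice", "has_account", "alice@corp")
store.add("alice@corp", "logged_in_from", "10.0.0.5")
store.add("10.0.0.5", "assigned_to", "laptop-42")

# A relationship nobody planned for is just another fact, added later.
store.add("laptop-42", "supports_function", "payroll")

affected = store.follow(
    "alice", ["has_account", "logged_in_from", "assigned_to", "supports_function"]
)
```

In this sketch, a question that spans identity, network, asset, and business data becomes a walk across relationships rather than a custom integration between two tools.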

If you ask me whether our solution (https://www.geminidata.com/products/gemini-enterprise/) could theoretically help with correlating your favorite bars with nice hotels, I would say that our schema can certainly support it.