Telling Lies to Machines is Easy

We have built machines that you can lie to.

At its heart, breaking into software requires the same skills as testing it for quality. Anyone who has taken a basic computer programming class has been introduced to the notion that your software must have some basic distrust that the end user will input exactly what you ask for. Cast your mind back to the first simple program you wrote in a high school computing class and you likely remember something like this:

What is your age?

purple
fatal: value error at line 12

With high-level languages, the language interpreter will catch bogus inputs like this and usually cease execution. It's up to the developer to catch these conditions and handle them with an appropriate response. Lower-level languages (like C) place far more trust in the developer, happily accepting this series of ASCII-encoded characters and attempting to treat them as a number (likely just the value of 'p': 112).
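A minimal sketch of that developer-side handling in a high-level language (Python here, purely for illustration; the error output above is from an unspecified interpreter):

    def ask_age() -> int:
        # The interpreter raises ValueError for input like "purple"; it is
        # up to the developer to catch it rather than let the program die.
        while True:
            raw = input("What is your age? ")
            try:
                return int(raw)
            except ValueError:
                print(f"'{raw}' is not a number, please try again.")

    # In C, by contrast, atoi("purple") silently returns 0, and reading the
    # raw bytes just hands you character codes ('p' is 112).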

In most cases, this sort of error-checking is exactly that: checking for errors in input and monitoring for unintended behavior as a result of unexpected input. A favorite joke of Quality Assurance Engineers goes a little like this:

A QA Engineer walks into a bar and orders a beer. Orders 100 beers. Orders 0 beers. Orders -1 beers. Orders 65535 beers. Orders yellow beers. Orders a goat.
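In testing terms the joke is boundary-value analysis. A sketch of what that probing looks like, using pytest and a hypothetical order_beers() function standing in for the bar:

    import pytest

    def order_beers(quantity):
        # Hypothetical function under test: serve sensible orders, refuse the rest.
        if not isinstance(quantity, int) or quantity < 0 or quantity > 1000:
            raise ValueError("invalid order")
        return f"{quantity} beers coming up"

    @pytest.mark.parametrize("quantity", [1, 100, 0, -1, 65535, "yellow", "a goat"])
    def test_bar_survives_strange_orders(quantity):
        # The bar should either serve the order or refuse it cleanly; any
        # other behavior is the kind of bug QA files and intruders exploit.
        try:
            assert isinstance(order_beers(quantity), str)
        except ValueError:
            pass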

Now, while a QA test is looking for input that halts execution or causes incorrect functionality, a system intruder is looking for unintended functionality, better known today as 'software vulnerabilities.' A developer tells software what to do. An intruder seeks 'blind spots' they can use to tell the software a lie that will be believed and acted upon.

This is the big difference between physical engineering and information engineering. No matter how much you insist, you are never going to be able to convince a rope bridge that it will hold the weight of your 18-wheeler truck. But the lies you tell to an information system can actually change how it functions.

When it comes to information, the same thing holds true in human and software systems alike: a lie is powerful, manipulative and often self-replicating. "A Lie Can Travel Halfway Around the World While the Truth Is Putting On Its Shoes" could just as easily be a saying about how network worm malware functions.

Building Machines That Know the Truth is Difficult

This brings us to the great problem of modern computing - how do we build machines that know when they are being actively lied to, and not just receiving incorrect, error-filled input? Ask any security analyst who has spent time working alerts from intrusion detection systems and they'll happily weigh in on how these systems tend to be more useful for flagging connections between mismatched client/server versions and outdated software installations than for catching actual intrusions. Signature-based IDS can provide useful evidence to bolster an existing investigation, but it has a reputation for being a huge source of wild goose chases.
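To make "signature" concrete, here is a naive sketch of what such a detection boils down to: scan each packet payload for a known byte pattern and alert on a match (the patterns below are hypothetical, not drawn from any real ruleset):

    # Naive signature matching over packet payloads, purely for illustration.
    SIGNATURES = {
        "traversal-attempt": b"../../etc/passwd",
        "old-worm-payload": b"\x90\x90\x90\x90\x31\xc0",  # a NOP-sled fragment
    }

    def match_signatures(payload: bytes):
        return [name for name, pattern in SIGNATURES.items() if pattern in payload]

    print(match_signatures(b"GET /../../etc/passwd HTTP/1.0"))  # ['traversal-attempt']
    print(match_signatures(b"GET /index.html HTTP/1.0"))        # []

An attacker who changes a single byte of a known payload walks straight past a rule like this, which is part of why these systems age so poorly.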

SIEMs and correlation rule engines gave some ability to apply descriptive logic to IDS alerts. A good example is when a host is the target of something alerted on by the IDS and soon after becomes the source of the same alert - a strong indicator that the host has been compromised and is now being used as a new source from which to deliver the same exploit.
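A rough sketch of that correlation rule (hypothetical alert records and field names, not any particular SIEM's query language):

    from datetime import datetime, timedelta

    # Hypothetical alert record: (timestamp, signature_id, source_ip, dest_ip)
    def find_target_turned_source(alerts, window=timedelta(minutes=30)):
        """Flag hosts that show up as the destination of a signature and,
        shortly afterwards, as the source of that same signature."""
        suspects = []
        for ts, sig, src, dst in alerts:
            for later_ts, later_sig, later_src, _ in alerts:
                if (later_sig == sig and later_src == dst
                        and ts < later_ts <= ts + window):
                    suspects.append((dst, sig, later_ts))
        return suspects

    alerts = [
        (datetime(2016, 5, 1, 9, 0), "SMB_EXPLOIT", "203.0.113.7", "10.0.0.5"),
        (datetime(2016, 5, 1, 9, 12), "SMB_EXPLOIT", "10.0.0.5", "10.0.0.9"),
    ]
    print(find_target_turned_source(alerts))
    # [('10.0.0.5', 'SMB_EXPLOIT', datetime.datetime(2016, 5, 1, 9, 12))]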

But all of these scenarios require a human being to describe and encode the detections on a case-by-case basis and adapt them as rules. As the technology landscape is constantly changing, this is an ongoing challenge. Software systems inevitably have 'tunnel vision': they can only perceive things through the limited viewpoint of their own functionality. In the earlier days of information security, when the threat landscape was far smaller, we could rely on these single-scope systems much more consistently. Exploits would hit the perimeter systems and compromise them through predictable, identifiable packet payloads. Simpler times.

Today, attackers know that their targets have signature detections and logging systems everywhere. Their incentive is to blend into normal activity as much as possible: to take control of legitimate credentials and to use the target's own tools against them.

One way to tackle this is to find the lie and infer the intent behind it. Humans are creatures of habit and our usage patterns are predictable. Activity that falls statistically outside of those patterns is reason for concern. But statistical improbability is still not a smoking gun of intent, and these anomaly-driven systems have been tarred with the same reputational brush as a source of dead-end wild goose chases for security analysts. Quantitative detection always assumes that a complex conspiracy of carefully picked actions can be described in aggregate. "Find the Advanced Persistent Threat in this Bar Chart!"
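A minimal sketch of what that anomaly scoring amounts to (hypothetical login-hour data and a simple z-score, nothing like a production behavioral-analytics system):

    from statistics import mean, stdev

    def is_anomalous(history_hours, new_hour, threshold=3.0):
        """Flag a login hour that sits far outside a user's historical pattern."""
        mu = mean(history_hours)
        sigma = stdev(history_hours) or 1.0  # guard against a perfectly flat history
        return abs(new_hour - mu) / sigma > threshold

    # A user who habitually logs in around 9am...
    usual_logins = [9, 9, 10, 8, 9, 10, 9, 8, 9, 9]
    print(is_anomalous(usual_logins, 3))   # True: a 3am login is improbable
    print(is_anomalous(usual_logins, 10))  # False: well within the routine

The sketch also shows the weakness described above: a 3am login is statistically improbable, but improbable is not the same as malicious.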

"When you have eliminated the impossible, whatever remains, no matter how improbable, must be The Truth" Sherlock Holmes may be fictional, but the practice of Deductive Reasoning used by him to solve crimes is very real. This concept of Deductive Reasoning can be applied to effectively solve technology problems.

At first pass, this seems like an endorsement of purely technical solutions for determining the veracity and intent of actions within an information system. And yet, the realm of 'impossible' has far more permeable borders than we may at first think. Remember: this article started out by pointing out that 'possible' and 'impossible' within information systems are largely the product of consensus.

That consensus is built from a combination of the shared technologies in use on your infrastructure and the purposes to which you apply that technology. Off-the-shelf security controls have always been hamstrung by limiting themselves to describing only what is malicious within the scope of a particular technology. You can add safety controls to a product and make an automobile that cannot exceed a certain speed, but you can't make an automobile that won't drive onto a private parking lot.

Intent becomes apparent when the larger context of an action is considered. If Bob logs into your company's source code control system, no off-the-shelf security control signature can indicate this to be problematic. But what if Bob is logging into your source control system from a machine assigned to a person in your marketing department? Sure, anomaly detection could raise a flag on this, but something far simpler describes it with more accuracy: this is a situation that should never happen under your policies for how your business infrastructure operates, no anomaly detection required.
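A sketch of that kind of descriptive policy logic (hypothetical asset and authorization data; in reality this context lives in HR systems, asset inventories and the heads of business operators):

    # Hypothetical inventories: which department owns each machine, and which
    # departments are allowed to touch each system.
    MACHINE_DEPARTMENT = {"MKTG-LAPTOP-042": "marketing", "DEV-WKS-007": "engineering"}
    AUTHORIZED_DEPARTMENTS = {"source-control": {"engineering"}}

    def violates_policy(system, machine):
        """True when a login contradicts how the business says it operates."""
        dept = MACHINE_DEPARTMENT.get(machine, "unknown")
        return dept not in AUTHORIZED_DEPARTMENTS.get(system, set())

    # Bob's credentials are valid either way; only the context differs.
    print(violates_policy("source-control", "DEV-WKS-007"))      # False
    print(violates_policy("source-control", "MKTG-LAPTOP-042"))  # True

No statistics involved: the rule simply encodes a fact about how the business operates.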

Major security breaches aren't about defeating your infrastructure as an end goal. They are about acquiring your information. If attackers are using your infrastructure to acquire your information, it's because they have succeeded in lying not just to your software but to your business processes as well.

Let me cut to the chase. You cannot identify a lie without seeing it in the context in which it's told, and you cannot defend an enterprise's information by purely technical means. I can tell you from personal experience that most major security breaches have not been identified by technical means. They've been identified by an employee with extensive experience of the organization's operations looking at something and thinking, "That doesn't look right. That shouldn't happen, given how we do things around here."

Security controls and information systems helped raise visibility, but none of them brought together the overarching context that shines the light of intent upon things. The descriptive logic of how things should be done is still locked in the heads of the business operators, unavailable for machines to operate on and compare against. It's not hard to convincingly lie to people, but it is trivially easy to lie to machines. This is what we're setting out to change.