Survivorship Bias and Infosec

A great tweet in my feed recently tied survivorship bias to IT security:

Books on cognition and cognitive biases often reference this somewhat famous bit of thinking by a man named Abraham Wald.  If you are interested, you can read more details about Wald’s observation here.

In this case, planes with certain areas damaged never made it back, and so no (or few) planes analyzed had damage to those areas.  All of the damage observed happened on parts of the plane that could be damaged while allowing the plane to return.
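The selection effect is easy to see with a toy data set.  This is a minimal sketch in Python, and the sortie records are invented purely for illustration (they are not Wald’s data): counting damage only on the planes that returned makes the engine and cockpit look almost untouched, even though they took the most fatal hits.

```python
from collections import Counter

# Hypothetical sortie records: (area_hit, returned_to_base).
# The numbers are invented purely to illustrate the selection effect.
SORTIES = [
    ("wings", True), ("wings", True), ("wings", True),
    ("fuselage", True), ("fuselage", True),
    ("tail", True),
    ("engine", False), ("engine", False), ("engine", False),
    ("cockpit", False), ("cockpit", False),
    ("engine", True),
]


def damage_counts(sorties, survivors_only):
    """Count hits per area, optionally restricted to planes that returned."""
    return Counter(
        area for area, returned in sorties
        if returned or not survivors_only
    )


# What the analysts on the ground can see versus what actually happened.
observed = damage_counts(SORTIES, survivors_only=True)
actual = damage_counts(SORTIES, survivors_only=False)

for area in sorted(actual):
    print(f"{area:9s} observed={observed[area]} actual={actual[area]}")
```

The gap between the two columns is the silent evidence: the records that never reach the analyst.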

Nassim Taleb called this “silent evidence” in his book “The Black Swan”.  Arthur Conan Doyle paid a bit of homage to this phenomenon in the plot of a Sherlock Holmes story where someone was murdered, but upon questioning, the neighbors didn’t recall hearing the victim’s dog barking.  The absence of a barking dog in this Holmes story indicated that the perpetrator was someone the dog knew well.

As Florian points out in his tweet, we often react to the data we have without recognizing what data is missing when we form a hypothesis or make a recommendation, such as investing in a new WAF in response to logs indicating attacks on a web server.  It’s a great point: the decision to buy a WAF is made without the benefit of knowing which of the myriad other attack vectors are being used, possibly successfully, against the organization, because there is no log information about them.

This raises a complex question: how do we know what we don’t know?  Ultimately, we, as security people, have to take action and don’t have the luxury of philosophizing on the nature of uncertainty.  We must make decisions under uncertainty, often quickly.  What to do?

Here is the way I would approach this: I may have logs indicating persistent attacks on our web site, but the question I would ask is whether we have evidence that any of those attacks are successful, or will likely be successful.  There’s nothing surprising about a public web site being attacked – everything that has an internet routable IP address is constantly being scanned, probed, infected, and/or compromised.  Since I do not have unlimited money to spend on security, I have to assess whether the web server is the most pressing thing to address.  I need to consider what data I’m missing.  In this case, I’m missing all of the other data about all other types of attacks that might be happening.  Am I able to detect those attacks?  If not, would I know if any were successful?

When approaching a problem like this, it’s good to start with the basics.  If I am not able to detect various types of attacks, it’s tough to prioritize where to implement control enhancements, therefore a good place to start is with improving visibility using tools such as EDR, antivirus, IDS, and so on, depending on the situation.  It’s been my experience that many organizations are simply unable to detect successful attacks, and so live blissfully ignorant of their data walking out the door.  The act of enhancing visibility into such attacks often identifies serious problems that need to be addressed quickly.  It’s at this point that I can compare the importance of a WAF against some other control.

Enhancing visibility doesn’t (necessarily) lead to improved controls as quickly as, say, running out and buying a WAF, but the visibility enhancements will help with prioritization and with building a business case for funding additional security controls.  The investment in visibility is not throw-away: even after new preventive controls are in place, the ability to detect malicious activity is still vitally important, and can help refine other controls or identify shifts in adversarial tactics.

One problem I’ve experienced with improving visibility during my career is that, at least for a period of time, the perceived security of a company seems to take a serious turn for the worse because I’m detecting things that were happening previously, but which we simply didn’t know were happening.  Any time we engage in such a program, it’s important to set expectations appropriately.

For more reading on this, I recommend the following books:

 

Opportunity in Adversity

My wife and I drove from our home in Atlanta to Panama City, Florida yesterday.  It’s been approximately 2 months since Hurricane Michael ripped through this part of Florida.  We are here to deliver Christmas presents we and our friends, neighbors, and coworkers donated. 

I’ve seen the aftermath of fires, floods, and tornadoes many times.  What I saw was beyond anything I have experienced.  In one neighborhood we visited, nearly every house on the block had blue tarps on the roofs.  The homeowner we spoke with said she felt lucky because all of the houses on the next block were gone.  Simply gone.  I saw houses torn in half and entire forests of trees all snapped halfway up.  Many buildings in the area have one or more exterior walls blown out, as if a bomb went off inside.  This apparently happens when wind finds a way in on the other side of the building.  This damage goes on for miles and miles.  I’ve been told that the area I visited, while bad, was not the worst hit by a long shot because it was on the western side of Michael’s eye, meaning that the winds blew out to sea.  The area to the east not only had roughly the same winds, but also massive storm surge from the wind blowing the Gulf of Mexico inland.

From what I saw, older structures and trees suffered most, which is not terribly surprising.  I was struck by the metaphor, albeit on a much different level of significance, that this situation has with information technology.  Buildings designed and constructed 30 or 40 years ago were not designed to the same standards as those built along Florida’s coast today.  As storms pass through, the older structures can be destroyed, as many were in Hurricane Michael.

I see a similar story unfold with corporate IT.  Older environments are often not designed to withstand the attacks leveled at them today.  IT environments designed today will not withstand attacks in five or ten years.  Upgrading these environments to withstand those attacks is often prohibitively expensive, at least as assessed prior to a devastating attack.

We seem to be in a situation where all but the most forward-looking organizations wait until a storm comes to force the investment needed to modernize their IT.  The challenge, as we repeatedly see, is that the ultimate victims harmed in such attacks are not so much the organization itself, but rather the people whose data the organization holds.  Because of that, the calculus performed by organizations seems to favor waiting, either knowingly or unknowingly, for the storm that forces structural enhancements to their IT environments.


Thoughts About Counter-Forensics and Attacks on Logs

This morning, I read this story on ZDNet about a report from Carbon Black.  The report indicates that 72% of Carbon Black’s incident response group reported working on cases where the adversary destroyed logs.  Generally, such stats aren’t particularly insightful for a variety of reasons[1]; however, it should be intuitive that an adversary has a vested interest in obscuring his or her illicit activities on a compromised system.

Control 6 of the CIS Top 20 Critical Cyber Security Controls touches on this point by recommending that systems send logs to a central log collector, but the intention is more on log collection for the purpose of aggregation and monitoring, such as with a SIEM, rather than for tamper resistance, though that is a likely side effect.  Sending logs to a remote system is a good way to ensure proper logs exist to analyze in the wake of a breach.  Also note that, in addition to deleting locally stored logs, many adversaries will disable a system’s logging service to prevent new logs from being stored locally or sent to a log collector.

Here are a few recommendations on logging:

  1. Send system logs to a log collector that is not part of the same authentication domain as the systems generating the logs.  For example, the SIEM system(s) collecting/monitoring logs should not be members of the same Active Directory domain as those that generate the logs.
  2. Configure the SIEM to alert on events that indicate logging services were killed (if possible).
  3. Configure the SIEM to generate an alert after a period of inactivity from any given log source.
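
The inactivity check in the last recommendation can be sketched in a few lines.  Most SIEM products have their own mechanism for this, so treat the following Python as an illustration of the logic only; the function name, host names, timestamps, and one-hour threshold are all hypothetical.

```python
from datetime import datetime, timedelta


def silent_sources(last_seen, now, max_silence):
    """Return the log sources whose most recent event is older than max_silence."""
    return sorted(
        source for source, seen_at in last_seen.items()
        if now - seen_at > max_silence
    )


# Hypothetical data: the time of the most recent event from each log source.
now = datetime(2018, 12, 1, 12, 0)
last_seen = {
    "web01": datetime(2018, 12, 1, 11, 58),
    "db01": datetime(2018, 12, 1, 9, 15),
    "dc01": datetime(2018, 12, 1, 11, 30),
}

# Flag any source that has been quiet for more than an hour.
quiet = silent_sources(last_seen, now, max_silence=timedelta(hours=1))
print(quiet)
```

The threshold needs tuning per source: a busy domain controller going quiet for ten minutes is suspicious, while a rarely used appliance may legitimately say nothing for a day.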

 

[1] I need to write a blog post on the problems with reports that are based on surveys of a population.  For now, I’d encourage you to read up on these problems yourself.  It’ll make you a better consumer and a better person.

NCSAM Day 22: To the Cloud!

Yesterday, I wrote about the importance of continued learning to stay relevant.  A specific area I want to highlight is cloud computing.  For better or for worse, cloud computing is the future of IT for most organizations, and as I wrote earlier, the cloud is not magical and brings new security challenges for us to solve.  At the same time, cloud computing brings fundamentally new security and recovery capabilities that simply weren’t practical or possible in traditional infrastructure.  We have an opportunity to not only help our organizations embrace this transformational technology, but also to make some substantial security enhancements as well.  To do this, though, we need to deeply understand cloud computing.

 

Some good places to start are:

NCSAM Day 17: Inventory Your Components

An often-overlooked aspect of vulnerability management is the software components that exist on a system, such as PHP, Apache Struts, and Ghostscript.  These components are often dependencies of other applications.  If the packages are installed through a normal package manager, like yum or apt, updates should be applied during periodic updates.  There are three things to be aware of, though:

  1. If a package goes end of life, like what is about to happen to PHP 5, updates may simply and silently stop being applied, leaving a potentially vulnerable piece of software running on a system.
  2. If a component is custom compiled, a package manager will not apply updates. Note: this is an argument in favor of using binaries provided by mainstream repositories.
  3. Vulnerability scans may not be able to detect vulnerabilities in such components, particularly if using unauthenticated scans.

As we move toward infrastructure-as-code, maintaining these inventories should be less taxing, since the configuration definition for systems should explicitly contain the packages installed.  If not, then you’re doing IaC wrong.

Create a list of all these components that exist in your environment, and determine what process is used to identify a vulnerability in them and ensure each is updated when necessary.  Many may be updated in the normal course of running operating system updates, while others may require manual tracking to identify when to download, compile, and install updated source code.
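
A rough sketch of what such an inventory check might look like in Python follows.  The component entries, install methods, and end-of-life date are placeholders I made up for illustration; a real inventory would come from your configuration management data, not a hard-coded dictionary.

```python
from datetime import date

# Hypothetical inventory: component -> (how it was installed, end-of-life date or None).
# The entries and dates are placeholders; fill them in from your own environment.
INVENTORY = {
    "php5": ("package", date(2018, 12, 31)),
    "struts": ("bundled", None),
    "ghostscript": ("source", None),
}


def needs_manual_tracking(inventory, today):
    """Return components that a routine package update will not keep patched."""
    flagged = {}
    for name, (install_method, eol) in inventory.items():
        if install_method != "package":
            # Custom-compiled or bundled components never see package updates.
            flagged[name] = "not managed by the package manager"
        elif eol is not None and eol <= today:
            # Packaged, but upstream has stopped shipping fixes.
            flagged[name] = "past end of life"
    return flagged


flagged = needs_manual_tracking(INVENTORY, today=date(2019, 1, 15))
for name, reason in sorted(flagged.items()):
    print(f"{name}: {reason}")
```

Anything the function flags needs an owner and a process, because nothing automatic is going to patch it.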

It’s hard to manage what you don’t know you have.

Day 14: Understand the Limitations of Security Awareness Training

We alternately hear “people are the first line of defense” or “people are the last line of defense” in cyber security.  I haven’t figured out which one is true.  Regardless, we need to understand that there are limits to the effectiveness of awareness training and that our first line of defense or our last line of defense (whichever is correct) is quite fallible.

It comes as no surprise to anyone that training humans is not like defining a rule base in a firewall.  We tell a firewall what network traffic to permit, and what to block based on attributes of the traffic.  Similarly, we train our employees on how to identify and resist various types of attacks.  A firewall will dutifully and predictably follow the rules it was programmed with.  Humans, however, are a different story.

Let’s imagine for a moment that we have developed a perfect security awareness program.  It clearly communicates dos and don’ts, how to spot attacks, how to report problems, and so on, in a way that is memorable and engaging.  I propose that the outcome will be significantly less than perfect, because of the following factors:

  • People act irrationally under stress from things such as health problems, family problems, medication, and lack of sleep
  • Any given person will act upon the same set of conditions differently based on the time of day, proximity to lunch, day of the week, and many other factors that affect his or her frame of mind at the time
  • People in a business setting generally have incentives that may, at least some of the time, run contrary to the recommendations of awareness training, such as project deadlines, management expectations, and so on.

This should tell us that awareness training is, at best, a coarse screen that will catch some problems, but allow many others to pass unimpeded.  As such, we should focus on providing awareness education that provides the biggest value, in terms of outcomes, and then focus our remaining effort on enhancing process and technical controls that are designed to provide more predictable, and repeatable security outcomes, similar to the operation of a firewall.

On a related note, I personally think it’s irresponsible to pin the safety of an organization’s systems and data on an employee recognizing a potentially sophisticated attack.  For this reason, I think it is incumbent on us to develop and implement systems that are resilient to such attacks, and allow employees to focus on their job duties.

NCSAM Day 1: Multifactor Authentication

Enable multifactor authentication everywhere it is feasible to do so.  Where it’s not feasible, figure out how to do it anyway, for example by using an authenticated firewall in front of a device that doesn’t support MFA.

For many years, sophisticated adversaries have leveraged legitimate credentials in their attacks.  At the same time, organizations have struggled mightily to get their employees to pick “strong” passwords through such clever devices as a password policy that includes a minimum length and a certain diversity of character types, giving rise to the infamous “Password1”.  This problem holds true for server, network, and database administrators, too.

Three shifts in the threat landscape make multifactor authentication more important than ever:

  1. The number of ways that an adversary can obtain a password continues to grow, from mimikatz, to hashcat.
  2. The techniques that were once the domain of sophisticated adversaries are diffusing into the broader cyber crime ecosystem, such as we see with SamSam.
  3. The move to borderless IT – cloud, SaaS, and so on – means that the little safety nets our firewalls once provided are all but gone. Microsoft recently announced that it is deprecating passwords in favor of multifactor authentication on some of its cloud services, such as Azure Active Directory.

Multifactor authentication is not cutting edge.  This is 2018.  I first used a multifactor authenticated system in 1998 and it worked well back then.

Some gotchas to be aware of:

  • As has been widely reported, SMS-based multifactor authentication is not advised due to numerous ways adversaries can defeat it
  • Any multifactor scheme that either stores the second factor on (such as a certificate) or delivers the second factor to (such as an email) the computer that is being authenticated from is less than ideal, given that a major use case is one where the workstation is compromised. Adversaries can use certificates on the system along with a captured password to do their deeds.
  • A common adversarial technique to get around multifactor authentication is the helpdesk. Be sure to develop a reasonably secure means of authenticating employees who are having trouble, and a way of providing an alternate authentication method if, for example, someone loses their phone.

P.S. Authentication is pronounced Auth-en-ti-cation, not Auth-en-tif-ication.  Thank you.

Cyber Security Awareness Month

Tomorrow starts National Cyber Security Awareness Month (NCSAM).  I’m going to take a break from my normal complaining about what does not work and attempt to write a post per day for the next month with suggestions for making improvements based on things I’ve learned the hard way.  NCSAM normally focuses on the “user” experience, but in keeping with the intent of this site, I’ll be focusing on improvements to organizational IT and IT security.  I hope that none of what I will post is new or revolutionary to anyone who is familiar with IT security; however, a reminder and some additional context never hurts.

Stay tuned…

Infosec #FakeNews

In the infosec industry, much of the thought leadership, news, and analysis comes from organizations with something to sell.  I do not believe these groups generally act with an intent to deceive, though we need to be on guard for data that can pollute and pervert our understanding of reality.  Two recent infosec-related posts caught my attention that, in my view, warrant a discussion.  First is a story about a study that indicates data breaches affect stock prices in the long run.

Here is the story: https://www.zdnet.com/article/data-breaches-affect-stock-performance-in-the-long-run-study-finds/

Here is the study: https://www.comparitech.com/blog/information-security/data-breach-share-price-2018/

Most of us who work in the security world struggle to justify the importance of new and continued investment and focus on IT security initiatives, and the prospect of a direct linkage between data breaches and stock price declines is a wonderful thing to include in our PowerPoint presentations.  As humans, we are tuned to look for information that confirms our views of the world, and the results of this study seem intuitively correct to most of us.  We WANT this study to be true.

But as with so many things in this world, it’s really not true.  To the credit of the study’s authors, the study includes a section on the limitations of the study, but that really doesn’t detract from the headline, does it?  So, I propose an alternate headline: “Data Breach proves to be a boon for LNKD shareholders!”.

In addition to the issues identified in the “limitations” section, there are other confounding factors to consider:

  1. The companies all had data breaches.  I know that sounds dull, but consider running a study of people who lost weight, and only including in the study people who are leaving a local gym in the evening.  Do companies that experience data breaches have some other attributes in common, such as weak leadership or a culture of accepting too many risks?  Might these factors also manifest themselves in bad decisions in other aspects of the business that might result in a declining stock price?  We don’t actually know, because the only way to know for sure is through experiments that would be highly unethical, even if immensely fun.
  2. Averages don’t work very well for small data sets.  Consider the following situation:
    • Company A, B, C, and D all suffer a data breach on the same day
    • Company A, B, and C all see their stock rise by 2% the week after their respective breaches
    • Company D sees its stock decline by 20% the week after its breach
    • The average decline for this group of companies is 3.5% the week after their breaches.  But that doesn’t tell the whole story, does it?
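
Working the four-company example through makes the point concrete.  This short Python sketch uses only the figures from the scenario above; the mean comes out to a 3.5% decline even though three of the four stocks rose, while the median shows what happened to the typical company.

```python
from statistics import mean, median

# Week-after price changes, in percent, for companies A, B, C, and D.
changes = [2.0, 2.0, 2.0, -20.0]

average_change = mean(changes)    # dragged down by the single outlier
typical_change = median(changes)  # what most of the companies experienced

print(f"mean:   {average_change:+.1f}%")
print(f"median: {typical_change:+.1f}%")
```

With so few data points, one outlier decides the headline number, which is why a study built on averages of a small group deserves a close look.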

I’m not saying that breaches don’t cause stock prices to decline.  I am saying that I’ve not yet seen good evidence for that, and that is because we can’t fork the universe and run experiments on alternate realities and compare the results.  If we could, this would not be among the first experiments I’d propose.

Like a good Ponemon study, this study is great fodder for executive meetings, but beware that you are not on firm ground if you get challenged.  As an anecdote, I used to be a pretty active investor, and while I did not get the Ferrari, I did learn a few things:

  • I am apparently supposed to buy low and sell high, not the other way around
  • Breaches, from a pure investor standpoint, are generally viewed as a one-time charge, and (generally) do not change the underlying fundamentals of the company.  When investing in a company, it’s the fundamentals that matter, such as: are their sales going up and cost of sales going down?

 

Next is a story about a study that indicates 90% of retailers “fail PCI”.

Here is the story: https://www.infosecurity-magazine.com/news/over-90-of-us-retailers-fail-pci/

Here is the study: https://explore.securityscorecard.com/rs/797-BFK-857/images/2018-Retail-Cybersecurity-Report.pdf

Unfortunately, the authors of this report don’t give a description of the limitations, but I think we can infer a lot about the limitations based on the type of testing this organization performs to gather the data.  That company gathers and collates open source intelligence, seemingly similar to what other players like BitSight are doing.  I would assert that the report finds that retailers are among the worst industries, based on the data this organization gathered, at patch management.  Without knowing the details of each company in the study, we can’t know whether the environment analyzed was part of the PCI DSS Cardholder Data Environment (CDE) for a given retailer.  Asserting that an organization that seemingly must comply with PCI DSS is violating its obligations, based on a review of the organization’s “digital footprint”, is not appropriate.  I am not defending the organizations’ lack of patching, just pointing out that patching all of an organization’s systems is not a PCI DSS requirement, though maybe it should be.

The downside of this sort of report is that it likely “normalizes” non-compliance with PCI DSS.  If I’m spending a tremendous amount of time, energy and money to keep my environment in the right shape for PCI, but then see that 90% of others in my sector are not doing this, how motivated will I or my management team be?  The “right” thing to do clearly doesn’t change, but this study changes our perception of what is going on in the world.

I had a math teacher in high school who told us to keep an open mind, but not so open that people throw their trash in.  Remember to maintain a healthy level of skepticism when reading infosec articles, reports, and studies… And yes, even blog posts like this one.

A Compelling Case For DevOps?

Since I first learned of its existence, I have had a mental image of devops that, to me, looks like a few classes of sugar-laced kindergartners running around the playground with scissors, and I was certain that someone would end up hurt pretty badly.  While I certainly think there is an opportunity for bad behavior, like using devops as purely a cover to reduce costs, resulting in important steps being skipped, the recent spate of vulnerabilities with Apache Struts has me wondering if NOT going the devops direction is the more risky path.

Traditionally, business applications that use components like Apache Struts tended to be pretty important to operations, and therefore changes are very measured – often allowing only a few change windows per year.  Making so few changes per year causes a few problems:

  1. When a critical vulnerability is announced, like we have with Struts, the next change window may be a long way off.  Performing an interim change is politically difficult, so waiting becomes the path of least resistance.
  2. Application teams make changes to the application environment so infrequently that testing plans may not be well refined, making a delay until the next change window the most appealing plan.

In our current world, we need the agility and confidence to rapidly address critical fixes, like we continue to see with Struts, despite the complexity of environments that Struts tends to be part of.