Improving The Effectiveness of Vulnerability Remediation Targeting

Many organizations seem to apply a sensible heuristic to patching: patch the most exposed and most valuable systems first, then work downward in order of exposure and importance.  The heuristic usually looks something like this:

  1. Internet facing systems – patch first
  2. Critical internal production systems – patch second
  3. Other production systems – patch third
  4. Development, test, and other lab systems – patch last

Workstation patching usually ends up in there somewhere, but it is typically handled by a different team with different processes, making it somewhat orthogonal to this scenario.

The reason for this prioritization of patching is that most organizations don’t apply patches automatically to servers and other infrastructure.  Generally, even when automation is used, there is some amount of testing and sequencing applied to the process.  It makes sense to apply patches in a manner that reduces risk from greatest to least.

I’ve noticed a potential problem with this strategy in the aftermath of the MS17-010 patch in March 2017, and again with the recent Microsoft Security Advisory 4025685.

First, organizations should assess whether a new vulnerability is wormable.  Generally, the condition for that is remote, unauthenticated code execution over the network on a system or service that is common enough for a worm to be a threat.

Second, some consideration for the attack vector should be factored in.  If, as was the case with MS17-010, the vulnerability is in SMB (TCP/445), but none of your Internet facing systems have TCP/445 exposed, prioritizing patches for Internet facing systems over other systems likely doesn’t make sense.  Patching the most critical systems that are the most exposed to that vulnerability should be the heuristic used.  That can be complicated, though.  In the case of something like an SMB vulnerability, the most exposed servers are likely going to be those servers that are accessed by the organization’s workstations via SMB.
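
To make the point concrete, here is a minimal sketch of the kind of exposure check I mean.  The host list is hypothetical, and a real assessment should be run from outside the perimeter against your actual inventory, but the idea is that patch priority should reflect exposure to the vulnerability's attack vector, not just "Internet facing" status:

```python
# A minimal sketch, assuming a hypothetical host list: test whether TCP/445
# answers on each host so that patch priority reflects actual exposure to the
# vulnerability's attack vector.
import socket

HOSTS = ["203.0.113.10", "203.0.113.20", "fileserver.example.com"]  # illustrative inventory
PORT = 445
TIMEOUT = 2.0  # seconds

def port_open(host: str, port: int, timeout: float = TIMEOUT) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for host in HOSTS:
    status = "OPEN - prioritize patching" if port_open(host, PORT) else "filtered/closed"
    print(f"{host}: TCP/{PORT} {status}")
```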

And certainly we should be proactively limiting our exposure by following more fundamental best practices, such as not permitting inbound TCP/445 access to systems and disabling SMBv1 in the case of MS17-010.

To sum up, a single heuristic model for prioritizing patches is sub-optimal when trying to reduce risk.  Some additional thought may be required.

P.S., MS Advisory 4025685 appears, to me anyhow, to have the potential to lead to some significant attacks in the near future.  Hopefully you are already limiting TCP/445 where possible and are in the process of applying the patches.

Limiting Lateral Movement Options With Port Isolation

I had a meeting with some network team members from a government entity recently.  They described a configuration where all of the network ports that workstations connect to are configured with port isolation, which prevents workstations, even on the same VLAN, from communicating with each other over the network.  This feature is available on most network switches.

There are not many use cases I am aware of where workstations need to connect directly to each other – at least not many that we want to encourage.  Isolating systems in this way seems like a good way to limit lateral movement.  With isolation in place, lateral movement is limited to systems that are “upstream”, creating a convenient opportunity to monitor for and detect such attacks.

I was initially thinking about this in the context of mitigating impact of network worms in the wake of WannaCry.  However, it seems like the utility in this extends far beyond just worms.

Never Let A Serious Cyber Crisis Go To Waste

For nearly a week now, a non-stop parade of news reports has berated the UK’s NHS, which temporarily suspended operations at some hospitals due to WannaCry infections, for its continued use of Windows XP.  An example is this one.  This article points out, and many other news stories also report, that Citrix obtained a response to a freedom of information (FoI) request indicating “that 90% of hospitals still had machines running on Windows XP”.  There is no indication of how big the problem really is, though.  If 90% of the hospitals had a single XP-based ATM, that would be reported the same as if each of those hospitals ran tens of thousands of XP systems.  Indeed, another report on the survey had this to say:

The trusts that provided details said Windows XP made up a small part of their overall PC estate — one said it was 50 out of 5,000 PCs, for example.

A bit of Googling reveals that Citrix sent an FoI request to 63 out of more than 200 NHS trusts and received responses from 42 (source).  The survey was almost certainly conducted by Citrix for marketing purposes, intended to help sell more of its products.  We should be wary of that stat.

Now come reports that XP systems really were not commonly infected.  Various discussions on Twitter even indicate that XP SP3 is not susceptible to WannaCry infections.

There are people who are clearly angry with the NHS for its continued use of XP, and that anger is probably well founded.  However, the angry stories tenuously linking XP in 90% of NHS hospitals to the service outages experienced at those hospitals completely miss the point: if we want to beat up on the NHS about something related to its WannaCry woes, we should go after its failure to patch Windows 7 and/or Windows Server 2008 in a timely manner.

A Chaos Monkey For Nuking Vulnerable Systems

There has been a lot of good discussion and debate on Twitter around a hypothetical “whitehat worm” that simply applies the MS17-010 patch on vulnerable systems.  This is one of the more popular threads I’ve seen:

The consensus is that it’s a terrible idea and is unethical.  That seems like the right position to take, but allow me to take the other side for the sake of discussion.  Actually, not just the other side.

Much has been written about Netflix’s Chaos Monkey, which grew into an entire Simian Army.  Chaos Monkey is designed to randomly shut down systems in Netflix’s environment, forcing developers and administrators to ensure the software and environment are robust enough to handle faults.  Hope is not a strategy; the Chaos Monkey cometh.

The main criticism people seem to be levying is that organizations should be permitted to run out-of-date software if they choose, without worrying about some do-gooder coming along and applying a patch.  Also, in the case of the WannaCry ransomworm, a number of UK hospitals were apparently all but brought to a standstill, with some doctors allegedly expecting the unavailability of systems and data to cost some number of patient lives.

We hear about the woes of the health care field – forced to buy devices that are “stuck” on particular versions of operating systems, with little hope of updating.  Similarly, there are anecdotes of organizations unable or unwilling to apply patches on more traditional IT systems due to concerns about stability, compatibility, or possibly just labor costs.  In an interconnected world, the “bad behavior” of one organization doesn’t just impact that organization and its constituents.  It potentially impacts many others, even the Internet itself, through events like the Mirai botnet attack on Dyn back in 2016.

When systems are infected or compromised, the prudent recommendation is to reimage/rebuild/restore, and for good reason.  We typically can’t pragmatically ensure that an infected or compromised system is “clean” without reverting to a known good state.  Sadly, many organizations don’t do that, and sometimes get bitten a second or third time by the same actor or malware.

Consider this hypothetical situation:

Rather than a “white hat” worm that silently attempts to apply a patch on vulnerable systems, some chaotic neutral person/group releases a worm that wrecks the operating system and then shuts the system down.  The system can’t be restarted without some kind of a rebuild at this point.  This situation basically turns the potential downside of the “white hat” worm into a sure thing.  There is now no question that business systems will be damaged, that hospital systems will be shut down, and that people will possibly die.

But here’s my question: think of the group releasing this worm as an Internet-wide “chaos monkey”.  How quickly would IT behaviors change?  Would they change?  Would vendors act differently?

A Modest Proposal To Reduce Password Reuse

Many of us are well aware of the ongoing problem of password reuse between online services.  Billions of account records, including clear text and poorly hashed passwords, are publicly accessible for use in attacks on other services.  Verizon’s 2017 DBIR noted that operators of websites using standard email address and password authentication need to be wary of the impact that breaches of other sites have on their own site, due to the extensive problem of password reuse.  The authors of the DBIR, and indeed many in the security industry including me, recommend combating the problem with two-factor authentication.  That is certainly good advice, but it’s not practical for every site and every type of visitor.

As an alternative, I propose that websites begin offering randomized passwords to those creating accounts.  The site can offer the visitor an opportunity to easily change that password to something of his or her choosing.  Clearly this won’t end password reuse outright, but it will likely make a substantial dent in it without much, if any, of the additional cost or complexity associated with two-factor authentication.  An advantage of this approach is that it allows “responsible” sites to minimize the likelihood of accounts on their own site being breached by attackers using credentials harvested from other sites.
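
The mechanics of issuing a randomized starting password are trivial.  Here is a minimal sketch using Python’s secrets module; the length and alphabet are arbitrary illustrative choices, not a recommendation:

```python
# A minimal sketch: generate a cryptographically random initial password that
# the site assigns at signup and the user may later change. Length and
# alphabet are arbitrary illustrative choices.
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def generate_initial_password(length: int = 16) -> str:
    """Return a random password drawn from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

if __name__ == "__main__":
    print(generate_initial_password())  # e.g. 'pG7rQk2ZbX9mT4Ls'
```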


What are your thoughts?

Paying Attention To Infosec Report Statistics

I read a lot of IT security reports, both to keep informed for my job and to discuss on the security podcast I co-host.  These reports typically detail important attack trends, which is useful to those of us who need to defend our employers’ systems by prioritizing the finite resources we have.  We have to be careful when we read these reports, however, to understand their limitations and to recognize that reports are sometimes in direct conflict with each other.

For example, we discussed the DTEX Systems Insider Threat Intelligence Report on episode 189 of the podcast.  This report had an important finding:

People are the weakest security link — 60 percent of all attacks are carried out by insiders. 68 percent of all insider breaches are due to negligence, 22 percent are from malicious insiders and 10 percent are related to credential theft. Also, the current trend shows that the first and last two weeks of employment for employees are critical as 56 percent of organizations saw potential data theft from leaving or joining employees during those times.

60% of attacks are carried out by insiders.  That matches the intuition many of us have about security threats to our organization.  That sort of data is very helpful in prioritizing security investments.  From this report, I might want to invest in systems to more closely monitor employee behavior, or implement new separation of duties controls into processes, or improve background checks.  Even DTEX themselves coincidentally make a product that helps with monitoring employee activity.

Then the venerable Verizon Data Breach Investigations Report (DBIR) comes out.  The DBIR includes this nice graphic:

That’s right, 75% of breaches are perpetrated by outsiders.  How do I reconcile the two very different conclusions?  I fear that it’s not possible.  Both reports have biases.  The DTEX report indicates its data comes from analyzing risk assessments of 60 companies.  The data being analyzed appear to be limited to clients of DTEX.  Possibly all or most of those companies had a pervasive insider threat problem and brought DTEX in to help, and so the DTEX report is based on a pool of companies that self-selected with higher-than-average insider threat problems.  Or possibly it’s the “when you’re a hammer, everything looks like a nail” syndrome.


On the DBIR side, the opposite may be true.  The DBIR data comes from CERTs and many other incident responders.  It is possible that breaches arising from insiders are often not referred to outside help.  That certainly has been my experience over the past few decades.  Many firms do not want to air their internal dirty laundry, choosing instead to handle the investigation and any punishment as an internal matter, particularly if an employee improperly accessing data does not create a reportable breach.  If this is true, then the DBIR data would be skewed toward external sources of breach.  There is an array of other potential confounding factors that could explain the differences.  Another notable hypothesis is that credential theft is a common method of entry for external actors and that the DBIR categorizes by the actual actor, not the person whose credentials were stolen.  If the DTEX report did not account for this, then it’s possible that some percentage of the 60% of insider attacks are, in fact, outsiders using the credentials of an insider.


I am not intending to detract from either report.  I believe that the more data we have, the better off we will be, so long as we understand the limitations of what the data can tell us.  Hopefully you ask yourself questions about how much you can infer from the data when reading these reports; however, sometimes it is not obvious that there is a problem until you compare two reports side by side and see the differences.  The right way to read the DTEX report, in my view, is to add the words “Of DTEX customers, 60% of all attacks are carried out by insiders”, and similarly for the DBIR, “Of breaches investigated by participating partners, 75% of breaches are perpetrated by external actors.”


The cynic in me says that the widely divergent findings of these breach reports are not unwelcome to IT security leaders.  As it stands today, I can find a published report to help me justify just about anything I may want to invest in.  If I want to invest in additional malware controls, like whitelisting, I am going to reference the DBIR in my budget requests.  If I want to invest in monitoring the activities of my employees, I’ll reference the DTEX report.  And so it goes.  There are dozens of reports from different vendors with different angles and different findings.


For those of us who are looking for an unbiased view of the threat landscape to help with investment planning, the divergent findings of various reports make for a tough road.

What is with the DoublePulsar hoopla?

During the previous week, a number of security researchers took to scanning the Internet for Windows systems that lack the MS17-010 patch and are infected with the NSA’s DoublePulsar implant.  Results from various researchers varied from very few up to tens of thousands.  This article from Bleeping Computer indicates about 30,000 systems on the Internet are infected with the implant.

DoublePulsar is a non-persistent piece of malware that hooks into the SMB stack on infected systems, intercepting specially crafted SMB packets to run whatever code is sent to it.  This sort of thing is important in the context of a spying operation, where the objective is to blend in with the background and not raise suspicion.  Here is a great technical write up on DoublePulsar, in case you are interested in that sort of thing.

Here’s where I will probably get you shaking your fist at me: DoublePulsar is not the problem here.  Counting DoublePulsar-infected systems is interesting, but really isn’t that informative.  The reboot after applying the MS17-010 patch drops DoublePulsar like a bad habit.  A system that is vulnerable to MS17-010 is susceptible to all manner of malware infections.  DoublePulsar itself is just a means by which to run other commands on an infected system and does not have its own nefarious instructions.  Metasploit, among many other offensive tools, has support for MS17-010, which allows implanting arbitrary payloads.  Here is a video of someone using MS17-010 to install meterpreter on a vulnerable system.

In my view, vulnerable Windows systems exposed to the Internet after April 14 are likely infected by something and should be rebuilt.

One final note: WHY, OH WHY, are there over 5.5 million Windows systems with port 445 exposed to the Internet?

Regulation, Open Source, Diversity and Immunity

When the Federal Financial Institutions Examination Council released its Cybersecurity Assessment Tool in 2016, I couldn’t quite understand the intent behind open source software being called out as one of the inherent risks.

Recently, I was thinking about factors that likely impact the macro landscape of cyber insurance risk.  By that I mean how cyber insurers would go about measuring the likelihood of a catastrophic scenario that harmed most or all of their insured clients at the same time.  Such a thing is not unreasonable to imagine, given the homogeneous nature of IT environments.  The pervasive use of open source software, both as a component in commercial and other open source products and directly by organizations, expands the potential impact of a vulnerability in an open source component, as we saw with Heartbleed, Shellshock and others.  It’s conceivable that all layers of protection in a “defense in depth” strategy contain the same critical vulnerability because they all contain the same vulnerable open source component.
In a purely proprietary software ecosystem, it’s much less likely that software and products from different vendors will all contain the same components, as each vendor writes its own implementation.  This creates more diversity in the ecosystem, making it less likely that a single exploit impacts many products at once.  I don’t mean to imply that proprietary is better, but it’s hard to work around this particular aspect of risk given the state of the IT ecosystem.
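
As a toy illustration of that homogeneity problem, imagine an inventory mapping each layer of a defense-in-depth stack to the OpenSSL version it bundles.  The sketch below (the inventory is entirely made up) flags every layer sharing the Heartbleed-affected 1.0.1 through 1.0.1f range:

```python
# A toy sketch with a made-up inventory: flag every layer of the stack that
# bundles an OpenSSL version in the Heartbleed-affected range (1.0.1-1.0.1f).
AFFECTED = {"1.0.1" + suffix for suffix in ["", "a", "b", "c", "d", "e", "f"]}

inventory = {
    "perimeter-firewall": "1.0.1e",
    "load-balancer": "1.0.1f",
    "web-server": "1.0.1f",
    "vpn-appliance": "0.9.8y",  # not in the affected range
}

shared_exposure = {name: ver for name, ver in inventory.items() if ver in AFFECTED}
print("Layers sharing the vulnerable component:", shared_exposure)
```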
I don’t know if this is why the FFIEC called out open source as an inherent risk.  I am hopeful their reasoning is similar to this, rather than some assumption that open source software has more vulnerabilities than proprietary software.

Asymptotic Vulnerability Remediation

I was just reading this story indicating that there are still close to 200,000 web sites on the Internet that are vulnerable to Heartbleed, and recalled the persistent stories of decade-old malware still turning up in honeypot logs on the SANS Internet Storm Center podcast.  It seems that vulnerability remediation must follow an asymptotic decay over time.  This has interesting implications when it comes to things like vulnerable systems being used to build botnets and the like: no real need to innovate if you can just be the Pied Piper to the many long tails of old vulnerabilities.
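
One way to make that intuition concrete is a decay-toward-a-floor model.  Every number below is an assumption for illustration, not a measurement:

```python
# An illustrative model (every number here is an assumption): the patchable
# portion of the population decays exponentially, but a stubborn floor of
# abandoned or unmanaged systems never gets remediated.
import math

N0 = 600_000         # assumed initially vulnerable population
FLOOR = 180_000      # assumed residue that will never be patched
HALF_LIFE_DAYS = 60  # assumed half-life of the patchable portion
LAM = math.log(2) / HALF_LIFE_DAYS

def still_vulnerable(days: float) -> float:
    """Remaining vulnerable hosts after `days`, under the toy model."""
    return FLOOR + (N0 - FLOOR) * math.exp(-LAM * days)

for years in (0.25, 0.5, 1, 2, 3):
    print(f"after {years:>4} year(s): ~{still_vulnerable(years * 365):,.0f} hosts")
```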

Also interesting to note is that 75,000 of the vulnerable devices are on AWS.  I wonder if providers will, at some point, begin taking action against wayward hosting customers who are potentially putting both their platform and reputation at risk.

I’m also left wondering what the story is behind these 200,000 devices: did the startup go belly up? did the site owner die? is it some crappy web interface on an embedded device that will never get an update again?

#patchyourshit

What Does It Take To Secure PHI?

I was reading an article earlier today called “Why Hackers Attack Healthcare Data, and How to Protect It” and I realized that this may well be the one-thousandth such story I’ve read on how to protect PHI.  I also realized that I can’t recall any of the posts I’ve read being particularly helpful: most contain a few basic security recommendations, usually aligned with the security offerings of the author’s employer.  It’s not that the authors of the posts, such as the one I linked to above, are wrong, but if we think of defending PHI as a metaphorical house, these authors are describing the view they see when looking through one particular window of the house.  I am sure this is driven by the need for security companies to publish think pieces to help establish credibility with clients.  I’m not sure how well that works in practice, but it leaves the rest of us swimming in a rising tide of fluffy advice posts proclaiming to have the simple answer to your PHI protection woes.

I’m guessing you have figured out by now that this is bunk.  Securing PHI is hard and there isn’t a short list of things to do to protect PHI.  First off, you have to follow the law, which prescribes a healthy number of mandatory, as well as some addressable, security controls.  But we all know that compliance isn’t security, right?  If following HIPAA were sufficient to prevent leaking PHI, then we probably wouldn’t need all those thought-leadership posts, would we?

One of the requirements in HIPAA is to perform risk assessments.  The Department of Health and Human Services has a page dedicated to HIPAA risk analysis.  I suspect this is where a lot of organizations go wrong, and probably the thing that all the aforementioned authors are trying to influence in some small way.

Most of the posts I read talk about the epidemic of PHI theft, and PHI being sold in the underground market, and then focus on some countermeasures to prevent PHI from being hacked.  But let’s take a step back for a minute and think about the situation here.

HIPAA is a somewhat special case in the world of security controls: its requirements are pretty prescriptive and apply uniformly.  But we know that companies continue to leak PHI.  We keep reading about these incidents in the news and reading blog posts about how to ensure our firm’s PHI doesn’t leak.  We should be thinking about why these incidents are happening to help us figure out where we should be applying focus, particularly in the area of the required risk assessments.

HHS has a great tool to help us out with this, lovingly referred to as the “wall of shame”.  This site contains a downloadable database of all known PHI breaches of 500 or more records, and there is a legal requirement to report any such breach, so while there are undoubtedly yet-to-be-discovered breaches, the 1800+ entries give us a lot of data to work with.

Looking through the data, it quickly becomes apparent that hacking isn’t the most significant avenue of loss.  Over half of the incidents arise from lost or stolen devices, or paper/film documents.  This should cause us to consider whether we encrypt all the devices that PHI can be copied onto: server drives, desktop drives, laptop drives, USB drives, backup drives, and so on.  Encryption is an addressable control in the HIPAA regulations, and one that many firms seemingly decide to dance around.  How do I know this?  It’s right there in the breach data.  There are tools, though expensive and onerous, that can help ensure data is encrypted wherever it goes.
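
If you want to verify this yourself, the tally is only a few lines against the portal’s CSV export.  The filename and the “Type of Breach” column name below are assumptions from memory, so adjust them to match the actual download:

```python
# A rough sketch of tallying the HHS "wall of shame" export by breach type.
# The filename and column name are assumptions; check them against the actual
# CSV downloaded from the HHS breach portal.
import pandas as pd

df = pd.read_csv("breach_report.csv")
by_type = df["Type of Breach"].value_counts()  # e.g. Theft, Loss, Unauthorized Access/Disclosure, Hacking/IT Incident
print(by_type)
print((by_type / len(df)).round(2))            # share of all reported incidents
```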

The next most common loss vector is unauthorized access, which includes misdirected email, physical mail, leaving computers logged in, granting excessive permissions, and so on.  No hacking here*, just mistakes and some poor operational practices.  Notably, at least 100 incidents involved email, presumably misdirected email.  There are many subtle and common failure modes that can lead to this, some as basic as email address auto-completion.  There likely is not a single best method to handle this – anything from an email DLP system quarantining detected PHI transmissions for secondary review to disabling email address auto-completion may be appropriate, depending on the operations of the organization.  This is an incredibly easy way to make a big mistake, and it deserves some air time in your risk assessments.
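
As a toy illustration of the DLP idea, the sketch below flags outbound messages containing SSN-like patterns for secondary review.  Real products use far richer detection (medical record numbers, names paired with diagnoses, confidence scoring); this only shows the shape of the control:

```python
# A toy DLP-style check: hold outbound messages that contain SSN-like patterns
# for human review. The regex is deliberately simplistic; real email DLP
# products detect far more PHI indicators with far more nuance.
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def needs_review(message_body: str) -> bool:
    """Return True if the message should be held for secondary review."""
    return bool(SSN_PATTERN.search(message_body))

print(needs_review("Patient follow-up notes attached."))              # False
print(needs_review("SSN on file: 123-45-6789, see attached chart."))  # True
```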

The above loss types make up roughly 1500 of the 1800 reported breaches.

Now, we get into hacking.  HHS’ data doesn’t have a great amount of detail, but “network server” accounts for 170 incidents, which likely make up the majority of the situations we read about in the news.  There are 42 incidents each involving email and PCs.  Since there isn’t a lot of detail, we don’t really know what happened, but we can infer that most PC-related PHI leaks were from malware of some form, and most network server incidents were from some form of actual “hacking”.  The Anthem incident, for example, was categorized as hacking on a network server, though the CHS breach was categorized as “theft”.

Dealing with the hacking category falls squarely into the “work is hard” bucket, but we don’t need new frameworks or new blog posts to help us figure out how to solve it.  There’s a great document that already does this, which I am sure you are already familiar with: the CIS Top 20 Critical Security Controls.

But which of the controls are really important?  They all are.  In order to defend our systems, we need to know what systems we have that contain PHI.  We need to understand what applications are running, and prevent unauthorized code from running on our devices storing or accessing PHI.  We need to make sure people accessing systems and data are who they say they are. We need to make sure our applications are appropriately secured, and our employees are trained, access is limited properly, and all of this is tested for weakness periodically.  It’s the cost of doing business and keeping our name off the wall of shame.

* Well, there appears to be a small number of miscategorized records in the “theft” category, including CHS, and a few others involving malware on servers.