NCSAM Day 9: The Cloud Isn’t A Magical Place

Traditional IT environments generally required the coordination of different people and teams to turn on a new service.  There might have been a datacenter person, a network person, a server person, a firewall person, and an application person involved, each playing a part to install a new server, connect it to the network, install and configure the operating system, install and configure the application, and finally expose the application through the firewall.  Some of those functions were consolidated into the same person or team, but in most cases, each function felt ownership of its role and generally had a set of guidelines and some level of competence, including knowing what questions to ask and when to push back if something seemed too risky about a planned deployment.

All of this necessarily added up to delays and inefficiencies.  Reducing or eliminating these delays is one of the many benefits that cloud computing offers: we no longer need to rack servers; installing operating systems is automated through orchestration tools; the provider offers an easy-to-configure software-defined network; and so on.  The move to cloud reduces or eliminates many of the IT specializations, like sysadmin, network engineer, or firewall engineer.  In the cloud, those functions no longer exist as distinct specialties, and depending on the way in which cloud is used (for example, cloud native versus rehoming server images to the cloud), they simply may not be required at all.

The cloud isn’t magical, though: it still requires good security practices, and those practices must now happen without the watchful eye of the delay-inducing specialists.  Many organizations that successfully adopt the cloud, and related practices such as devops, do so by using scripted processes designed to ensure environments are created, configured, and managed in a secure(ish) manner.
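To make that concrete, here’s a small sketch of the kind of scripted guard rail I mean, assuming AWS and its boto3 Python SDK (the region and the “open to the world” check are just illustrations; adapt them to your provider and your policy):

```python
import boto3  # AWS SDK for Python; assumes credentials are already configured

def audit_open_security_groups(region="us-east-1"):
    """Flag security group rules that are open to the entire Internet."""
    ec2 = boto3.client("ec2", region_name=region)
    for group in ec2.describe_security_groups()["SecurityGroups"]:
        for rule in group["IpPermissions"]:
            for ip_range in rule.get("IpRanges", []):
                if ip_range.get("CidrIp") == "0.0.0.0/0":
                    print(f"OPEN TO THE WORLD: {group['GroupId']} "
                          f"({group['GroupName']}) "
                          f"ports {rule.get('FromPort', 'all')}-{rule.get('ToPort', 'all')}")

if __name__ == "__main__":
    audit_open_security_groups()
```

Run on a schedule, a check like this stands in for the firewall person who used to ask “are you sure you want that open?”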

All this despite most cloud providers’ claims that their cloud is “secure”.  Hopefully it’s apparent what the providers mean, and what they don’t:  generally, “secure” refers to the components of the cloud infrastructure that the provider is responsible for managing (the provider’s half of the shared responsibility model), and it is understood that the cloud consumer is responsible for managing and securing everything else, which is quite a lot.

Embracing cloud isn’t just about saving capital expenses and laying off administrators.  The agility and speed require even tighter processes than traditional IT, but those processes can hopefully be scripted, automated, and orchestrated.  An organization moving to the cloud needs to invest in the right skills and tools to keep the environment secure.  Unfortunately, those skills are in high demand right now, but that is the tradeoff.

NCSAM Day 8: Work on your policies

In many organizations, security policies and standards are unapproachably long and complex, or are so high-level that the reader must be a security expert to fill in the missing details.  Security policies, standards, processes, and procedures must be written for the people who need to follow, implement, and interpret them, not for the people who write them.  These documents need to clearly define expectations and outcomes in a way that can be understood and implemented.

For example, a policy might state “You may not copy files containing company confidential information to USB drives.”

But, what about copying those files to other types of devices, like a home NAS drive that is exposed to the Internet?  Or someone’s clever home-brew cloud backup system using an unsecured S3 bucket? Or a cell phone via Bluetooth?  And how should employees legitimately back up their data?  What happens when they need to copy confidential files to a USB drive?  Do they get to figure out the proper controls to apply?

This extends to policies that apply to IT and infosec teams, too.  Define the desired outcomes and the guard rails that need to be applied, at the appropriate level of specificity for the type of documentation (policy, process, procedure, and so on); ensure employees are familiar with those documents; provide help interpreting the requirements for edge cases; and fold any lessons learned back into policy enhancements and FAQs.

NCSAM Day 7: Monitor Those AV Logs

Much of the security industry is pretty down on anti-virus, and for good reason: it’s not very effective at blocking many malware infections.  When installed, though, it is a tool in the toolbox and can be quite valuable.  One knock against AV is that it’s not always a great tool to monitor, because if it can detect malware, it can probably block it.  As with many things, though, context is important.

For example, if your AV product detects and blocks an attempted infection on a workstation, that might be interesting, but it likely will not result in any kind of investigation, leading one to question why AV logs should be monitored at all.  But if that detection happens as the result of a full scan, then depending on what was detected and where, some investigation to find out what happened, or a wipe and reinstall of the system, is likely in order.

The story is a bit different on servers: if a server’s AV detects malware, regardless of when it was detected, investigation is likely warranted, since servers should generally not encounter malware; if they do, something is wrong in the environment and should be investigated.  File servers are different still, since endpoints can and will copy malware-laden files onto a file server, and that does not indicate that the file server itself is “under attack”; however, such events should still be investigated to find and address the culprit.

I once worked on an incident where a web server was compromised.  In the analysis, we could see that the adversary had found a file upload vulnerability and a separate local file inclusion vulnerability in the web application on the server.  Upon inspecting the AV logs, we found that the AV engine had dutifully detected and quarantined various versions of the web shell the adversary was uploading over several days.  Eventually, the adversary found a web shell that the AV engine didn’t detect, and the rest is history.

In summary, collect your AV logs and apply some form of analysis to them.  AV is far from perfect, but it does work at times, and we should pay attention when it does.
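As a sketch of what that analysis might look like, here’s the triage logic from above expressed in Python (the event fields and role labels are hypothetical; map them to whatever your AV product actually logs):

```python
from dataclasses import dataclass

@dataclass
class AvEvent:
    host: str
    host_role: str       # "workstation", "server", or "file_server" (hypothetical labels)
    detection_type: str  # "real_time" or "full_scan"

def triage(event: AvEvent) -> str:
    """Apply the context rules above to decide what an AV detection warrants."""
    if event.host_role == "server":
        return "investigate"             # servers should generally not encounter malware
    if event.host_role == "file_server":
        return "trace_source"            # find and address the endpoint that wrote the file
    if event.detection_type == "full_scan":
        return "investigate_or_reimage"  # something slipped past real-time protection
    return "log_only"                    # a real-time block on a workstation

print(triage(AvEvent("web01", "server", "real_time")))  # -> investigate
```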

NCSAM Day 6: Defend Your Tools

As IT continues to commoditize and organizations drive more efficient operations, IT and security departments continue to implement and rely on automation, management, and orchestration tools.  Many of these tools exist to manage or enhance security, yet the tools themselves may not be properly protected.  Tools like Chef, Puppet, Ansible, Vagrant, vulnerability scanners, Active Directory, and many others can provide one-stop shopping for an adversary to compromise an environment, due to the functionality and level of access these tools have in an IT environment.  Fortunately, we’ve not yet seen widespread exploitation of these tools, but it does happen, and I expect they will become an increasingly important target for adversaries, and likely even for automated malware attacks.

The environments these tools operate in need to be resilient against attack.  Here are some guidelines for doing so:

  • Require multi-factor authentication to the operating system and any applications
  • Dedicate the system to the function
  • Prevent inbound and outbound Internet access from the servers these systems run on, and limit inbound traffic to authorized management hosts only. Inbound and outbound traffic should be allowed, as necessary, only to those devices the system needs to connect to as part of the application’s functionality or to retrieve software updates, and only on the network ports required.  I *strongly* recommend such systems NOT be managed by Active Directory.
  • Monitor the systems and applications for any evidence of compromise, including file integrity monitoring and/or whitelisting, A/V logs, and firewall logs – particularly looking for unexpected inbound or outbound connection attempts.
  • Workstations that administrators use to manage these tools must similarly be secured, including:
    • Dedicating the workstation to the purpose of administering these tools – no email access, web access, Office applications, and so on.
    • Blocking inbound and outbound Internet access from these workstations.
    • Blocking ALL inbound network traffic.
    • Limiting outbound connections to only the systems being managed and those needed to retrieve software updates.
    • Monitoring the workstations and their applications for any evidence of compromise, as above: file integrity monitoring and/or whitelisting, A/V logs, and firewall logs – particularly looking for unexpected inbound or outbound connection attempts.

This all may seem like overkill, but consider the level of access these systems have and the destruction an adversary can cause by abusing them.
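To illustrate the firewall log monitoring recommended above, here’s a minimal sketch that flags connection attempts involving unexpected peers (the networks and the log format are hypothetical; adapt the parsing to your firewall’s actual export format):

```python
import ipaddress

# Hypothetical allowlist: authorized management hosts and managed systems only
ALLOWED_PEERS = [ipaddress.ip_network(n) for n in (
    "10.10.5.0/28",   # admin workstations dedicated to managing these tools
    "10.20.0.0/24",   # systems under management
)]

def is_expected(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWED_PEERS)

def unexpected_connections(log_lines):
    """Yield log entries where either endpoint falls outside the allowlist.
    Assumes a simple space-delimited format: action src_ip dst_ip dst_port."""
    for line in log_lines:
        action, src, dst, port = line.split()
        if not (is_expected(src) and is_expected(dst)):
            yield f"UNEXPECTED: {action} {src} -> {dst}:{port}"

for alert in unexpected_connections(["ALLOW 10.10.5.2 203.0.113.9 443"]):
    print(alert)
```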

NCSAM Day 5: Wipe that Drive

Despite our best attempts to prevent it, malware infections happen.  When they do, we need to respond appropriately to prevent the problem from becoming worse.  In my experience, many IT personnel do not understand infections and compromises very well, and often employ very basic response techniques, such as relying on antivirus scans, or the ever-popular Malwarebytes scan.  Apparently nothing can evade Malwarebytes.  (Side note: despite my cynical tone, I think Malwarebytes is very good, and I pay to run it on all of my and my family’s laptops, but it’s not perfect.)


Depending on the nature of the infection (a subsequent post will cover this), the only sure way to remove the infection is to wipe the drive and perform a reinstall.  Malware authors and intruders can employ a wide range of techniques to maintain persistence, even if the malware itself is removed.  These persistence mechanisms can reinfect the system with the same or new malware, provide other forms of access to an adversary, or destroy data.


For this reason, the only effective way to “clean” an infected system is wiping the drive and reinstalling the OS, applications, and data from a backup.  It’s important for IT staff to understand this nuance and treat infections with the proper diligence.  There are emerging techniques that can alter hardware components, such as UEFI and drive firmware, which may render even a wipe and reinstall ineffective, but fortunately these techniques are not yet common.


In summary: train your IT organization on the appropriate response to malware infections, which should start with disconnecting the system from the network, then may include making a forensic copy of the affected system and its memory, and finally should generally conclude with the affected system being wiped and reinstalled.


NCSAM Day 4: Understanding Lateral Movement Opportunities

As discussed previously, lateral movement is an important technique for many adversaries.  I previously described using port isolation, but there are many more avenues for lateral movement, particularly between servers, where port isolation may not be possible, and between systems that need to talk to each other over the network.

In the aftermath of one particularly bad breach, the IT team for the organization I was helping did not understand the potential problem that can arise from placing an Active Directory domain controller on an external DMZ network.  The placement of this device brought all of the benefits of AD, like single sign-on, ID deactivation, privilege assignment, and so on.  But it also required certain network ports to be opened to other domain controllers on the organization’s internal networks.  Once a server in the external DMZ was compromised, the adversary obtained administrative access that allowed connection to the domain controller on the same network; the credentials obtained from that domain controller, combined with the network access to the other domain controllers, allowed complete compromise of the internal network.

There are many such examples where we implement a control intended to provide some security benefit, but it instead creates a means for lateral movement.  Another example is using Citrix servers as a gateway between trusted and untrusted networks.  While a compromised Citrix server may seem like a benign thing from the perspective of a workstation connecting to it, adversaries can propagate to connecting workstations through the connected workstation’s drive mappings.

The net point is this: look at all the places that serve as a demarcation point between different zones of trust, like the firewall separating the DMZ from the internal network, or the Citrix server separating an untrusted network from a trusted one; work to identify the means by which an adversary could move through the boundary; and then implement an appropriate fix to address that lateral movement opportunity, if one exists.
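One way to start that review is mechanical: rank your zones by trust, then flag every rule that allows traffic from a less trusted zone into a more trusted one.  A minimal sketch, assuming a hypothetical rule export and zone ranking:

```python
# Hypothetical trust ranking; higher number = more trusted
TRUST = {"internet": 0, "dmz": 1, "internal": 2}

def lateral_movement_candidates(rules):
    """Flag firewall rules that allow traffic from a less-trusted zone into a
    more-trusted one; each is a boundary crossing worth scrutinizing."""
    for rule in rules:
        if TRUST[rule["src_zone"]] < TRUST[rule["dst_zone"]]:
            yield rule

rules = [
    {"src_zone": "dmz", "dst_zone": "internal", "port": 445, "why": "AD"},
    {"src_zone": "internal", "dst_zone": "dmz", "port": 443, "why": "app"},
]
for r in lateral_movement_candidates(rules):
    print("Review:", r)
```

Each hit is not necessarily wrong – the DMZ domain controller rules above were deliberate – but each one is a crossing an adversary can potentially ride.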

NCSAM Day 3: Watch What You Write

Many of us in the cyber security field came up through IT administration roles.  We are often troubleshooters at heart.  We look at a problem and try to figure out what the cause is to get things back up and running.  In the security world, these are useful traits…  When responding to a security incident, we are generally inhibited by the “fog of war” – we have incomplete information about what is happening and are forced to hypothesize about potential causes, sources, actors, and so on.  As we learn more about the situation, our hypotheses are refined until we know who did what, where, and how.

These skills are vital, but they can also cause problems for your organization if you are not careful.  Sometimes a security incident turns out to be a breach where data is stolen or destroyed, and that can lead to legal actions – either by government agencies or by customers, employees, or others that may be impacted by the incident.  The things we write, particularly in emails, text messages, and so on, may be discoverable in court, and our words used against our organizations.  Investigative discussion conducted over email, for example, may include speculation about the cause or extent of the incident, and that may turn out to be wrong, or at least an incomplete picture of the situation.  If I’m working with a team to investigate a compromised server we just learned about, I might be inclined to email an update saying “You know, I’ll bet that server team forgot to patch Apache Struts again.  That team is very understaffed and keeps missing patches.”  Hopefully, it’s not hard to see how that statement could be used as evidence that we not only knew about the problem, we actually caused it through our actions.

At the same time, we need to communicate during incidents, and we necessarily need to hypothesize about the causes.  But, we can do so without running ourselves and our organizations up the flag pole.  Here are some recommendations:

  1. Speak to your organization’s legal counsel about this topic, and if possible, set some ground rules for how to manage communications during an incident. Rules vary by jurisdiction, and I am not a lawyer in any of them, so you should seek the advice of an expert in your area.
  2. Do not make legal conclusions in written communications. For example, do not write “we just violated HIPAA by sending that file to the wrong person.”  There is a lot of nuance in law that we in IT land may not understand.  Instead, communicate what is known without a conclusion.  In this example, a better statement would be “We appear to have sent a file containing PHI to the wrong person.”  I am sensitive to the fact that making the harsh statement can be more motivational than the factual statement, but keep in mind that it may end up being your words printed in a media article or presented in court about the breach.
  3. Keep communication clear, concise, and free of emotion and speculation. This can be difficult to do in an incident situation, where time is short, tension is high, and we may be tempted to release some inter-office drama.  But this is not the time or place for such things.  For example, do not write “I don’t know who started it, but Jerry has already managed to open a malicious attachment and caused an outbreak four times this month, so I’ll bet it’s him.  His management team just doesn’t care, though, because they love his hair.”  Instead, say “The infection appears to have originated from a workstation.  We will prioritize investigating sources we’ve seen in the recent past.”
  4. If and when you do need to hypothesize and speculate about causes, do so on a phone call where the issue can be discussed and resolved without leaving a potentially ugly paper trail of incorrect speculation.
  5. Above all else, we must act ethically. The intent of this post is not to provide guidance on how to hide incidents, but rather to ensure that any reviews of the incident are not contaminated with office politics, incorrect speculation, hyperbole, and IT people declaring the organization “guilty” of some bad act.

NCSAM Day 2: Network Isolation

There is a nearly endless list of ways that an adversary can compromise an organization’s workstations, from USBs in the parking lot to malware-laden email attachments.  We should design our environments to account for the eventuality that one or more workstations will be compromised by an aggressive adversary.

Enabling port isolation on your wired networks and client isolation on your wireless networks limits opportunities for lateral movement between workstations.  Isolation, of course, will not prevent all lateral movement, but if implemented properly, it can significantly limit an adversary’s ability to hop from workstation to workstation across a local subnet collecting credentials, and will force the use of potentially noisier, easier-to-detect techniques.  The name of the game is making adversaries’ lives more difficult, forcing them to take longer to accomplish their objectives, and making them generate more noise in doing so.
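Isolation is also easy to spot-check.  Here’s a minimal sketch, run from one workstation against a few peers on the same subnet (the addresses and port are hypothetical); any successful connection is an isolation gap worth investigating:

```python
import socket

def peer_reachable(host: str, port: int = 445, timeout: float = 1.0) -> bool:
    """Attempt a TCP connection to a peer; on a properly isolated segment,
    this should fail for every peer workstation."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for peer in ("192.168.1.10", "192.168.1.11"):  # hypothetical peer workstations
    status = "REACHABLE (isolation gap?)" if peer_reachable(peer) else "blocked"
    print(peer, status)
```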

I once had a discussion with an unnamed person from an unnamed agency who told me that part of the agency’s penetration testing regimen includes connecting a drop box of the pen tester’s choosing to the agency’s wireless or wired networks (including an LTE modem for out-of-band access), to simulate a workstation being compromised and the rest of the infrastructure needing to protect systems and data from further compromise.  Port isolation was part of the strategy for that agency.

The downside of implementing isolation is that it requires much more deliberate design of common services, like the placement of printers and scanners.  Coincidentally, one of the other upsides to implementing isolation is that it also requires much more deliberate design of common services, like the placement of printers and scanners.

NCSAM Day 1: Multifactor Authentication

Enable multifactor authentication everywhere it is feasible to do so.  Where it’s not feasible, figure out how to do it anyway, for example by putting an authenticated firewall in front of a device that doesn’t support MFA.

For many years, sophisticated adversaries have leveraged legitimate credentials in their attacks.  At the same time, organizations have struggled mightily to get their employees to pick “strong” passwords through such clever devices as a password policy that includes a minimum length and a certain diversity of character types, giving rise to the infamous “Password1”.  This problem holds true for server, network, and database administrators, too.

Three shifts in the threat landscape make multifactor authentication more important than ever:

  1. The number of ways that an adversary can obtain a password continues to grow, from mimikatz to hashcat.
  2. The techniques that were once the domain of sophisticated adversaries are diffusing into the broader cyber crime ecosystem, such as we see with SamSam.
  3. The move to borderless IT – cloud, SaaS, and so on – means that the little safety nets our firewalls once provided are all but gone. Microsoft recently announced that it is deprecating passwords in favor of multifactor authentication on some of its cloud services, such as Azure Active Directory.

Multifactor authentication is not cutting edge.  This is 2018.  I first used a multifactor authenticated system in 1998 and it worked well back then.
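For a sense of how unmagical the underlying technology is, here’s a minimal sketch that derives a TOTP code – the second factor behind most authenticator apps – using the RFC 6238 defaults (the secret shown is a throwaway demo value, not a real credential):

```python
import base64, hashlib, hmac, struct, time

def totp(secret_b32, interval=30, digits=6):
    """Derive a time-based one-time password per RFC 6238 (SHA-1 defaults)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time()) // interval  # 30-second time step
    msg = struct.pack(">Q", counter)        # 8-byte big-endian counter
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F              # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(totp("JBSWY3DPEHPK3PXP"))  # throwaway demo secret
```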

Some gotchas to be aware of:

  • As has been widely reported, SMS-based multifactor authentication is not advised due to numerous ways adversaries can defeat it
  • Any multifactor scheme that either stores the second factor on the computer being authenticated from (such as a certificate) or delivers the second factor to it (such as an email) is less than ideal, given that a major use case is one where the workstation itself is compromised. Adversaries can use certificates on the system, along with a captured password, to do their deeds.
  • A common adversarial technique to get around multifactor authentication is the helpdesk. Be sure to develop a reasonably secure way of authenticating employees who are having trouble logging in, and an alternate authentication path if, for example, someone loses their phone.

P.S. Authentication is pronounced Auth-en-ti-cation, not Auth-en-tif-ication.  Thank you.

Cyber Security Awareness Month

Tomorrow starts National Cyber Security Awareness Month (NCSAM).  I’m going to take a break from my normal complaining about what does not work and attempt to write a post per day for the next month with suggestions for making improvements, based on things I’ve learned the hard way.  NCSAM normally focuses on the “user” experience, but in keeping with the intent of this site, I’ll be focusing on improvements to organizational IT and IT security.  I hope that none of what I post is new or revolutionary to anyone who is familiar with IT security; however, a reminder and some additional context never hurts.

Stay tuned…