When federal prosecutors charged a Seattle woman with stealing data from more than 100 million credit applications this week, the security of Capital One's AWS environment immediately became the focus of media attention.

According to the court filing and various media reports, the attack was launched from a server compromised through a misconfigured web application firewall (WAF). Ephemeral AWS credentials were extracted from the instance role and used to exfiltrate data from S3 buckets. The attack took place on April 21st; on July 17th, an email to Capital One outlining the attack sparked an investigation.

Several things immediately stand out about this attack. Most notably:

  • The weakness identified by Capital One and cited throughout the media was “a misconfigured firewall.” But even if that was the point of entry, a single firewall misconfiguration should not enable a breach this vast; failsafe security measures should catch intruders who get past the perimeter. The lack of redundancy indicates other systemic security issues.
  • As Capital One acknowledged, the web application firewall (WAF) role in question never made API calls such as “List Buckets” or “Sync” until the attacker made them. The WAF role’s permissions should have been reviewed at creation time to make sure they fit its business purpose (a sketch of a tightly scoped policy follows this list).
  • Nothing in the system flagged the WAF role’s behavioral change, though such warnings were possible. When a credential set suddenly begins behaving atypically – such as scanning and looting S3 buckets – it’s entirely feasible to flag that behavior for review. The API-driven nature of the public cloud lets you react in real time; Amazon Macie could have caught this abnormal behavior and alerted Capital One immediately. A simple detection sketch also appears after this list.
  • A broader security architecture review should have highlighted those extra S3 permissions and eliminated them from the role, or limited them to a WAF-logging-specific bucket if truly needed. New automation tools exist to help meet this level of compliance. Why weren’t S3 buckets filled with sensitive information restricted to known IP ranges only, when such a setting can be managed and continuously monitored with automated compliance tools? (An example bucket policy appears after this list.)
  • Permissions should be checked regularly to see whether they’re actually being used; if not, the unused permissions should be removed. Netflix recently released an open source tool, Repokid, to automate this effort, and a lighter-weight sketch of the same check appears after this list.
  • As a best practice, logging must always be enabled across all public cloud accounts, and those logs should be sent to a protected, dedicated logging account (see the final sketch after this list).
  • It’s imperative to have an Incident Response plan, so you know how to react to compromises before they happen.
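
To make the least-privilege point concrete, here is a minimal sketch using boto3. The role and bucket names (`waf-proxy-role`, `waf-logs-example-bucket`) are hypothetical and not taken from the Capital One case; the idea is simply that a WAF role only needs to write its own logs, nothing more.

```python
import json
import boto3

iam = boto3.client("iam")

# Hypothetical names, for illustration only.
ROLE_NAME = "waf-proxy-role"
LOG_BUCKET = "waf-logs-example-bucket"

# Least-privilege policy: the role may only write objects into its own
# logging bucket. No ListAllMyBuckets, no GetObject on other buckets.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": [f"arn:aws:s3:::{LOG_BUCKET}/*"],
        }
    ],
}

iam.put_role_policy(
    RoleName=ROLE_NAME,
    PolicyName="waf-logging-only",
    PolicyDocument=json.dumps(policy),
)
```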
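
Detecting the behavioral change is not exotic either. As a rough sketch (not the tooling Capital One or AWS actually uses), a scheduled job could ask CloudTrail for S3 discovery calls made by a role that should never need them, reusing the hypothetical role name from the previous example:

```python
import json
from datetime import datetime, timedelta, timezone

import boto3

cloudtrail = boto3.client("cloudtrail")

ROLE_NAME = "waf-proxy-role"  # hypothetical role to watch
WATCHED_CALLS = ["ListBuckets", "GetBucketLocation"]

end = datetime.now(timezone.utc)
start = end - timedelta(hours=24)

for event_name in WATCHED_CALLS:
    pages = cloudtrail.get_paginator("lookup_events").paginate(
        LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": event_name}],
        StartTime=start,
        EndTime=end,
    )
    for page in pages:
        for event in page["Events"]:
            detail = json.loads(event["CloudTrailEvent"])
            identity_arn = detail.get("userIdentity", {}).get("arn", "")
            if ROLE_NAME in identity_arn:
                # In a real deployment this would raise an alert (SNS, a pager, etc.).
                print(f"ALERT: {identity_arn} called {event_name} at {event['EventTime']}")
```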
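
Restricting a sensitive bucket to known IP ranges is a single resource policy. A simplified sketch follows, with a hypothetical bucket name and an example CIDR block; real deployments usually also carve out exceptions for VPC endpoints and AWS services so administrators and tooling are not locked out.

```python
import json
import boto3

s3 = boto3.client("s3")

BUCKET = "sensitive-data-example-bucket"   # hypothetical
ALLOWED_CIDRS = ["203.0.113.0/24"]         # example corporate range

# Deny every S3 action on the bucket unless the request originates
# from an approved IP range.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideKnownRanges",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
            "Condition": {"NotIpAddress": {"aws:SourceIp": ALLOWED_CIDRS}},
        }
    ],
}

s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```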
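
Short of adopting Repokid wholesale, the “are these permissions actually used?” question can be answered directly from IAM’s last-accessed data. A sketch, again against the hypothetical role above:

```python
import time
import boto3

iam = boto3.client("iam")

ROLE_ARN = "arn:aws:iam::123456789012:role/waf-proxy-role"  # hypothetical

# Ask IAM to generate "last accessed" data for every service the role
# has permissions for, then poll until the report is ready.
job_id = iam.generate_service_last_accessed_details(Arn=ROLE_ARN)["JobId"]
while True:
    report = iam.get_service_last_accessed_details(JobId=job_id)
    if report["JobStatus"] != "IN_PROGRESS":
        break
    time.sleep(2)

for service in report.get("ServicesLastAccessed", []):
    if "LastAuthenticated" not in service:
        # Permission granted but never exercised: a candidate for removal.
        print(f"Never used: {service['ServiceNamespace']}")
```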
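
Turning on that logging is likewise a small amount of automation. A minimal sketch of a multi-region CloudTrail trail delivering to a bucket owned by a separate logging account follows; the bucket name is hypothetical, and the bucket policy granting CloudTrail write access is assumed to already exist.

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Hypothetical bucket living in a separate, locked-down logging account.
LOG_BUCKET = "org-central-cloudtrail-logs"

cloudtrail.create_trail(
    Name="org-wide-trail",
    S3BucketName=LOG_BUCKET,
    IsMultiRegionTrail=True,          # capture activity in every region
    EnableLogFileValidation=True,     # tamper-evident log digests
)
cloudtrail.start_logging(Name="org-wide-trail")
```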

On the last two points (logging and incident response), Capital One was actually fairly successful. Because Capital One was proactively logging everything, the criminal’s actions were recorded and available for immediate review. You can’t protect what you can’t see, and at a minimum Capital One was able to look back at the exact steps taken to breach its security, which allowed a rapid and accountable response. That is commendable.

In the end, the lesson from the Capital One breach should be one of caution: the public cloud, while far more secure than on-premises data centers, is far from a security silver bullet. It’s imperative that the DevOps teams building your public cloud are paying attention.