Deep Layer Security Advisory
Awareness
2026-03-12

5 Cloud Misconfigurations That Lead to Breaches (And How to Spot Them)

Part of the Cloud Security Deep-Dive Guide

The majority of cloud security breaches are not caused by sophisticated zero-day exploits or advanced persistent threats. They are caused by misconfigurations: default settings left unchanged, permissions scoped too broadly, and basic hygiene steps skipped during deployment. According to Gartner, through 2027 at least 99% of cloud security failures will be the customer's fault, not the provider's.

These misconfigurations are not obscure edge cases. They appear in organizations of every size, from startups pushing to production for the first time to mature enterprises juggling hundreds of accounts across multiple cloud providers. The five misconfigurations below are the ones we encounter most frequently during cloud security posture assessments, and each has been directly linked to publicly reported breaches.

1. Publicly Accessible S3 Buckets and Storage Blobs

The misconfiguration that launched a thousand headlines. When an S3 bucket, Azure Blob container, or GCS bucket is configured with public read or list permissions, anyone on the internet who discovers the resource name can enumerate and download its contents. This is not a theoretical risk: in the 2019 Capital One breach, an attacker exploited a misconfigured WAF whose IAM role had overly permissive access to S3, exposing over 100 million customer records. Countless smaller incidents go unreported every month.

Detection is straightforward. In AWS, use the S3 console's "Public" badge, run `aws s3api get-bucket-acl` and `get-bucket-policy` across all buckets, or enable AWS Config rule `s3-bucket-public-read-prohibited`. In Azure, check the container's public access level via the portal or `az storage container show`. In GCP, check IAM bindings for `allUsers` or `allAuthenticatedUsers`. Automated CSPM tools flag this immediately, but you can audit it manually in under ten minutes.
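The manual audit can be scripted. Below is a minimal sketch that inspects the JSON returned by `aws s3api get-bucket-acl` for grants to the public group grantees; the response shape and group URIs match the S3 API, but verify against your CLI version before relying on it.

```python
# Group grantee URIs that make a grant world-readable or
# readable by any authenticated AWS account.
PUBLIC_GRANTEES = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

def public_grants(acl: dict) -> list[str]:
    """Return the permissions granted to any public group grantee."""
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GRANTEES:
            findings.append(grant["Permission"])
    return findings

# Example ACL: owner has FULL_CONTROL, but AllUsers has READ.
acl = {
    "Owner": {"ID": "abc123"},
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "abc123"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "READ"},
    ],
}
print(public_grants(acl))  # the public READ grant is flagged
```

Loop this over every bucket name from `aws s3api list-buckets` and you have the ten-minute manual audit in a repeatable form. Remember that bucket policies can also grant public access, so pair this with a `get-bucket-policy` check.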

The fix is to enforce private access by default. Enable S3 Block Public Access at the account level, not just the bucket level. In Azure, disable blob public access at the storage account level. In GCP, enforce the `constraints/storage.publicAccessPrevention` organization policy. Then audit existing buckets and remediate any that were created before the guardrails were in place.
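To confirm the account-level guardrail is actually in force, you can check the four Block Public Access flags returned by `aws s3control get-public-access-block`. A sketch, assuming the documented response shape:

```python
# All four flags must be True for full account-level protection.
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)

def block_public_access_gaps(response: dict) -> list[str]:
    """Return the names of any Block Public Access flags not enabled."""
    config = response.get("PublicAccessBlockConfiguration", {})
    return [flag for flag in REQUIRED_FLAGS if not config.get(flag, False)]

# Example: three of four flags enabled at the account level.
response = {"PublicAccessBlockConfiguration": {
    "BlockPublicAcls": True, "IgnorePublicAcls": True,
    "BlockPublicPolicy": True, "RestrictPublicBuckets": False,
}}
print(block_public_access_gaps(response))  # the one missing flag is reported
```

A partially enabled configuration is a common finding: someone enabled the flags that did not break an existing workflow and left the rest off.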

2. Overly Permissive Security Groups and Network ACLs

Security groups are the primary network-level access control in cloud environments, and they are routinely misconfigured. The most dangerous pattern is an inbound rule allowing 0.0.0.0/0 on ports like 22 (SSH), 3389 (RDP), or 3306 (MySQL). This exposes management interfaces or database ports directly to the internet, making brute-force attacks trivial. We regularly find security groups with rules like "All Traffic / All Ports / 0.0.0.0/0" created during development and never cleaned up.

To detect this, query your security groups programmatically. In AWS, run `aws ec2 describe-security-groups` and filter for rules where the CIDR is 0.0.0.0/0 or ::/0 on sensitive ports. AWS Config provides the `restricted-ssh` and `restricted-common-ports` managed rules. In Azure, use Network Watcher or query NSG rules via the CLI. The key is not just auditing once but establishing continuous monitoring so new permissive rules are flagged within minutes of creation.
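The filtering step can be sketched as a small function over the `IpPermissions` structure that `aws ec2 describe-security-groups` returns (field names follow the EC2 API; the port list is an illustrative choice, not a complete one):

```python
# Ports we treat as sensitive management/database ports (adjust to taste).
SENSITIVE_PORTS = {22, 3389, 3306, 5432, 1433}
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}

def permissive_rules(group: dict) -> list[tuple[str, int]]:
    """Return (group id, port) pairs where a sensitive port is world-open."""
    findings = []
    for perm in group.get("IpPermissions", []):
        ranges = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
        ranges += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
        if not WORLD_CIDRS.intersection(ranges):
            continue
        # IpProtocol "-1" means all protocols and all ports.
        lo, hi = perm.get("FromPort", 0), perm.get("ToPort", 65535)
        if perm.get("IpProtocol") == "-1":
            lo, hi = 0, 65535
        for port in sorted(SENSITIVE_PORTS):
            if lo <= port <= hi:
                findings.append((group["GroupId"], port))
    return findings

group = {
    "GroupId": "sg-0abc123",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []},
    ],
}
print(permissive_rules(group))  # SSH open to the world is flagged; 443 is not
```

Note that port 443 open to the world is usually intentional for public web tiers, which is why the check keys on specific sensitive ports rather than on the CIDR alone.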

Remediation means replacing broad CIDR rules with the narrowest possible source ranges, ideally referencing other security groups rather than IP addresses. Use bastion hosts or SSM Session Manager instead of exposing SSH/RDP directly. Implement a tagging policy so every security group has an owner and a review date, and automate cleanup of rules that reference 0.0.0.0/0 on management ports.

3. Unrotated Access Keys and Long-Lived Service Account Credentials

Static access keys are the cloud equivalent of a password that never expires and never requires MFA. When an IAM user's access key has gone 180 or even 365 days without rotation, the window for an unnoticed compromise grows with every passing day. Leaked keys end up in GitHub repositories, CI/CD logs, container images, and Slack messages. Once an attacker has a valid key, they can operate silently until someone notices unusual API activity, which can take weeks or months without adequate logging.

In AWS, run `aws iam generate-credential-report` and inspect the `access_key_1_last_rotated` and `access_key_2_last_rotated` columns. Any key older than 90 days should be flagged for rotation. In GCP, use `gcloud iam service-accounts keys list` and check the `validAfterTime`. In Azure, audit app registrations and their client secret expiration dates. The CIS Benchmark for each platform includes a specific control for key rotation age.
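Once the credential report is parsed into rows, the age check is a few lines. A sketch, assuming the report's documented column names and ISO 8601 timestamps (the 90-day threshold matches the policy suggested above):

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=90)

def stale_keys(rows: list[dict], now: datetime) -> list[str]:
    """Flag credential-report rows whose access keys are older than 90 days."""
    findings = []
    for row in rows:
        for col in ("access_key_1_last_rotated", "access_key_2_last_rotated"):
            value = row.get(col, "N/A")
            if value == "N/A":
                continue  # key slot is unused
            rotated = datetime.fromisoformat(value.replace("Z", "+00:00"))
            if now - rotated > MAX_AGE:
                findings.append(f"{row['user']}: {col}")
    return findings

# Example rows (users and dates are made up for illustration).
now = datetime(2026, 3, 12, tzinfo=timezone.utc)
rows = [
    {"user": "deploy-bot", "access_key_1_last_rotated": "2025-01-02T10:00:00Z",
     "access_key_2_last_rotated": "N/A"},
    {"user": "alice", "access_key_1_last_rotated": "2026-02-20T08:30:00Z",
     "access_key_2_last_rotated": "N/A"},
]
print(stale_keys(rows, now))  # deploy-bot's year-old key is flagged
```

The real report is CSV, so in practice you would feed it through `csv.DictReader` first; the logic above is unchanged.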

The best fix is to eliminate long-lived keys entirely. Use IAM roles with temporary credentials via STS AssumeRole in AWS, workload identity federation in GCP, and managed identities in Azure. Where static keys are unavoidable, enforce a 90-day rotation policy, store keys in a secrets manager (not environment variables or config files), and alert on any key usage from an unexpected IP range or region.

4. Disabled or Incomplete Logging and Monitoring

You cannot detect what you do not log. Yet we routinely encounter environments where CloudTrail is disabled in non-production accounts, where VPC Flow Logs are turned off to save costs, or where Azure Activity Log data is retained for only 90 days. When a breach occurs in these environments, the forensic investigation hits a dead end almost immediately. The attacker's actions are simply unrecorded.

Check logging coverage systematically. In AWS, confirm CloudTrail is enabled in every region (not just your primary region) with management and data events logged to a centralized, immutable S3 bucket. Verify that VPC Flow Logs are enabled for all VPCs. In Azure, ensure Diagnostic Settings are configured for all subscriptions and key resources, forwarding to a Log Analytics workspace. In GCP, verify that Admin Activity audit logs (enabled by default) have not been modified and that Data Access audit logs are enabled for sensitive services.
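The CloudTrail portion of that checklist can be automated against the output of `aws cloudtrail describe-trails`. A minimal sketch, assuming the API's documented field names:

```python
def trail_gaps(trails: list[dict]) -> list[str]:
    """Report missing CloudTrail coverage across an account's trails."""
    gaps = []
    if not any(t.get("IsMultiRegionTrail") for t in trails):
        gaps.append("no multi-region trail")
    if not any(t.get("LogFileValidationEnabled") for t in trails):
        gaps.append("no trail with log file validation")
    return gaps

# Example: one trail, multi-region but without log file validation.
trails = [{"Name": "main", "IsMultiRegionTrail": True,
           "LogFileValidationEnabled": False,
           "S3BucketName": "org-trail-logs"}]
print(trail_gaps(trails))
```

Log file validation matters for the same reason as the immutable bucket: it lets you prove during an investigation that the attacker did not tamper with the trail itself.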

Enable logging everywhere and centralize it. Use CloudTrail Organization trails, Azure Policy for diagnostic settings, and GCP organization-level audit log configurations. Set retention to at least one year. Then build detections on top: alert on root account usage, console logins without MFA, IAM policy changes, and security group modifications. Logging without monitoring is just compliance theater.
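The detections listed above reduce to simple predicates over CloudTrail event records. A sketch of the idea (field names follow the CloudTrail record format; the event-name list is a starting point, not a complete ruleset):

```python
# Event names we treat as sensitive changes (illustrative, not exhaustive).
CHANGE_EVENTS = {"PutUserPolicy", "AttachUserPolicy",
                 "AuthorizeSecurityGroupIngress"}

def high_risk(event: dict) -> list[str]:
    """Classify a single CloudTrail event record against basic detections."""
    findings = []
    if event.get("userIdentity", {}).get("type") == "Root":
        findings.append("root account activity")
    if (event.get("eventName") == "ConsoleLogin"
            and event.get("additionalEventData", {}).get("MFAUsed") == "No"):
        findings.append("console login without MFA")
    if event.get("eventName") in CHANGE_EVENTS:
        findings.append("IAM or security group change")
    return findings

# Example: a root console login without MFA trips two detections.
event = {
    "eventName": "ConsoleLogin",
    "userIdentity": {"type": "Root"},
    "additionalEventData": {"MFAUsed": "No"},
}
print(high_risk(event))
```

In production this logic typically lives in EventBridge rules or your SIEM rather than custom code, but the predicates are the same.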

5. Using the Default VPC for Production Workloads

Every AWS account comes with a default VPC in each region. That default VPC includes public subnets with auto-assigned public IPs, a permissive default security group, and an internet gateway already attached. Resources launched into the default VPC are often publicly accessible by default, which is the opposite of what you want for production workloads. Azure and GCP have similar default networking constructs that prioritize ease of use over security.

Detecting this is simple: inventory your resources and check which VPC or virtual network they reside in. In AWS, compare the VPC ID of each EC2 instance, RDS instance, and Lambda function against the default VPC ID for that region. Any production workload in the default VPC should be flagged immediately. AWS Config's `ec2-instances-in-vpc` rule can be adapted, or use a CSPM tool that flags default VPC usage.
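The comparison itself is a one-liner once you have the instance inventory and the default VPC ID (from `aws ec2 describe-vpcs --filters Name=is-default,Values=true`). A sketch with made-up IDs:

```python
def in_default_vpc(instances: list[dict], default_vpc_id: str) -> list[str]:
    """Return the instance IDs of workloads running in the default VPC."""
    return [i["InstanceId"] for i in instances
            if i.get("VpcId") == default_vpc_id]

# Example inventory (VPC and instance IDs are hypothetical).
default_vpc_id = "vpc-0default"
instances = [
    {"InstanceId": "i-0prod1", "VpcId": "vpc-0custom"},
    {"InstanceId": "i-0legacy", "VpcId": "vpc-0default"},
]
print(in_default_vpc(instances, default_vpc_id))  # the legacy instance is flagged
```

Extend the same comparison to RDS instances and Lambda functions with VPC configs, since those are the resources most often forgotten in the default VPC.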

The fix is to create purpose-built VPCs with private subnets, explicit route tables, restrictive NACLs, and no auto-assigned public IPs. Migrate existing workloads out of the default VPC and then delete the default VPC in every region to prevent accidental use. Enforce this via a Service Control Policy that denies resource creation in default VPCs. This is a foundational architectural decision that pays dividends in every subsequent security control you implement.

Key Takeaways

- Most cloud breaches stem from misconfigurations, not sophisticated attacks. Fixing these five issues removes the most commonly exploited portion of your cloud attack surface.
- Account-level guardrails (S3 Block Public Access, SCPs, organization-level logging) are more effective than per-resource fixes because they prevent future misconfigurations, not just current ones.
- Eliminate long-lived credentials wherever possible by using IAM roles, managed identities, and workload identity federation instead of static access keys.
- Logging without active monitoring and alerting provides no security value. Centralize logs and build detections for high-risk events like root account usage and IAM changes.