27 January 2021

A lifesaver: Prevent AWS SNS subscriptions from becoming 'floating'

TL;DR

When an SNS topic is (accidentally) deleted while still holding subscriptions, those subscriptions become ‘floating’ and no longer receive traffic. The issue has no easy fix and its impact can be considerable. This article shows how to monitor for it and how to prevent it from happening.

The Pattern

Those familiar with AWS SNS know that it’s a common pattern to have an SNS topic living in one account while the subscription resides in another. In such a case you have a shared responsibility model that looks as follows:

[Figure: SNS Topic]

These are the responsibilities for each resource:

  • Topic owner (Account X): manages the SNS topic and the topic policy that allows subscriptions (including those from other accounts).
  • Subscriber (Account Y): manages the subscription itself. Subscriptions can live in a different account than the topic, and the subscriber is fully responsible for them.

What are ‘Floating’ subscriptions?

If you use SNS, chances are you have already stumbled upon the phenomenon that I like to call ‘floating’ subscriptions. You see, when a topic is (accidentally) deleted without removing all its subscriptions first, the remaining subscriptions become ‘floating’. This means that the subscription keeps existing but no longer receives traffic, because from that moment on it is no longer connected to a topic.

[Figure: Floating subscription]

Side note: especially when using Infrastructure as Code, it’s easy to mispredict the outcome of code changes. Renaming a resource in CloudFormation can lead to a complete resource replacement in the background. That, in turn, can result in nasty side effects (like a lost subscription). Most of the time that is something you learn the hard way. 😉
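
One way to catch such a replacement before it happens is to create a change set and inspect it before executing it. The sketch below is a minimal illustration, assuming you have already fetched the "Changes" list from CloudFormation's DescribeChangeSet API. The function name and the sample data are made up for this example.

```python
def risky_topic_changes(changes):
    """Return (logical_id, reason) pairs for SNS topics that a change set
    would remove or replace.

    `changes` is assumed to be the "Changes" list of a DescribeChangeSet
    response.
    """
    risky = []
    for change in changes:
        rc = change.get("ResourceChange", {})
        if rc.get("ResourceType") != "AWS::SNS::Topic":
            continue
        if rc.get("Action") == "Remove":
            risky.append((rc["LogicalResourceId"], "removed"))
        elif rc.get("Action") == "Modify" and rc.get("Replacement") in ("True", "Conditional"):
            # "Conditional" means CloudFormation cannot rule out a replacement.
            risky.append((rc["LogicalResourceId"], "replaced"))
    return risky


# Hand-written sample shaped like a DescribeChangeSet response (not real output).
sample_changes = [
    {"Type": "Resource", "ResourceChange": {
        "Action": "Modify", "LogicalResourceId": "SomeSnsTopic",
        "ResourceType": "AWS::SNS::Topic", "Replacement": "True"}},
    {"Type": "Resource", "ResourceChange": {
        "Action": "Remove", "LogicalResourceId": "OldTopic",
        "ResourceType": "AWS::SNS::Topic"}},
    {"Type": "Resource", "ResourceChange": {
        "Action": "Modify", "LogicalResourceId": "SomeQueue",
        "ResourceType": "AWS::SQS::Queue", "Replacement": "True"}},
]

for logical_id, reason in risky_topic_changes(sample_changes):
    print(f"WARNING: topic {logical_id} would be {reason}")
```

A deployment pipeline could refuse to execute the change set whenever this list is not empty.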

Isn’t this awkward behaviour?

The first time I encountered a ‘floating’ subscription, I couldn’t believe what was happening. Diving deeper, I was astonished to discover it is expected behaviour. However, as soon as you give it a second thought, it’s easy to understand why the behaviour is implemented this way. Especially in a multi-account setup, it would be unimaginable that a resource (in this case a subscription) you don’t own would get deleted from an account you don’t manage. So, the best AWS can do when you delete a topic with subscriptions attached, is to simply cut the connected wires.

The impact

The venom of a ‘floating’ subscription is that it often stays under the radar for quite some time.

In case of accidental topic deletion, things are even worse. There is no magic fix to rewire floating subscriptions. Worse still, there is no way to query subscriptions linked to nonexistent topics. On top of that, you will need the help of all consumers to recreate their subscriptions in order to get you out of the swamp.

A safety net

First of all, consumers should always monitor the number of incoming messages for their subscriptions. If possible, they should also create an alert that fires when the number of incoming messages drops to zero for a certain period. As a topic owner, it’s a good habit to create this awareness whenever you open a topic policy to allow subscriptions.
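
As an illustration of such an alert, here is a minimal sketch that builds the arguments for a CloudWatch alarm. It assumes the subscriber is an SQS queue, so it watches the queue's NumberOfMessagesSent metric. The function name, the queue name and the default quiet period are hypothetical. Note that this metric counts every message added to the queue, not only those delivered by SNS.

```python
def silent_queue_alarm_params(queue_name, quiet_minutes=60, alarm_topic_arn=None):
    """Build keyword arguments for CloudWatch's PutMetricAlarm call.

    The alarm fires when no messages were added to the queue for
    `quiet_minutes` (evaluated in 5-minute periods).
    """
    params = {
        "AlarmName": f"no-incoming-messages-{queue_name}",
        "Namespace": "AWS/SQS",
        "MetricName": "NumberOfMessagesSent",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": max(1, quiet_minutes // 5),
        "Threshold": 0,
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        # A silent queue may report no datapoints at all, so treat gaps as bad.
        "TreatMissingData": "breaching",
    }
    if alarm_topic_arn:
        params["AlarmActions"] = [alarm_topic_arn]
    return params


params = silent_queue_alarm_params("orders-queue")  # hypothetical queue name
# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

The right quiet period depends on how often the topic normally publishes. A topic that is quiet overnight needs a longer window than one with constant traffic.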

The best way to prevent ‘floating’ subscriptions is to harden topic deletion. To achieve this, add a deny-delete statement to the topic’s policy (this blocks deletion through IaC, the CLI and the Web Console):

"PolicyDocument": {
  "Version": "2008-10-17",
  "Statement": [
    ...
    {
      "Sid": "DenyDeleteTopic",
      "Effect": "Deny",
      "Principal": {
        "AWS": "*"
      },
      "Action": [
        "sns:DeleteTopic"
      ],
      "Resource": "*"
    }
    ...
  ]
}

In CloudFormation it looks like this:

SomeSnsTopicPolicy:
  Type: AWS::SNS::TopicPolicy
  Properties:
    Topics:
      - !Ref SomeSnsTopic
    PolicyDocument:
      Id: SomeSnsTopicPolicy
      Version: '2012-10-17'
      Statement:
        - Sid: deny-delete-topic
          Effect: Deny
          Principal:
            AWS: "*"
          Resource: "*"
          Action: sns:DeleteTopic
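
To verify that a deployed topic really carries this statement, you could inspect its policy. The sketch below is one possible check, not an official tool. It assumes you pass in the policy as a dict or as the JSON string found under the "Policy" key of the attributes returned by SNS GetTopicAttributes. The function name is made up, and wildcard actions such as "sns:*" are deliberately not handled.

```python
import json


def has_deny_delete(policy):
    """Return True if the policy denies sns:DeleteTopic for every principal."""
    if isinstance(policy, str):
        policy = json.loads(policy)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for stmt in statements:
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        principal = stmt.get("Principal")
        everyone = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        # Action names are compared case-insensitively.
        denies_delete = any(a.lower() == "sns:deletetopic" for a in actions)
        if stmt.get("Effect") == "Deny" and everyone and denies_delete:
            return True
    return False


protected = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyDeleteTopic",
        "Effect": "Deny",
        "Principal": {"AWS": "*"},
        "Action": ["sns:DeleteTopic"],
        "Resource": "*",
    }],
}
print(has_deny_delete(protected))
```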

If you use CloudFormation there are two other possibilities. These solutions also make it hard to delete a topic from within CloudFormation but they offer no protection for deletion triggered by the Web Console or CLI.

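Option 1: keep the topic alive on replacement and deletion by setting the UpdateReplacePolicy and DeletionPolicy attributes to Retain on the CloudFormation resource. CloudFormation then leaves the existing topic in place instead of deleting it.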

SomeSnsTopic:
  Type: AWS::SNS::Topic
  UpdateReplacePolicy: Retain
  DeletionPolicy: Retain
  Properties:
    DisplayName: Some SNS Topic
    TopicName: some-topic-name
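
Since these attributes are easy to forget on a new topic, a small lint step can help. The following sketch assumes the template is already available as a parsed dict (for example a JSON template loaded with json.load). The function name is hypothetical.

```python
def unprotected_topics(template):
    """Return logical IDs of SNS topics that lack Retain on either attribute."""
    missing = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::SNS::Topic":
            continue
        if (resource.get("DeletionPolicy") != "Retain"
                or resource.get("UpdateReplacePolicy") != "Retain"):
            missing.append(logical_id)
    return sorted(missing)


# Hand-written sample template for illustration.
sample_template = {
    "Resources": {
        "SomeSnsTopic": {
            "Type": "AWS::SNS::Topic",
            "UpdateReplacePolicy": "Retain",
            "DeletionPolicy": "Retain",
        },
        "ForgottenTopic": {"Type": "AWS::SNS::Topic", "DeletionPolicy": "Retain"},
        "SomeQueue": {"Type": "AWS::SQS::Queue"},
    }
}
print(unprotected_topics(sample_template))
```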

Option 2: prevent replacement and deletion of topics during stack updates using a stack policy.

{
  "Statement" : [
    {
      "Effect" : "Deny",
      "Action" : ["Update:Replace", "Update:Delete"],
      "Principal": "*",
      "Resource" : "*",
      "Condition" : {
        "StringEquals" : {
          "ResourceType" : ["AWS::SNS::Topic"]
        }
      }
    },
    {
      "Effect" : "Allow",
      "Action" : "Update:*",
      "Principal": "*",
      "Resource" : "*"
    }
  ]
}
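
A stack policy has to be attached to the stack before it has any effect. As a rough sketch, the policy above could be applied with the CloudFormation SetStackPolicy API. The helper below takes the client as an argument (for instance boto3.client("cloudformation")), and the function and stack names are made up for this example.

```python
import json

# Same stack policy as shown above, expressed as a Python dict.
STACK_POLICY = {
    "Statement": [
        {
            "Effect": "Deny",
            "Action": ["Update:Replace", "Update:Delete"],
            "Principal": "*",
            "Resource": "*",
            "Condition": {
                "StringEquals": {"ResourceType": ["AWS::SNS::Topic"]}
            },
        },
        {
            "Effect": "Allow",
            "Action": "Update:*",
            "Principal": "*",
            "Resource": "*",
        },
    ]
}


def apply_stack_policy(cfn_client, stack_name, policy=STACK_POLICY):
    """Attach the stack policy to an existing stack and return the JSON body sent."""
    body = json.dumps(policy)
    cfn_client.set_stack_policy(StackName=stack_name, StackPolicyBody=body)
    return body


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   apply_stack_policy(boto3.client("cloudformation"), "some-stack-name")
```

Keep in mind that a stack policy is only evaluated during stack updates. It does not stop someone from deleting the whole stack.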

So, that’s a lot safer now 😄

Enjoy and until next time!