
Cloud resource tagging strategies for your organization

Ensuring compliance, conformance, and accuracy in your organization's tagging practices

Chris Farris
7 min. read - Apr 14, 2023

In almost every organization I talk to, resource tagging is a management priority, yet it is nearly impossible for cloud operations and builders to get it right. Developers are focused on building; they don't live in the accounting system or know the appropriate cost center for an application.

While the builder might know the technical or business owners at creation time, organizational changes mean that information quickly becomes outdated. Operations teams are reluctant to risk production changes just to update a cost center or contact tag. As tag information grows less accurate, operations teams put less effort into correcting it.

How can we make tagging less of a Sisyphean effort? Regularly monitoring the status of tags helps ensure they are present, conformant, and, most importantly, accurate.

What makes a good tagging strategy?

To be successful, tagging needs to be:

  1. Compliant - e.g., does the tag exist?
  2. Conformant - e.g., does the tag value reflect the form and function expected? If the Owner tag is supposed to be an email address, is it in the form of an email address?
  3. Accurate - e.g., does the tag value accurately reflect a practical reality?

In my experience in dealing with security remediations or incident response, there are a few key tags you should have:

  • Name - What is this resource?
  • Owner - Who is the person who can make a business decision about a resource? This is critical if downtime needs to be scheduled or legal or contractual impacts may occur if the resource is compromised.
  • TechContact - Who can answer specific technical questions about a resource? While the owner can answer questions about business impact, the Technical Contact should be able to answer questions about the resource or application's behavior and configuration.
  • Application - What application does the resource belong to?
  • Environment - Is this production, development, or something in between?
  • CostCenter - How should the cost of this specific resource be allocated to break out the cloud bill?

Let's say we have an EC2 instance that runs an AI-based video compression algorithm. Let's also say the tags look like this:

Name: SonofAnton
Owner: Erlich Bachman
TechContact: bertram.gilfoyle@piedpiper.com
CostCenter: 12345
Environment: Production

This resource is not compliant with the tagging policy. It has five of the required tags (Name, Owner, TechContact, Environment, and CostCenter), but it lacks an Application tag.

This resource is mostly conformant. The Name is freeform, the TechContact is in the form of an email address, and the CostCenter is a 5-digit number. The Owner tag is not conformant because it is not in the form of an email address. The Environment tag is not conformant if the allowed values are "dev, test, prod".
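
The conformance rules above can be expressed directly in SQL. The following is a minimal sketch that runs on plain PostgreSQL, with the example tags inlined as a JSON literal. The loose email pattern, the five-digit cost center rule, and the allowed Environment values are assumptions taken from this example, not a definitive policy.

```sql
-- Example tags from above, inlined so the query runs without a cloud account.
with resource as (
  select
    '{
      "Name": "SonofAnton",
      "Owner": "Erlich Bachman",
      "TechContact": "bertram.gilfoyle@piedpiper.com",
      "CostCenter": "12345",
      "Environment": "Production"
    }'::jsonb as tags
)
select
  c.tag_key,
  c.tag_value,
  c.conformant
from
  resource,
  lateral (
    values
      -- Contacts must look like an email address (deliberately loose pattern).
      ('Owner', tags ->> 'Owner', tags ->> 'Owner' ~ '^[^@\s]+@[^@\s]+$'),
      ('TechContact', tags ->> 'TechContact', tags ->> 'TechContact' ~ '^[^@\s]+@[^@\s]+$'),
      -- Cost centers are assumed to be exactly five digits.
      ('CostCenter', tags ->> 'CostCenter', tags ->> 'CostCenter' ~ '^[0-9]{5}$'),
      -- Allowed environments are assumed to be dev, test, and prod.
      ('Environment', tags ->> 'Environment', tags ->> 'Environment' in ('dev', 'test', 'prod'))
  ) as c(tag_key, tag_value, conformant)
```

For the sample tags, this marks Owner and Environment as non-conformant and the other two as conformant, matching the assessment above.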

The accuracy of tags is where the rubber meets the road. Is 12345 an actual cost center, or was it added by an engineer who didn't know what cost center to use and put in a conformant value so the pipeline would deploy? Is Erlich Bachman really the business owner? Can you reach him in an emergency while he is on hiatus in Tibet?

To address tagging compliance, the following query will identify all the S3 buckets that lack an Owner tag:

select
  arn,
  title
from
  aws_tagging_resource
where
  arn like 'arn:aws:s3:%'
  and tags ->> 'Owner' is null
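
The same idea extends to every required tag at once. The following is a sketch, assuming the six required keys listed earlier in this article; it reports, for each S3 bucket, which of those keys are absent.

```sql
select
  r.arn,
  r.title,
  array_agg(k.tag_key order by k.tag_key) as missing_tags
from
  aws_tagging_resource as r
  cross join unnest(
    -- Required keys assumed from the tag list earlier in this article.
    array['Name', 'Owner', 'TechContact', 'Application', 'Environment', 'CostCenter']
  ) as k(tag_key)
where
  r.arn like 'arn:aws:s3:%'
  -- ->> returns null when the key is absent, as in the Owner check above.
  and r.tags ->> k.tag_key is null
group by
  r.arn,
  r.title
```

Buckets that carry all six keys produce no rows, so an empty result means full compliance for the resources in scope.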

But the presence of an Owner tag is not enough. If your policy is that all contacts should be in the form of an email address, then an Owner tag of Erlich is not conformant. This query finds Owner tags that are not in the form of an email address (buckets with no Owner tag at all are returned too):

select
  arn,
  title,
  tags ->> 'Owner' as non_compliant_owner_tag
from
  aws_tagging_resource
where
  arn like 'arn:aws:s3:%'
  and (regexp_match(tags ->> 'Owner', '[\w]@[\w]'))[1] is null

+--------------------------+-------------+-------------------------+
| arn                      | title       | non_compliant_owner_tag |
+--------------------------+-------------+-------------------------+
| arn:aws:s3:::fooli-test1 | fooli-test1 | Erlich                  |
+--------------------------+-------------+-------------------------+

Another way to ensure conformance and accuracy is to ensure the tag values are part of a canonical set of approved values. In the following two examples, we have a CSV export of our application portfolio (fooli_apm.fooli_apps) with the list of approved applications and their cost centers.

application,costcenter
MemeFactory,24601
Nuculus,31789
MemeRecommendationEngine,51961

This query returns S3 buckets that have invalid Application tag values:

select
  r.arn,
  r.title,
  r.tags ->> 'Application' as non_compliant_app_tag
from
  aws_tagging_resource as r
  left join fooli_apm.fooli_apps as a
    on r.tags ->> 'Application' = a.application
where
  r.arn like 'arn:aws:s3:%'
  and a.application is null

+--------------------------+-------------+-----------------------+
| arn                      | title       | non_compliant_app_tag |
+--------------------------+-------------+-----------------------+
| arn:aws:s3:::fooli-test2 | fooli-test2 | meme-factory          |
+--------------------------+-------------+-----------------------+

The proper value for the meme factory is MemeFactory, not meme-factory.
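
Near misses like this one can often be matched back to the canonical value automatically. The following is a sketch, assuming that ignoring case and punctuation is enough to identify the intended application; that normalization rule is an illustration and not part of the original queries.

```sql
select
  r.arn,
  r.tags ->> 'Application' as current_app_tag,
  a.application as suggested_app_tag
from
  aws_tagging_resource as r
  join fooli_apm.fooli_apps as a
    -- Compare lowercased values with everything except letters and digits removed.
    on regexp_replace(lower(r.tags ->> 'Application'), '[^a-z0-9]', '', 'g')
     = regexp_replace(lower(a.application), '[^a-z0-9]', '', 'g')
where
  r.arn like 'arn:aws:s3:%'
  -- Keep only tags that are not already an exact match.
  and r.tags ->> 'Application' <> a.application
```

Under this rule, meme-factory normalizes to the same string as MemeFactory, so the bucket would be listed with MemeFactory as the suggested value. Treat the suggestion as a prompt for review rather than an automatic fix.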

We can also ensure that applications and cost-center values align:

select
  r.arn,
  r.title,
  r.tags ->> 'Application' as Application,
  r.tags ->> 'CostCenter' as CostCenter,
  a.costcenter as CorrectCostCenter
from
  aws_tagging_resource as r
  left join fooli_apm.fooli_apps as a
    on r.tags ->> 'Application' = a.application
where
  r.arn like 'arn:aws:s3:%'
  and a.costcenter != r.tags ->> 'CostCenter'

+--------------------------+-------------+-------------+------------+-------------------+
| arn                      | title       | application | costcenter | correctcostcenter |
+--------------------------+-------------+-------------+------------+-------------------+
| arn:aws:s3:::fooli-test1 | fooli-test1 | MemeFactory | 98765      | 24601             |
+--------------------------+-------------+-------------+------------+-------------------+

Tagging Strategy or Tagging Strategies?

If you've spent time at a large organization, you know that mergers and acquisitions are a way of life. When one company acquires another, the chances of the acquired company's tagging strategy matching that of its new corporate overlord are pretty much nil. The same problem arises in a merger. If a larger company buys a smaller one, the odds are the combined entity will adopt the larger company's tagging standard. But in a merger, two companies of the same size have to reconcile their tagging strategies.

Can an enterprise maintain multiple tagging strategies? In most cases, the tags serve similar purposes. Application vs. app. TechContact vs. CreatedBy. Owner vs. BusinessOwner. Maybe one company prefers CamelCase while the other wants snake_case.
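
To see which variants are actually in use, it helps to inventory the tag keys themselves. The following is a sketch that lists every tag key and how many resources carry it, which makes pairs such as app and Application easy to spot. It assumes the tags column is a jsonb object, as the other queries in this article do.

```sql
select
  k.tag_key,
  count(*) as resource_count
from
  aws_tagging_resource as r
  cross join lateral jsonb_object_keys(
    -- Treat missing or non-object tags as an empty object to avoid errors.
    case when jsonb_typeof(r.tags) = 'object' then r.tags else '{}'::jsonb end
  ) as k(tag_key)
group by
  k.tag_key
order by
  resource_count desc,
  k.tag_key
```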

Should the business priority be to re-deploy all the resources to meet the new tagging standard, or should the newly combined entity adopt a tagging Rosetta Stone? Can a company use any of app, application, or Application?

This query uses the COALESCE function to select a tag based on a prioritized list of possible tag keys.

select
  arn,
  tags ->> 'Name' as Name,
  COALESCE(
    tags ->> 'ExecutiveOwner',
    tags ->> 'executive_contact',
    tags ->> 'Owner'
  ) as BusinessOwner,
  COALESCE(
    tags ->> 'TechnicalContact',
    tags ->> 'tech_contact',
    tags ->> 'created_by'
  ) as TechnicalContact,
  COALESCE(
    tags ->> 'Application',
    tags ->> 'application',
    tags ->> 'app'
  ) as Application,
  COALESCE(
    tags ->> 'CostCenter',
    tags ->> 'cost_center',
    tags ->> 'billing_code'
  ) as CostCenter,
  _ctx ->> 'connection_name' as AccountName
from
  aws_tagging_resource;
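
Once the keys are mapped, compliance checks can run against the normalized names rather than against any one standard. The following is a sketch, reusing the owner keys from the query above, that lists resources with no business owner under any of the accepted keys.

```sql
with normalized as (
  select
    arn,
    COALESCE(
      tags ->> 'ExecutiveOwner',
      tags ->> 'executive_contact',
      tags ->> 'Owner'
    ) as business_owner
  from
    aws_tagging_resource
)
select
  arn
from
  normalized
where
  -- Null means none of the accepted owner keys is present.
  business_owner is null;
```

The same pattern applies to the other normalized columns, so conformance and accuracy checks only need to be written once.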

Take back your tagging

Often organizations change as frequently as their cloud resources. Tagging standards come and go based on business needs. Important contacts move on or are reassigned. Regularly reviewing your tag keys and values is one way to tame the endless frustration of tagging. Pulling this data and making it available to management or finance is just one way Steampipe can help.