Case Study

KPIs as code: How CMD Solutions built tools for continuous controls assurance

The magic ingredient is Steampipe's ability to define information security performance metrics as SQL statements.

Jon Udell
7 min. read - Nov 30, 2022
The magic ingredient is Steampipe's ability to define information security performance metrics as SQL statements.

Ciaran Finnegan is the cybersecurity practice lead at CMD Solutions Australia and Phil Massyn is a senior security consultant there. Ciaran's help with testing, and his feedback, were key to the development of the Crowdstrike plugin. We circled back to find out what CMD Solutions are using the plugin for, and how Steampipe fits into their overall strategy.

About a year ago they began using Steampipe to scan their customers' AWS environments. "It's amazing for that," Ciaran says, "a lot better than other tools we've used." They've done 20 such engagements, and Steampipe's adoption within CMD Solutions (and its sister Mantel Group brands) is growing: "People are using it for everything because it's like a Swiss army knife, you can do lots of different things with that SQL interface."

Now Ciaran and Phil are building an internal system for what they call continuous controls assurance. Another way to say it might be KPIs as code. Here's an example of a KPI (key performance indicator):

Critical or high severity vulnerabilities are remediated within the organization's policy timeframe.

How do you translate that objective into code? With Steampipe, you do it by writing SQL queries that can join across the diverse APIs that your software stack exposes. In this case that means querying an endpoint management system, CrowdStrike, then joining with information from a workforce management system, Salesforce -- with the understanding that either or both of these may change -- to produce query results that map from a vulnerability to a device to a person.

Here's the query.

SELECT
ZTA.system_serial_number || ' (' || salesforce_krow__project_resources__c.name || ')' as resource,
CASE
WHEN ZTA.assessment ->> 'os' = '100' THEN 'ok'
ELSE 'alarm'
END AS status,
ZTA.system_serial_number || ' (' || salesforce_krow__project_resources__c.name || ' has a score of ' || (ZTA.assessment ->> 'os') as reason,
jsonb_path_query_array(ZTA.assessment_items['os_signals'], '$[*] ? (@.meets_criteria != "yes").criteria') #>> '{}' as detail
FROM
crowdstrike_zta_assessment ZTA
-- Link the serial number to the Salesforce data, so we can find the owner
-- LEFT JOIN is important, in case there isn't a link, we still want to see the data
LEFT JOIN salesforce_fixed_asset__c
ON ZTA.system_serial_number = serial_number__c
-- Here an INNER JOIN is necessary. If the serial number exists in Krow, but no owner, that could indicate a
-- a data inconsistency in Krow, which will break the query. We want an INNER JOIN, because both entries must exist
INNER JOIN salesforce_krow__project_resources__c
ON salesforce_fixed_asset__c.project_resource__c = salesforce_krow__project_resources__c.id

The tables in play are provided by the CrowdStrike and Salesforce plugins. None of the predefined Salesforce tables would have met the need, but that didn't matter because CMD Solutions were using their own custom Salesforce objects, and because the Salesforce plugin can dynamically acquire custom objects.

You can run the query in any of the ways Steampipe queries run: with the Steampipe CLI, with psql (or any Postgres CLI), with Metabase (or any Postgres-compatible BI tool), with Python (or any programming language). Or, as CMD Solutions have done, you can wrap a query in a Steampipe control that forms part of a benchmark that runs on the command line with steampipe check, or as a dashboard with steampipe dashboard.

From queries to controls and benchmarks

Here's the control that packages the query. It's just a thin wrapper that names and defines a KPI.

control "SEC_002" {
title = "SEC-002 - % of in-scope personnel compute devices with a Crowdstrike Agent Zero Trust Score for OS of 100"
sql = <<EOT
-- SQL as above
EOT
}

The control rolls up into a benchmark.

benchmark "sec" {
title = "Security"
children = [
...
control.SEC_002
...
]
}

So you can run SEC_002 individually: steampipe check control.SEC_002. Or you can run all the controls in the benchmark: steampipe check benchmark.sec. Results can flow out in a variety of formats for downstream analysis.

But first, where and how to run steampipe check in a scheduled manner? From their documentation:

steampipe-scheduled-job-runner

Run scheduled Steampipe benchmark checks securely and inexpensively on AWS using ECS Fargate. We use AWS Copilot to define Step Functions and AWS ECS Fargate scheduled jobs to run Steampipe checks in Docker. Steampipe benchmarks and controls are retrieved at run-time from a git respository to support a GitOps workflow

The job runs every night, pulls down queries from a repo, executes those against targets, and exports the outputs to S3 -- as Markdown, and as JSON that's condensed by a custom template.

Checking DMARC configuration

Here's another KPI:

All organizational email domains are configured for DMARC

And here's the corresponding query, again wrapped in a control.

control "INF_001" {
title = "INF-001 - Organisational email domains without DMARC configured"
description = "Protect against spoofing & phishing, and help prevent messages from being marked as spam. See https://support.google.com/a/answer/2466563?hl=en for more details."
sql = <<EOT
WITH ASSET_LIST as (
SELECT
D.domain,
concat('_dmarc.',D.domain) as dmarc,
COUNT(N.*) as MXCount
FROM
csv.domains D
LEFT JOIN net_dns_record N on N.domain = D.domain and N.type = 'MX'
GROUP BY
D.domain,
concat('_dmarc.',D.domain)
)
SELECT
A.domain as resource,
CASE
WHEN A.MXCount = 0 then 'skip'
WHEN N.value LIKE '%p=reject;%' THEN 'ok'
WHEN N.value LIKE '%p=quarantine;%' THEN 'ok'
ELSE 'alarm'
END as status,
CASE
WHEN A.MXCount = 0 then 'No MX record for domain ' || A.domain
WHEN N.value LIKE '%p=reject;%' THEN 'Domain ' || A.domain || ' has a reject policy.'
WHEN N.value LIKE '%p=quarantine;%' THEN 'Domain ' || A.domain || ' has a quarantine policy. Consider making it reject.'
WHEN N.value IS NULL THEN 'Domain ' || A.domain || ' has no DMARC policy defined.'
WHEN N.value LIKE '%p=none;%' THEN 'Domain ' || A.domain || ' has a dmarc policy of none.'
ELSE 'Domain ' || A.domain || ' has no DMARC policy'
END as reason,
A.domain as domain
FROM
ASSET_LIST A
LEFT JOIN net_dns_record N on N.domain = A.dmarc and N.type = 'TXT' and N.value like 'v=DMARC1%'
EOT
}

The tables here come from the CSV and Net plugins. Like Salesforce, the CSV plugin acquires tables dynamically. In this case the list of domains to check lives in a file called domains.csv retrieved from a domain name system management API. The domain names drive a join with the net_dns_record table to figure out, from MX records, which names are configured for DMARC.

Like all Steampipe controls, these report the required columns resource, status, and reason. It's purely a convention, you can write all kinds of queries against plugin-provided tables, but when you follow this convention your queries play in Steampipe's benchmark and dashboard ecosystem.

Checking for inactive user accounts

It's true that joining across APIs -- with SQL as the common way to reason over them -- is Steampipe's ultimate superpower. But you don't have to join across APIs. Many useful controls query one or several tables provided by a single plugin.

Here's one more KPI:

Inactive Okta accounts are reviewed within the organization's policy timeframes

Here's the corresponding control.

control "IAM_001" {
title = "IAM-001 - Dormant Okta accounts are accounts that have not logged on in the last 30 days"
sql = <<EOT
SELECT
U.email as resource,
CASE
WHEN U.status <> 'ACTIVE' THEN 'skip'
WHEN date_part('day', CURRENT_TIMESTAMP - U.activated) < 30 OR date_part('day', CURRENT_TIMESTAMP - U.last_login) < 30 THEN 'ok'
ELSE 'alarm'
END as status,
CASE
WHEN U.status <> 'ACTIVE' THEN 'User ' || u.email || ' is no longer active'
WHEN U.last_login is null THEN 'User ' || u.email || ' has never logged on'
WHEN date_part('day', CURRENT_TIMESTAMP - U.activated) < 30 OR date_part('day', CURRENT_TIMESTAMP - U.last_login) < 30 THEN 'Last logon was on ' || U.last_login
ELSE 'User ' || u.email || ' last logon on ' || U.last_login
END as reason,
U.email,
U.last_login
FROM
okta_user U
EOT
}

Controls like this express business logic in a clear and readable way, and require only modest SQL skill.

Next steps

As daily snapshots accumulate, Ciaran and Phil are exploring ways to visualize them and identify trends and key risk indicators (KRIs). A Python script reads the customized steampipe check output and builds JSON and Markdown outputs that flow to S3. They've built a prototype Steampipe dashboard to visualize queries, and considering how a visualization tool might help complete the picture.

Why do all this? "There are products on the market we could buy," Ciaran says, "but they don't integrate with all our services, and don't give us the granular mapping from business objectives to SQL statements. That's the magic of Steampipe for us."

We're delighted to see CMD Solutions make such excellent use of Steampipe! For more details, see the repos for their Fargate runner and their continuous controls assurance module. If you have a similar story to tell, please get in touch. We're always eager to know how people are using Steampipe.