The Steampipe project has spawned a growing collection of open-source repositories. Currently our hub lists 136 plugins and 42 mods. We also manage contributions to the Steampipe core and the plugin SDK. At this scale and pace it's critical to standardize the configuration of our repos, and GitHub's native tooling doesn't make that easy. Equally critical: we need to track PRs and stale issues across all the repos as the number of contributions grows over time.
Happily the project has also spawned the tools we need to manage all these repos. The GitHub plugin enables queries that drive controls, benchmarks, and reports. Pipes enables us to run these on a schedule, cut snapshots to a shared team workspace, and notify the team when new snapshots appear.
The custom mods that package these queries are available for you to reuse and remix, or you can just take these ideas and run with them. Let's see how it all works.
Checking repo settings
For plugins, our checks include these conditions:
The plugin has a standard description
It uses the correct mandatory topics
It uses semantic versioning
Here, for example, is a control that checks for a standard description.
control "plugin_repo_description" {title = "Plugin repo has standard description"sql = <<-EOT(selecturl as resource,casewhen description like 'Use SQL to instantly query %. Open source CLI. No DB required.' then 'ok'else 'alarm'end as status,name_with_owner || ': ' || description as reason,name_with_ownerfromgithub_my_repositorywherename_with_owner like 'turbot/steampipe-plugin-%')union(selecturl as resource,casewhen description like 'Use SQL to instantly query %. Open source CLI. No DB required.' then 'ok'else 'alarm'end as status,name_with_owner || ': ' || description as reason,name_with_ownerfromgithub_repositorywherefull_name in (select jsonb_array_elements_text(to_jsonb($1::text[]))))EOTparam "github_external_repo_names" {description = "External repo names."default = var.github_external_repo_names}}
As with all controls, the query that drives this one reports the three required columns resource
, status
, and reason
. It's a UNION of two subqueries because the repos published to the hub are a mix of some in the turbot
GitHub organization and others maintained elsewhere by project contributors. So the first subquery uses github_my_repository to check plugin repos belonging to turbot
and the second uses github_repository to check repos belonging to contributors. The contributed repos are defined by a mod variable that's passed as a parameter to the query and instantiated as $1
in the query.
variable "github_external_repo_names" {type = list(string)description = "A list of community repositories to run checks for."default = [ "francois2metz/steampipe-plugin-ovh", "theapsgroup/steampipe-plugin-vault", "francois2metz/steampipe-plugin-scalingo", "francois2metz/steampipe-plugin-gandi", "francois2metz/steampipe-plugin-airtable", "theapsgroup/steampipe-plugin-gitlab", "ellisvalentiner/steampipe-plugin-confluence", "theapsgroup/steampipe-plugin-keycloak" ]}
Partial output from the query looks like this.
https://github.com/turbot/steampipe-plugin-slack | ok | turbot/steampipe-plugin-slack: Use SQL to instantly query users, channels, emoji and more from your Slack workspace. Open source CLI. No DB required.https://github.com/theapsgroup/steampipe-plugin-vault | ok | theapsgroup/steampipe-plugin-vault: Use SQL to instantly query Hashicorp Vault secrets, certs and more. Open source CLI. No DB required. >
Note that standard descriptions aren't all identical! The control enforces a common pattern but each plugin describes its query targets in its own way.
You can run the control individually in your console like so:
steampipe check control.plugin_repo_description
More typically, you'll want to package sets of control as benchmarks.
benchmark "plugin" {title = "Plugin Repository Checks"children = [control.plugin_repo_description,control.plugin_repo_has_mandatory_topics,control.plugin_uses_semantic_versioning,]}
There's another benchmark for mods, which checks for monotonic versioning and a different set of mandatory tags. And there's a common benchmark as well.
benchmark "common" {title = "Common Repository Checks"children = [control.license_is_apache,control.repo_homepage_links_to_hub,control.repo_wiki_disabled,control.repo_projects_disabled,control.repo_is_public,]}
You can run steampipe dashboard
and visit localhost:9194
to view benchmarks and their controls in your browser. But of course you'll want to share such results with your team. So we run the benchmarks and controls in a Pipes pipeline, capture results as snapshots, and notify a Slack channel when each snapshot completes.
Tracking open PRs and stale issues
Here's our dashboard report for open pull requests raised against plugins and mods.
There are more PRs than these, but this report focuses our attention on those from contributors outside the turbot
organization. Here's the query that finds that set of PRs. It searches for open PRs in org:turbot
excluding those from members of the organization. The GitHub plugin makes this easy to do with nothing more than basic SQL.
selectrepository_full_name as "Repository",title as "Pull Request",now()::date - created_at::date as "Age in Days",now()::date - updated_at::date as "Last Updated (Days)",author ->> 'login' as "Author",urlfromgithub_search_pull_requestwherequery = 'org:turbot is:open'and repository_full_name ~ 'turbot/steampipe-(plugin|mod)'and repository_full_name <> 'turbot/steampipe-plugin-sdk'and author ->> 'login' not in (selectm.login as member_loginfromgithub_organization_member mwherem.organization = 'turbot')order by"Age in Days" desc;
Naturally we want to see trends as well. For that we take advantage of the fact that these dashboards, running in a Pipes pipeline, produce daily snapshots. The data contained in those snapshots is accessible by way of the Turbot Pipes plugin. We use it to query the data in those snapshots and build this dashboard.
Here's the query that drives the Open Plugin and Mod Pull Requests
chart.
selectcreated_at as "Date",sum((r ->> 'Age in Days')::numeric) as "Total Days"frompipes_workspace_snapshot,jsonb_array_elements(data -> 'panels' -> 'github_tracker.table.container_dashboard_github_open_plugin_mod_pull_request_report_anonymous_container_0_anonymous_table_0' -> 'data' -> 'rows') as rwheredashboard_name = 'github_tracker.dashboard.github_open_plugin_mod_pull_request_report'group bycreated_atorder bycreated_at
The query finds all the snapshots for a dashboard, reaches into each snapshot's JSON, grabs the rows for a panel, then reports dates and the sum of Age in Days
for each date.
The dashboard that contains these charts, as well as reports like the one shown above, runs on a schedule in a Pipes pipeline.
These dashboards, packaged as mods that we install into a team workspace, have helped us focus and prioritize. And the charts show us that were making good progress!
Mix, stir, and serve hot
The ingredients of this solution include: the GitHub plugin (now even more awesome); the Pipes plugin; the plugin management, SQL, control execution, and dashboard services provided by Steampipe; and the snapshots, pipelines, notifications provided in a team workspace by Pipes. It's surprisingly easy to stir all these ingredients together, bind them with HCL and SQL, and deliver the views you need as steward of a growing community of open-source contributors. They're friendly folks, and you can meet them in our Slack community.