v0.17.0: Snapshots, share with cloud, workspace profiles →
How To

Steampipe as software component

Why, and how, to call Steampipe from any programming language.

Jon Udell
5 min. read - October 27, 2022

For Hacktoberfest we needed to add the tag hacktoberfest to 135 repos. The tool we wrote to do that uses Steampipe to build a list of the names of those repos, then uses the native Github API (by way of Go's github.com/google/go-github) to add a topic to each repo.

Why use Steampipe? Obviously we're biased, but it wasn't a gratuitous choice. As we'll see, Steampipe isn't just a convenient way to query an API with SQL. Steampipe plugins also represent APIs in ways that dramatically simplify their use.

Our starting point was an existing tool, hacktoberfest-repo-topic-apply, which showed us how to create a native GitHub client in Go and use it to assign topics to repos. We could have adapted that tool's native-API-based method of listing repositories, then filtered the results to just those matching the pattern turbot/steampipe-(plugin|mod). With Steampipe that's a trivial task.

select
count(*)
from
github_my_repository
448
select
count(*)
from
github_my_repository
where
full_name ~ 'turbot'
186
select
count(*)
from
github_my_repository
where
full_name ~ 'turbot/steampipe-(plugin|mod)'
135
select
full_name
from
github_my_repository
where
full_name ~ 'turbot/steampipe-(plugin|mod)'
order by
full_name
turbot/steampipe-mod-alicloud-compliance
turbot/steampipe-mod-alicloud-insights
turbot/steampipe-mod-alicloud-thrifty
...

It's harder to do this with the native APIs which handle personal and organizational repos differently. Steampipe hides that underlying complexity to make things easy.

Tables that consolidate views of resources from the perspective of an authenticated user

Note that we're using the table github_my_repository. Per the documentation:

You can own repositories individually, or you can share ownership of repositories with other people in an organization. The `github_my_repository` table will list repos you own, you collaborate on, or that belong to your organizations.

To query ANY repository, including public repos, use the github_repository table.

This is a powerful pattern that's also implemented elsewhere. The Microsoft 365 plugin provides both microsoft_my_mail_message and microsoft_mail_message; the Google Workspace plugin provides googleworkspace_my_gmail_message and googleworkspace_gmail; other plugins work the same way. Where possible, Steampipe consolidates views of resources from the perspective of an authenticated user.

As an authenticated GitHub user, my API token grants me access to my own personal repos as well as those of a half-dozen GitHub organizations I belong to. A native API query across all of them, using github.com/google/go-github, would require me to use Repositories.List (configured by RepositoriesListOptions) as well as Repositories.ListByOrg (configured by RepositoriesListByOrgOptions, then combine the results. That's doable but not trivial. If you just want a list of names, you'd rather not have to learn those details. Steampipe's github_my_repository hides them so you can focus on the job you're doing with those API results.

Using Steampipe in a Go program

We could have written our tool in any language but, since we'd started with hacktoberfest-repo-topic-apply, we decided to continue with Go. At its core, the tool loops through an array of repo names and calls a function to assign a topic to each repo.

for i, repoName := range repos {
addTopic(ctx, client, ownerLogin, repoName, topic)
}

Our tool (see steampipe-samples) includes two functions that return a list of names as a comma-separated string; the tool splits that string to produce the array.

names = queryPostgresForRepos()
// names = querySpcForRepos()
var repos = strings.Split(names, ",")

The two functions use the same SQL query, and they produce the same list of names. We show both to illustrate two complementary ways you can use Steampipe in a program that's written in Go -- or in any other language such as Python, JavaScript, etc.

  1. Use the language's database client to query Postgres in Steampipe CLI (running in service mode or Steampipe Cloud).

  2. Use the language's HTTP client to ship your query to Steampipe Cloud and receive a JSON response.

Here's the SQL query that's common to both functions.

var query = `
with names as (
select
replace(full_name, 'turbot/', '') as name -- replace with your ownerLogin
from
github_my_repository
where
full_name ~ 'turbot/steampipe-(plugin|mod)' -- replace with your ownerLogin and a pattern
order by
full_name
)
select
array_to_string(array_agg(name), ',') as names
from
names

If we run the query in the Steampipe console, the output looks like this.

steampipe-mod-alicloud-compliance,steampipe-mod-alicloud-insights,...

Let's look at how the two functions acquire that output.

Using a database connection

For this approach, we used Go's dabase/sql package which (we learned) also requires that you import github.com/lib/pq for Postgres. In this example we're querying a local instance of Steampipe but could as easily query an instance running in a Steampipe Cloud workspace.

func queryPostgresForRepos() string {
host := "localhost"
port := 9193
user := "steampipe"
password := os.Getenv("STEAMPIPE_LOCAL_PASSWORD")
dbname := "steampipe"
psqlconn := fmt.Sprintf("host=%s port=%d user=%s password=%s dbname=%s", host, port, user, password, dbname)
db, err := sql.Open("postgres", psqlconn)
CheckError(err)
defer db.Close()
rows, _ := db.Query(query)
var names string
for rows.Next() {
err = rows.Scan(&names)
CheckError(err)
}
return names
}

If you're running locally, you'll need to start the Steampipe service (steampipe service start) before running the program that calls this function.

Using a REST call

Here's the second method, using Steampipe Cloud's query API.

func querySpcForRepos() string {
type Result struct {
Items []struct {
Names string `json:"names"`
} `json:"items"`
}
uri := fmt.Sprintf("https://cloud.steampipe.io/api/latest/org/acme/workspace/jon/query?sql=%s", url.PathEscape(query)) //-- use your handle and workspace
req, err := http.NewRequest("GET", uri, nil)
CheckError(err)
req.Header.Add("Authorization", fmt.Sprintf("Bearer %s", os.Getenv("STEAMPIPE_CLOUD_TOKEN")))
resp, err := http.DefaultClient.Do(req)
CheckError(err)
defer resp.Body.Close()
decoder := json.NewDecoder(resp.Body)
var result Result
err = decoder.Decode(&result)
CheckError(err)
return result.Items[0].Names
}

Go's static typing requires us to define a struct into which to decode the JSON response. In a dynamically-typed language like Python that wouldn't be necessary.

The REST call uses a bearer token that's available in your Steampipe Cloud account -- for me, at https://cloud.steampipe.io/user/judell/setting/token.

If we use curl to make that REST call:

curl -H "Authorization: Bearer ${STEAMPIPE_CLOUD_TOKEN}" \
-d sql="with names as ( select replace(full_name, 'turbot/', '') as name from github_my_repository where full_name ~ 'turbot/steampipe-(plugin|mod)' order by full_name ) select array_to_string(array_agg(name), ',') as names from names" \
https://cloud.steampipe.io/api/latest/org/acme/workspace/jon/query

Here's the result:

{"items":[{"names":"steampipe-mod-alicloud-compliance,steampipe-mod-alicloud-insights, ... steampipe-plugin-zendesk,steampipe-plugin-zoom"}]}

The function decodes that into the Result struct and returns the Names property of the zeroth element of Items.

Steampipe as a software component

Steampipe is a tool that runs queries and benchmarks on the command line, powers dashboards locally or in the cloud, and embeds into CI/CD pipelines. It's also a software component that you can call from any programming language.

Why do that? As we've seen, Steampipe's plugins can abstract native APIs in a way that makes them simpler to use than the constructs provided by SDKs like github.com/google/go-github. The Steampipe system doesn't just provide a SQL interface to an API, it organizes that interface to make you maximally productive. And it does that using patterns, like the my pattern, that you can learn once and then apply across many APIs.

If you decide that Steampipe is the right tool for the job, which of the two methods we've shown should you use? It's your choice! If you're more comfortable using a database connection, do that. If you'd rather use the Steampipe Cloud API, do that. Either way, you'll get the job done more easily than if you had to use native APIs directly. Give it a try, and let us know how it goes!