For Hacktoberfest we needed to add the tag hacktoberfest
to 135 repos. The tool we wrote to do that uses Steampipe to build a list of the names of those repos, then uses the native Github API (by way of Go's github.com/google/go-github) to add a topic to each repo.
Why use Steampipe? Obviously we're biased, but it wasn't a gratuitous choice. As we'll see, Steampipe isn't just a convenient way to query an API with SQL. Steampipe plugins also represent APIs in ways that dramatically simplify their use.
Our starting point was an existing tool, hacktoberfest-repo-topic-apply, which showed us how to create a native GitHub client in Go and use it to assign topics to repos. We could have adapted that tool's native-API-based method of listing repositories, then filtered the results to just those matching the pattern turbot/steampipe-(plugin|mod)
. With Steampipe that's a trivial task.
selectcount(*)fromgithub_my_repository448selectcount(*)fromgithub_my_repositorywherefull_name ~ 'turbot'186selectcount(*)fromgithub_my_repositorywherefull_name ~ 'turbot/steampipe-(plugin|mod)'135selectfull_namefromgithub_my_repositorywherefull_name ~ 'turbot/steampipe-(plugin|mod)'order byfull_nameturbot/steampipe-mod-alicloud-complianceturbot/steampipe-mod-alicloud-insightsturbot/steampipe-mod-alicloud-thrifty...
It's harder to do this with the native APIs which handle personal and organizational repos differently. Steampipe hides that underlying complexity to make things easy.
Tables that consolidate views of resources from the perspective of an authenticated user
Note that we're using the table github_my_repository. Per the documentation:
You can own repositories individually, or you can share ownership of repositories with other people in an organization. The `github_my_repository` table will list repos you own, you collaborate on, or that belong to your organizations.To query ANY repository, including public repos, use the
github_repository
table.
This is a powerful pattern that's also implemented elsewhere. The Microsoft 365 plugin provides both microsoft_my_mail_message
and microsoft_mail_message
; the Google Workspace plugin provides googleworkspace_my_gmail_message
and googleworkspace_gmail
; other plugins work the same way. Where possible, Steampipe consolidates views of resources from the perspective of an authenticated user.
As an authenticated GitHub user, my API token grants me access to my own personal repos as well as those of a half-dozen GitHub organizations I belong to. A native API query across all of them, using github.com/google/go-github
, would require me to use Repositories.List
(configured by RepositoriesListOptions
) as well as Repositories.ListByOrg
(configured by RepositoriesListByOrgOptions
, then combine the results. That's doable but not trivial. If you just want a list of names, you'd rather not have to learn those details. Steampipe's github_my_repository
hides them so you can focus on the job you're doing with those API results.
Using Steampipe in a Go program
We could have written our tool in any language but, since we'd started with hacktoberfest-repo-topic-apply
, we decided to continue with Go. At its core, the tool loops through an array of repo names and calls a function to assign a topic to each repo.
for i, repoName := range repos {addTopic(ctx, client, ownerLogin, repoName, topic)}
Our tool (see steampipe-samples) includes two functions that return a list of names as a comma-separated string; the tool splits that string to produce the array.
names = queryPostgresForRepos()// names = querySpcForRepos()var repos = strings.Split(names, ",")
The two functions use the same SQL query, and they produce the same list of names. We show both to illustrate two complementary ways you can use Steampipe in a program that's written in Go -- or in any other language such as Python, JavaScript, etc.
Use the language's database client to query Postgres in Steampipe CLI (running in service mode or Steampipe Cloud (now called Turbot Pipes).
Use the language's HTTP client to ship your query to Turbot Pipes and receive a JSON response.
Here's the SQL query that's common to both functions.
var query = `with names as (selectreplace(full_name, 'turbot/', '') as name -- replace with your ownerLoginfromgithub_my_repositorywherefull_name ~ 'turbot/steampipe-(plugin|mod)' -- replace with your ownerLogin and a patternorder byfull_name)selectarray_to_string(array_agg(name), ',') as namesfromnames
If we run the query in the Steampipe console, the output looks like this.
steampipe-mod-alicloud-compliance,steampipe-mod-alicloud-insights,...
Let's look at how the two functions acquire that output.
Using a database connection
For this approach, we used Go's database/sql
package which (we learned) also requires that you import github.com/lib/pq
for Postgres. In this example we're querying a local instance of Steampipe but could as easily query an instance running in a Turbot Pipes workspace.
func queryPostgresForRepos() string {connStr := "postgresql://judell:f3**-****-**2c@acme-jon.usea1.db.pipes.turbot.com:9193/dea4px"db, err := sql.Open("postgres", psqlconn)CheckError(err)rows, _ := db.Query(query)defer rows.Close()var names stringfor rows.Next() {err = rows.Scan(&names)CheckError(err)}return names}
If you're running locally, you'll need to start the Steampipe service (steampipe service start
) before running the program that calls this function.
Using a REST call
Here's the second method, using Turbot Pipes' query API.
func querySpcForRepos() string {type Result struct {Items []struct {Names string `json:"names"`} `json:"items"`}uri := fmt.Sprintf("https://pipes.turbot.com/api/latest/org/acme/workspace/jon/query?sql=%s", url.PathEscape(query)) //-- use your handle and workspacereq, err := http.NewRequest("GET", uri, nil)CheckError(err)req.Header.Add("Authorization", fmt.Sprintf("Bearer %s", os.Getenv("PIPES_TOKEN")))resp, err := http.DefaultClient.Do(req)CheckError(err)defer resp.Body.Close()decoder := json.NewDecoder(resp.Body)var result Resulterr = decoder.Decode(&result)CheckError(err)return result.Items[0].Names}
Go's static typing requires us to define a struct into which to decode the JSON response. In a dynamically-typed language like Python that wouldn't be necessary.
The REST call uses a bearer token that's available in your Turbot Pipes profile -- e.g https://pipes.turbot.com/user/judell/setting/token
.
If we use curl
to make that REST call:
curl -H "Authorization: Bearer ${PIPES_TOKEN}" \-d sql="with names as ( select replace(name_with_owner, 'turbot/', '') as name from github_my_repository where name_with_owner ~ 'turbot/steampipe-(plugin|mod)' order by name_with_owner ) select array_to_string(array_agg(name), ',') as names from names" \https://pipes.turbot.com/api/latest/org/acme/workspace/jon/query
Here's the result:
{"items":[{"names":"steampipe-mod-alicloud-compliance,steampipe-mod-alicloud-insights, ... steampipe-plugin-zendesk,steampipe-plugin-zoom"}]}
The function decodes that into the Result
struct and returns the Names
property of the zeroth element of Items
.
Steampipe as a software component
Steampipe is a tool that runs queries and benchmarks on the command line, powers dashboards locally or in the cloud, and embeds into CI/CD pipelines. It's also a software component that you can call from any programming language.
Why do that? As we've seen, Steampipe's plugins can abstract native APIs in a way that makes them simpler to use than the constructs provided by SDKs like github.com/google/go-github
. The Steampipe system doesn't just provide a SQL interface to an API, it organizes that interface to make you maximally productive. And it does that using patterns, like the my
pattern, that you can learn once and then apply across many APIs.
If you decide that Steampipe is the right tool for the job, which of the two methods we've shown should you use? It's your choice! If you're more comfortable using a database connection, do that. If you'd rather use the Turbot Pipes API, do that. Either way, you'll get the job done more easily than if you had to use native APIs directly. Give it a try, and let us know how it goes!