Using search_path to target connections and aggregators
You are probably here for one of the following reasons:
- You can't figure out why Steampipe isn't using your aggregator
- You want to run
steampipe query
or Powerpipe commands against a specific connection - You want to change your default connection
- You've seen references to the search path elsewhere, but you're not sure why it's important
- You asked what you thought was a simple question on the Steampipe Slack, and instead of an answer they sent you this link (ugh...homework...)
This guide will attempt to answer these questions in 5 minutes or less.
Schemas in Postgres
Steampipe leverages PostgreSQL foreign data wrappers to provide a SQL interface to external services and systems. The Steampipe database is an embedded PostgreSQL database.
A PostgreSQL database contains one or more schemas. A schema is a namespaced collection of named objects, like tables, functions, and views. Steampipe creates a Postgres schema for each Steampipe connection. In fact, if you query the Postgres information schema, you can get a list of the schemas in the database:
selectschema_namefrominformation_schema.schemataorder byschema_name;
Note that the schema names match your Steampipe connection names:
.inspect
The schemas, in turn, contain the foreign tables that you write queries against. Again, you can see this in the information schema:
selectforeign_table_schema,foreign_table_namefrominformation_schema.foreign_tableswhereforeign_table_schema = 'aws'
Or more simply, using the steampipe .inspect
command:
.inspect aws
In Steampipe, a plugin defines and implements a set of related foreign tables. All connections for a given plugin will contain the same set of tables.
Within a schema, table names must be unique, however the same table name can be used in different schemas. You can reference tables using a qualified name to disambiguate. A qualified name consists of the schema name and the object name, separated by a period. For example, to query the aws_account
table in the aws_prod
schema (which corresponds to the aws_prod
connection) you can refer to it as aws_prod.aws_account
:
select*fromaws_prod.aws_account
Unqualified Success
Postgres also allows you to use unqualified names:
select*fromaws_account
Note that the aws_account
table is specified, but the schema is not. If you have the same table name in multiple schemas, how does Postgres determine which table to use? As you probably guessed, this is where the schema search path comes in. The search path allows you to specify a list of schemas to be searched for the object. The first schema in the list that contains an object that matches the name will be used.
For example, assume that that search path is set to gcp_prod, azure_prod, aws_prod, aws_test
, and you run select * from aws_account
.
- Postgres will look in the
gcp_prod
schema for a table namedaws_account
, but it does not exist so it continues to the next schema in the list - Postgres will look in the
azure_prod
schema for a table namedaws_account
, but it does not exist so it continues to the next schema in the list - Postgres will look in the
aws_prod
schema for a table namedaws_account
. It finds theaws_account
table, so it runs the query against theaws_prod.aws_account
table.
Queries in Powerpipe Mods for Steampipe are written using unqualified names. This allows you to run the exact same queries, dashboards, and benchmarks against any connection, just by changing the search path!
Setting the Search Path
By default, Steampipe sets the schema search path as follows:
- The
public
schema first. This schema is writable, and allows you to create your own objects (views, tables, functions, etc). - Connection schemas, in alphabetical order by default.
- The
internal
schema last. This schema contains Steampipe built-in functions and other internal Steampipe objects. This schema is not displayed or managed by the Steampipe search path commands and options, but you'll see it in native SQL commands such asshow search_path
.
Since the connection schemas are added to the search_path alphabetically by default, the simplest way to set the default is to rename the connections. For example, let's assume that I have 3 AWS accounts and an aggregator, and I want the aggregator to be the first in the search path. I could name them as follows:
aws_prod
- Production AWS accountaws_qa
- QA AWS accountaws_dev
- Development AWS accountaws
- an aggregator of all 3 of the above AWS connections
Steampipe will add the aggregator before the other aws connections because aws
is first alphabetically:
> .search_path+-------------------------------------+| search_path |+-------------------------------------+| public,aws,aws_dev,aws_prod,aws_qa |+-------------------------------------+
If you prefer, you can explicitly set the search_path
in the database options in your ~/.steampipe/config/default.spc
file. Note that this is somewhat brittle because every time you install or uninstall a plugin, or add or remove a connection, you will need to update the file with the new search_path
.
Search Path Prefix
Setting the search_path
will replace the current search path. Usually, however, you will not want to replace the entire search path, but rather prefer a given connection. To simplify this case, set the search_path_prefix
. Setting the prefix will move the prefix to the front of the search path.
You can change the search path in your interactive terminal session with the search_path or search_path_prefix meta-commands. This will change the search path only for the current session.
You can also pass a search path or prefix to the steampipe query
command, as well as to Powerpipe commands (powerpipe server
, powerpipe bencmark run
,powerpipe dashboard run
, powerpipe control run
powerpipe query run
) to change the search path for that command. For instance, to run the CIS Benchmark against the aws_prod
connection, you can run.
powerpipe benchmark run benchmark.cis_v140 --search-path-prefix aws_prod
Tips & Tricks
- Manage your default search path with a good connection-naming strategy. For most users, this means aggregator first. With AWS, for example, use the plugin name as the name of the aggregator (e.g.
aws
), and as a prefix to the other connections (e.g.aws_prod
,aws_dev
, etc). With this approach the aggregator always comes first, even when adding and removing connections. - Use the search path prefix command or argument to modify the search path when you want to prefer a connection.
- When writing mods, use unqualified table names:
- Qualified names would require you to know the connection names, which you don't know (they are defined by the user).
- Users of your mod can vary the search path to target different connections
- If you create custom views or other objects, make sure you keep the
public
schema in your path. - Since the
public
schema is first (by default), you can create your own tables and views to use instead of the steampipe tables. If, for example, there is a table that you want to 'permanently' cache (or only manually refresh), you can create a materialized view with the same name:create materialized view aws_iam_credential_report as select * from aws_iam_credential_report
.