Querying JSON
Steampipe plugins call API functions, and quite often these functions return structured object data, most commonly in json or yaml. As a result, json columns are very common in steampipe. Fortunately, PostgreSQL has native support for json. Steampipe stores json columns using the jsonb datatype, and you can use the standard Postgres JSON functions and operators with Steampipe.
To return the full json column, you can simply select it like any other column:
selecttitle,policyfromaws_s3_bucket;
You can make the json more readable with jsonb_pretty
:
selecttitle,jsonb_pretty(policy)fromaws_s3_bucket;
You can extract objects from json columns using jsonb ->
operator:
selectname,acl -> 'Owner' as ownerfromaws_s3_bucket;
Alternatively you can use array-style subscripting with Steampipe 0.14 and later:
selectname,acl['Owner'] as ownerfromaws_s3_bucket;
You can extract text from json columns using jsonb ->>
operator:
selecttitle,tags ->> 'Name' as name,tags ->> 'application' as application,tags ->> 'owner' as ownerfromaws_ebs_snapshot;
Array subscripting ALWAYS returns jsonb though, so if you want text (similar to ->>
) you will have to extract it:
selecttitle,tags['Name'] #>> '{}' as namefromaws_ebs_snapshot;
You can get text from nested objects with arrow operators:
selectname,acl -> 'Owner' ->> 'ID' as acl_owner_id,acl -> 'Owner' ->> 'DisplayName' as acl_ownerfromaws_s3_bucket;
or using array subscripting:
selectname,acl['Owner']['ID'] #>> '{}' as acl_owner_id,acl['Owner']['DisplayName'] #>> '{}' as acl_ownerfromaws_s3_bucket;
or even combining array subscripting with arrow operators:
selectname,acl['Owner'] ->> 'ID' as acl_owner_id,acl['Owner'] ->> 'DisplayName' as acl_ownerfromaws_s3_bucket;
You can use jsonpath to extract or filter data if you prefer:
selectname,jsonb_path_query(acl, '$.Owner.ID') as acl_owner_id,jsonb_path_query(acl, '$.Owner.DisplayName') as acl_ownerfromaws_s3_bucket;
You can filter, sort, and group your data using the arrow operators as well:
selecttags ->> 'application' as application,count(*) as countfromaws_ebs_snapshotwheretags ->> 'application' is not nullgroup byapplicationorder byapplication asc;
You can count the number of items in a json array:
selectvpc_endpoint_id,jsonb_array_length(subnet_ids) as subnet_id_countfromaws_vpc_endpoint;
You can enumerate json arrays and extract data from each element:
selectsnapshot_id,volume_id,jsonb_array_elements(create_volume_permissions) as permfromaws.aws_ebs_snapshot;
And even extract items within nested json in the arrays:
selectsnapshot_id,volume_id,jsonb_array_elements(create_volume_permissions) ->> 'UserId' as account_idfromaws.aws_ebs_snapshot;