Enrich Splunk events with Steampipe

Splunk lookup tables can enrich AWS event data with IP-address/name mappings not available in CloudTrail. Here's how to build those tables with Steampipe.

Chris Farris
5 min. read - Oct 18, 2022

When analyzing telemetry from AWS in a security operations role, context is key. What is this instance i-7ba5bed288a? Is this random AWS-owned IP address one of mine, or does it belong to someone else? Which account is 178901234562 again? AWS doesn't provide any of this context in CloudTrail or GuardDuty.

If you use Splunk as your Security Information and Event Management (SIEM) platform, you've probably heard of lookups. Per Splunk: "Lookups enrich your event data by adding field-value combinations from lookup tables." Lookups can be a great way to improve your detections and investigations by adding attributes and key business context to your CloudTrail, GuardDuty, and VPC Flow Log data.

Steampipe can pull that context into Splunk lookup tables. In two examples, we'll use Steampipe to query AWS accounts and the public and private IP addresses of Elastic Network Interfaces (ENIs), then save the results as Splunk lookup tables. The Steampipe SQL queries and the Splunk SPL queries provided here are examples you can build upon to create your own enrichment tables.

Context enrichment with Steampipe

Steampipe is an open source project that lets you query cloud APIs with SQL through a common interface. With Steampipe and the AWS plugin installed and configured, you can run SQL queries against AWS APIs represented as database tables, and export the results to CSV files that load into Splunk as lookup tables.

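Let's start with a simple example: a list of all AWS accounts in an organization. This query (accounts.sql) pulls the twelve-digit account ID, account name, status (Active or Suspended), and four specific tags on each account. The aws_payer prefix is the name of a Steampipe connection, here presumably one configured for the organization's payer account; substitute your own connection name.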

select id, name, status,
tags ->> 'ExecutiveOwner' as Executive_Owner,
tags ->> 'TechnicalContact' as Technical_Contact,
tags ->> 'DataClassification' as Data_Classification,
tags ->> 'environment' as Environment
from aws_payer.aws_organizations_account;

Here's the command to extract this information into a CSV file.

steampipe query accounts.sql --output csv > sp_aws_accounts.csv

The resulting CSV file contains the account information from AWS Organizations, ready to be used by Splunk.

877426665359,fooli-dev,ACTIVE,Richard Hendricks,Dinesh Chugtai,Public
352894534996,fooli-security,ACTIVE,Bertram Gilfoyle,Bertram Gilfoyle,Internal
152981771857,fooli-prod,ACTIVE,Erlich Bachman,Richard Hendricks ,PersonalInformation
540147993428,fooli-payer,ACTIVE,Monica Hall,Bertram Gilfoyle,None
747037951011,fooli-memefactory,ACTIVE,Erlich Bachman,Richard Hendricks,PersonalInformation
755629548949,fooli-sandbox,ACTIVE,Erlich Bachman,Dinesh Chugtai,None
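Splunk reads the first row of a CSV lookup file as its field names, so it can be worth checking an export before loading it. The following is a minimal Python sketch, not part of the original workflow: the helper name `check_lookup_csv` and the inline sample data are assumptions made for illustration. It confirms that the expected columns are present and that the key column has no duplicate values.

```python
import csv
import io


def check_lookup_csv(text, key, required):
    """Return a list of problems found in CSV text meant for a lookup table."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    # The key column is always required, plus any other columns the caller names.
    needed = list(dict.fromkeys([key, *required]))
    problems = [f"missing column: {name}" for name in needed if name not in header]
    if key not in header:
        return problems
    seen = set()
    for row in reader:
        value = row[key]
        if value in seen:
            problems.append(f"duplicate {key}: {value}")
        seen.add(value)
    return problems


# Illustrative sample shaped like the accounts export, with an assumed header row.
sample = """id,name,status
877426665359,fooli-dev,ACTIVE
352894534996,fooli-security,ACTIVE
"""

print(check_lookup_csv(sample, key="id", required=["name", "status"]))
```

An empty list means the sample passed both checks. Duplicate keys matter because a lookup keyed on a repeated value can return more than one match for the same event.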

One of the most useful Splunk lookup tables maps from internal or external IP addresses to the cloud resources they belong to. This Steampipe query gets all of the ENIs in your environment. It joins that data with your account and VPC data to provide account and VPC names. Finally, it populates the attached_resource column with either the EC2 Instance ID or the ENI Description to tell you which resource each public or private IP address belongs to.

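select
  eni.network_interface_id,
  eni.private_ip_address as private_ip,
  eni.vpc_id as vpc_id,
  eni.region,
  eni.status,
  eni.interface_type,
  eni.association_public_ip as public_ip,
  -- Prefer the EC2 instance ID; fall back to the ENI description.
  case
    when eni.attached_instance_id is not null
      then eni.attached_instance_id
    else eni.description
  end as attached_resource,
  vpc.tags ->> 'Name' as vpc_name,
  org.name as account_name
from
  aws_ec2_network_interface as eni,
  aws_vpc as vpc,
  aws_payer.aws_organizations_account as org
where vpc.vpc_id = eni.vpc_id
  and org.id = eni.account_id;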

Run steampipe query eni.sql --output csv > sp_eni.csv to generate a CSV file like:

eni-0bb119c9b8271e13c,,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,<null>,arn:aws:ecs:us-east-1:747037951011:attachment/2d6f0e6e-4e26-4dea-ae2f-7f0eac27d471,Prod VPC,fooli-memefactory
eni-03fb47e928c58f8c6,,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,,ELB app/prod-FooliApiStack/564122fa59bf64fe,Prod VPC,fooli-memefactory
eni-019de27cb07fe61f2,,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,,i-0a2af645e985e6aed,Prod VPC,fooli-memefactory
eni-0b0a0f4a74f50324b,,vpc-0baccbf3534d6a80c,us-east-1,in-use,lambda,<null>,AWS Lambda VPC ENI-prod-FooliMailerStack-mailer-13aa6e45-264b-46cd-a91b-17cc79e7a011,Prod VPC,fooli-memefactory
eni-0d55432870ac355fc,,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,,i-0eeacb7ba5bed288a,Prod VPC,fooli-memefactory
eni-0dc96ad6321864543,,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,,RDSNetworkInterface,Prod VPC,fooli-memefactory
eni-06741026048ceb699,,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,,ELB app/prod-FooliApiStack/564122fa59bf64fe,Prod VPC,fooli-memefactory
eni-03831a874fce92713,,vpc-0baccbf3534d6a80c,us-east-1,in-use,nat_gateway,,Interface for NAT Gateway nat-0038ea0ba69861382,Prod VPC,fooli-memefactory
eni-0cb11b13ee1ded995,,vpc-0c320ebb500ab616a,us-east-1,in-use,interface,,i-08154eb5935852d50,Dev VPC,fooli-dev
eni-0265d9a496ff2b01e,,vpc-0c320ebb500ab616a,us-east-1,in-use,interface,,ELB app/dev-FooliApiStack/6fb5790bfca6ea6f,Dev VPC,fooli-dev
eni-074aac46384fe2501,,vpc-0c320ebb500ab616a,us-east-1,in-use,nat_gateway,,Interface for NAT Gateway nat-031b431e19aa13518,Dev VPC,fooli-dev
eni-0f45b8f63195340c5,,vpc-0c320ebb500ab616a,us-east-1,in-use,interface,,i-0d1bfdfc785de0619,Dev VPC,fooli-dev

You can now use this lookup table with any of public_ip, private_ip, or network_interface_id as the key.

Upload the CSV files through Splunk Web, or drop them into /opt/splunk/etc/system/lookups on your Splunk search heads, and you're ready to start using these lookups.
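If you regenerate these files on a schedule, it may help to replace each lookup in a single step so a search never reads a half-written CSV. The Python sketch below is one possible approach under stated assumptions, not a documented Steampipe or Splunk feature: `refresh_lookup` is a hypothetical helper that runs any command, captures its standard output, and swaps the result into place.

```python
import os
import subprocess
import tempfile


def refresh_lookup(command, destination):
    """Run command, then atomically replace destination with its stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    if not result.stdout.strip():
        raise ValueError("command produced no output; keeping the existing file")
    directory = os.path.dirname(os.path.abspath(destination))
    # Write next to the destination so os.replace stays on one filesystem.
    # mkstemp creates a file readable only by the current user; adjust the
    # permissions if Splunk runs as a different user.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as handle:
            handle.write(result.stdout)
        os.replace(temp_path, destination)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
    return destination
```

A call such as `refresh_lookup(["steampipe", "query", "eni.sql", "--output", "csv"], "/opt/splunk/etc/system/lookups/sp_eni.csv")` would mirror the command shown earlier. If the query fails or returns nothing, the existing lookup file is left untouched.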

How Splunk lookups can simplify your life and help find threats in your environment

Let's start with a simple example: who is logging into your AWS accounts via the web console? Here we can decorate the ConsoleLogin events with not just the user and IP address but also the AWS account's name, which isn't normally available in CloudTrail.

index="aws_cloudtrail" eventName=ConsoleLogin
| LOOKUP sp_aws_accounts.csv id AS recipientAccountId
OUTPUT name as account_name
| table account_name, userIdentity.arn, sourceIPAddress
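To make the lookup step concrete, here is a small Python sketch of what the `LOOKUP ... OUTPUT` clause does conceptually: match an event field against a key column and copy another column onto the event. It is an illustration only, not how Splunk implements lookups. The `load_lookup` and `enrich` helpers, the header row, and the sample event are hypothetical; the two account rows come from the CSV sample earlier.

```python
import csv
import io

# Two rows from the accounts export, with an assumed header row.
ACCOUNTS_CSV = """id,name,status
877426665359,fooli-dev,ACTIVE
152981771857,fooli-prod,ACTIVE
"""


def load_lookup(text, key):
    """Index CSV rows by the given key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def enrich(event, table, event_field, source_column, output_field):
    """Copy source_column from the matching lookup row onto a copy of the event."""
    enriched = dict(event)
    row = table.get(event.get(event_field))
    if row is not None:
        enriched[output_field] = row[source_column]
    return enriched


accounts = load_lookup(ACCOUNTS_CSV, key="id")
# Hypothetical ConsoleLogin event, reduced to the fields used in the search.
event = {"recipientAccountId": "152981771857", "sourceIPAddress": "203.0.113.10"}
print(enrich(event, accounts, "recipientAccountId", "name", "account_name"))
```

Events whose account ID has no row in the table pass through unchanged, which is why an out-of-date lookup shows up as missing account names rather than as an error.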

Steampipe-generated lookup tables also work well when you're searching VPC Flow Logs. Here we can cross-reference the Flow Log record's dvc (device) field with a lookup table of all network interfaces to get the names of the resources that a target IP address was talking to.

| LOOKUP sp_eni.csv network_interface_id AS dvc
OUTPUT attached_resource as attached_resource, vpc_name as VPC, account_name as Account
| stats count by attached_resource, VPC, Account

One final example: using ENI data to cross-reference where an EC2 instance role's credentials are being used from:

index=aws_cloudtrail "userIdentity.arn"="arn:aws:sts::*:assumed-role/payments_role/*"
| LOOKUP sp_eni.csv public_ip AS sourceIPAddress
OUTPUTNEW attached_resource as attached_resource
| fillnull value="External IP" attached_resource
| stats count by sourceIPAddress, eventName, attached_resource

If a role used by an EC2 instance is generating CloudTrail events from an IP address that is not part of your VPC, that's a major concern that needs to be investigated. This search surfaces a role that belongs to an EC2 instance being used from IP addresses that are not tied to known EC2 instances.
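The logic of that last search can be sketched in a few lines of Python: any source IP without a match in the ENI table is labeled "External IP", and the events are then counted per combination, much like `stats count by`. The IP addresses, event names, and the `count_by_origin` helper are hypothetical and exist only to illustrate the idea.

```python
from collections import Counter

# Hypothetical public_ip -> attached_resource mapping, as sp_eni.csv would provide.
ENI_BY_PUBLIC_IP = {"198.51.100.7": "i-0a2af645e985e6aed"}

# Hypothetical CloudTrail events for the role, reduced to two fields.
events = [
    {"sourceIPAddress": "198.51.100.7", "eventName": "GetObject"},
    {"sourceIPAddress": "198.51.100.7", "eventName": "GetObject"},
    {"sourceIPAddress": "203.0.113.50", "eventName": "ListBuckets"},
]


def count_by_origin(events, eni_by_ip):
    """Count events by source IP, event name, and the resource behind the IP."""
    counts = Counter()
    for event in events:
        ip = event["sourceIPAddress"]
        # Unmatched IPs get the same default label the fillnull step applies.
        resource = eni_by_ip.get(ip, "External IP")
        counts[(ip, event["eventName"], resource)] += 1
    return counts


for key, count in count_by_origin(events, ENI_BY_PUBLIC_IP).items():
    print(key, count)
```

Rows labeled "External IP" are the ones to review first, since they represent role activity from addresses that the ENI table cannot account for.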

Extend your enrichment activities with Steampipe

AWS accounts and ENI IP addresses are only two examples of event enrichment for your SIEM. If you have a tagging strategy for your EC2 instances, you can query instance names and contact teams to correlate them with your endpoint detection and response (EDR) alerts.

Your organization is probably polycloud, and so is Steampipe. Plugins exist for all the major cloud providers; you can easily decorate your Azure activity logs and GCP audit logs.

The sample queries here were only examples of the power of Steampipe to enrich logging data. If you try this technique, let us know. We love hearing how practitioners use Steampipe.