# Enrich Splunk events with Steampipe

> Splunk lookup tables can enrich AWS event data with IP-address/name mappings not available in CloudTrail. Here's how to build those tables with Steampipe.

By Chris Farris
Published: 2022-10-18


When analyzing telemetry from AWS in a security operations role, context is key. What is this instance i-7ba5bed288a? Is this random AWS-owned IP address one of mine, or does it belong to someone else? Which account is 178901234562 again? AWS doesn't provide any of this context in CloudTrail or GuardDuty.

If you use Splunk as your Security Information and Event Management (SIEM) platform, you've probably heard of lookups. [Per Splunk](https://docs.splunk.com/Documentation/Splunk/9.0.1/Knowledge/Aboutlookupsandfieldactions): "**Lookups** enrich your **event data** by adding field-value combinations from **lookup tables**". Lookups can be a great way to improve your detection and investigations by adding attributes and key business context to your CloudTrail, GuardDuty, and VPC Flow Log data.

Steampipe can pull that context into [Splunk lookup tables](https://docs.splunk.com/Documentation/Splunk/8.0.4/Knowledge/ConfigureCSVlookups). In a pair of examples, we'll use Steampipe to query AWS Accounts and the public and private IP addresses of Elastic Network Interfaces (ENIs), then save that data as Splunk lookup tables. The Steampipe SQL queries and the Splunk SPL queries provided here are examples you can build upon to create your own enrichment tables.

## Context enrichment with Steampipe
Steampipe is an open source project that provides a common interface for querying cloud APIs with SQL. With Steampipe and the [AWS Plugin installed and configured](https://hub.steampipe.io/plugins/turbot/aws#configuration), you can easily run SQL queries against AWS APIs represented as database tables, and export the results to CSV files that load into Splunk as lookup tables.

Let's start with a simple example: a list of all [AWS Accounts](https://hub.steampipe.io/plugins/turbot/aws/tables/aws_organizations_account) in an organization. This query ([accounts.sql](https://github.com/turbot/steampipe-samples/tree/main/all/splunk-lookup-tables/accounts.sql)) pulls the twelve-digit account id, Account Name, Status (Active or Suspended), and four specific tags on each account.

```sql
select id, name, status,
  tags ->> 'ExecutiveOwner' as Executive_Owner,
  tags ->> 'TechnicalContact' as Technical_Contact,
  tags ->> 'DataClassification' as Data_Classification,
  tags ->> 'environment' as Environment
from aws_payer.aws_organizations_account;
```
Here's the command to extract this information into a CSV file.

```bash
steampipe query accounts.sql --output csv > sp_aws_accounts.csv
```
The result is a CSV file of account information from AWS Organizations that's ready to be used by Splunk.

```csv
id,name,status,executive_owner,technical_contact,data_classification
877426665359,fooli-dev,ACTIVE,Richard Hendricks,Dinesh Chugtai,Public
352894534996,fooli-security,ACTIVE,Bertram Gilfoyle,Bertram Gilfoyle,Internal
152981771857,fooli-prod,ACTIVE,Erlich Bachman,Richard Hendricks,PersonalInformation
540147993428,fooli-payer,ACTIVE,Monica Hall,Bertram Gilfoyle,None
747037951011,fooli-memefactory,ACTIVE,Erlich Bachman,Richard Hendricks,PersonalInformation
755629548949,fooli-sandbox,ACTIVE,Erlich Bachman,Dinesh Chugtai,None
```

One of the most useful Splunk lookup tables maps internal or external IP addresses to the cloud resources they belong to. This Steampipe query gets all of the [ENIs](https://hub.steampipe.io/plugins/turbot/aws/tables/aws_ec2_network_interface) in your environment. It joins that data with your [account](https://hub.steampipe.io/plugins/turbot/aws/tables/aws_organizations_account) and [VPC](https://hub.steampipe.io/plugins/turbot/aws/tables/aws_vpc) data to provide account and VPC names. Finally, it populates the `attached_resource` column with either the EC2 Instance ID or the ENI Description to tell you which resource each public or private IP address belongs to.

```sql
select
  eni.network_interface_id,
  eni.private_ip_address,
  eni.vpc_id as vpc_id,
  eni.region,
  eni.status,
  eni.interface_type,
  eni.association_public_ip as public_ip,
  case
    when eni.attached_instance_id is not null
      then eni.attached_instance_id
    else eni.description
  end as attached_resource,
  vpc.tags ->> 'Name' as vpc_name,
  org.name as account_name
from
  aws_ec2_network_interface as eni,
  aws_vpc as vpc,
  aws_payer.aws_organizations_account as org
where vpc.vpc_id = eni.vpc_id
  and org.id = eni.account_id;
```

Run `steampipe query eni.sql --output csv > sp_eni.csv` to generate a CSV file like:
```csv
network_interface_id,private_ip_address,vpc_id,region,status,interface_type,public_ip,attached_resource,vpc_name,account_name
eni-0bb119c9b8271e13c,10.12.15.127,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,<null>,arn:aws:ecs:us-east-1:747037951011:attachment/2d6f0e6e-4e26-4dea-ae2f-7f0eac27d471,Prod VPC,fooli-memefactory
eni-03fb47e928c58f8c6,10.12.11.158,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,52.4.203.49,ELB app/prod-FooliApiStack/564122fa59bf64fe,Prod VPC,fooli-memefactory
eni-019de27cb07fe61f2,10.12.11.162,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,18.212.183.121,i-0a2af645e985e6aed,Prod VPC,fooli-memefactory
eni-0b0a0f4a74f50324b,10.12.15.233,vpc-0baccbf3534d6a80c,us-east-1,in-use,lambda,<null>,AWS Lambda VPC ENI-prod-FooliMailerStack-mailer-13aa6e45-264b-46cd-a91b-17cc79e7a011,Prod VPC,fooli-memefactory
eni-0d55432870ac355fc,10.12.11.119,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,54.86.186.254,i-0eeacb7ba5bed288a,Prod VPC,fooli-memefactory
eni-0dc96ad6321864543,10.12.11.151,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,34.231.250.0,RDSNetworkInterface,Prod VPC,fooli-memefactory
eni-06741026048ceb699,10.12.10.106,vpc-0baccbf3534d6a80c,us-east-1,in-use,interface,34.202.5.187,ELB app/prod-FooliApiStack/564122fa59bf64fe,Prod VPC,fooli-memefactory
eni-03831a874fce92713,10.12.10.104,vpc-0baccbf3534d6a80c,us-east-1,in-use,nat_gateway,44.196.140.136,Interface for NAT Gateway nat-0038ea0ba69861382,Prod VPC,fooli-memefactory
eni-0cb11b13ee1ded995,10.10.11.215,vpc-0c320ebb500ab616a,us-east-1,in-use,interface,3.87.77.3,i-08154eb5935852d50,Dev VPC,fooli-dev
eni-0265d9a496ff2b01e,10.10.10.218,vpc-0c320ebb500ab616a,us-east-1,in-use,interface,44.205.90.161,ELB app/dev-FooliApiStack/6fb5790bfca6ea6f,Dev VPC,fooli-dev
eni-074aac46384fe2501,10.10.10.217,vpc-0c320ebb500ab616a,us-east-1,in-use,nat_gateway,18.206.78.232,Interface for NAT Gateway nat-031b431e19aa13518,Dev VPC,fooli-dev
eni-0f45b8f63195340c5,10.10.11.66,vpc-0c320ebb500ab616a,us-east-1,in-use,interface,54.234.182.17,i-0d1bfdfc785de0619,Dev VPC,fooli-dev
eni-0bf128705f3c36d94,10.8.10.56,vpc-0da7e4ff31018985b,us-east-1,in-use,interface,<null>,i-06944d48a6262f7f9,SecurityVPC,fooli-security
```

You can now use this lookup table with the `public_ip`, the `private_ip_address`, or the `network_interface_id` as the lookup key.
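If you mostly look up by public IP address, a slimmer table may be easier to work with. The following is a minimal sketch, not part of the sample repository: it reuses only tables and columns from the ENI query above and drops interfaces that have no public IP, so the `<null>` rows never reach Splunk. The output file name `sp_eni_public.csv` is hypothetical.

```sql
-- Public-IP-only variant of the ENI lookup (a sketch; adjust columns to taste).
-- Every table and column here also appears in the full ENI query above.
select
  eni.association_public_ip as public_ip,
  eni.network_interface_id,
  case
    when eni.attached_instance_id is not null
      then eni.attached_instance_id
    else eni.description
  end as attached_resource,
  org.name as account_name
from
  aws_ec2_network_interface as eni
  join aws_payer.aws_organizations_account as org
    on org.id = eni.account_id
where
  -- keep only interfaces that actually have a public address
  eni.association_public_ip is not null;
```

Saved as, say, `eni_public.sql`, it can be exported the same way: `steampipe query eni_public.sql --output csv > sp_eni_public.csv`.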

[Upload your file to SplunkWeb](https://docs.splunk.com/Documentation/Splunk/9.0.1/Knowledge/Usefieldlookupstoaddinformationtoyourevents) or drop the CSV outputs from Steampipe into `/opt/splunk/etc/system/lookups` on your [Splunk search heads](https://docs.splunk.com/Splexicon:Searchhead), and you're ready to start using these lookups.

## How Splunk lookups can simplify your life and help find threats in your environment

Let's start with a simple example: Who is logging into your AWS accounts via the Web Console? Here we decorate the ConsoleLogin events with not just the user and IP address but also the AWS Account's _Name_, which isn't normally available in CloudTrail.


<div style={{"marginTop":"20px","marginBottom":"0px"}}>
  <img  src="/images/blog/2022-10-splunk-lookup-tables/splunk-header.png" />
</div>

```
index="aws_cloudtrail" eventName=ConsoleLogin
| LOOKUP sp_aws_accounts.csv id AS recipientAccountId
  OUTPUT name as account_name
| table account_name, userIdentity.arn, sourceIPAddress
```
<div style={{"marginTop":"0px","marginBottom":"20px"}}>
  <img  src="/images/blog/2022-10-splunk-lookup-tables/console-logins.png" />
</div>


Steampipe-generated lookup tables also work well when you're searching VPC Flow Logs. Here we cross-reference the `dvc` (device) field of each Flow Log record with the lookup table of all network interfaces to get the names of the resources that the target IP address was talking to.

<div style={{"marginTop":"20px","marginBottom":"0px"}}>
  <img  src="/images/blog/2022-10-splunk-lookup-tables/splunk-header.png" />
</div>

```
index=aws_flowlogs 3.236.91.34
| LOOKUP sp_eni.csv network_interface_id AS dvc
  OUTPUT attached_resource as attached_resource, vpc_name as VPC, account_name as Account
| stats count by attached_resource, VPC, Account
```
<div style={{"marginTop":"0px","marginBottom":"20px"}}>
  <img  src="/images/blog/2022-10-splunk-lookup-tables/flow-logs-example.png" />
</div>


One final example: using ENI data to cross-reference where an EC2 Instance Role is being used from.
<div style={{"marginTop":"20px","marginBottom":"0px"}}>
  <img  src="/images/blog/2022-10-splunk-lookup-tables/splunk-header.png" />
</div>

```
index=aws_cloudtrail  "userIdentity.arn"="arn:aws:sts::*:assumed-role/payments_role/*"
| LOOKUP sp_eni.csv public_ip AS sourceIPAddress
  OUTPUTNEW attached_resource as attached_resource
| fillnull attached_resource value="External IP"
| stats count by sourceIPAddress, eventName, attached_resource
```
<div style={{"marginTop":"0px","marginBottom":"20px"}}>
  <img  src="/images/blog/2022-10-splunk-lookup-tables/roles.png" />
</div>

If a role used by an EC2 Instance is generating CloudTrail events from an IP address that is not part of your VPC, that's a major concern that needs to be investigated. Here we see a role that belongs to an EC2 Instance being used from IP addresses that are not tied to known EC2 Instances.

## Extend your enrichment activities with Steampipe

AWS Accounts and ENI IP addresses are only two examples of event enrichment for your SIEM. If you have a tagging strategy for your EC2 Instances, you can query instance names and contact teams to correlate them with your endpoint detection and response (EDR) alerts.
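As a rough sketch of what such an instance lookup could look like, the query below reads from the AWS plugin's `aws_ec2_instance` table. The tag keys are assumptions: `Name` is the conventional AWS name tag, and `TechnicalContact` is borrowed from the account example above, so substitute whatever keys your own tagging strategy uses. The `aws_payer` connection is the same one used in the earlier queries.

```sql
-- Sketch of an EC2 instance lookup table; tag keys are assumptions,
-- replace them with the keys from your own tagging strategy.
select
  i.instance_id,
  i.private_ip_address,
  i.public_ip_address,
  i.region,
  i.instance_state,
  i.tags ->> 'Name' as instance_name,
  i.tags ->> 'TechnicalContact' as technical_contact,
  org.name as account_name
from
  aws_ec2_instance as i
  -- left join so instances still appear if their account is missing
  -- from the organization listing
  left join aws_payer.aws_organizations_account as org
    on org.id = i.account_id;
```

Exported with `--output csv` like the other queries, the result can be keyed on `instance_id` to decorate alerts that reference an instance.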

Your organization is probably polycloud, and so is Steampipe. Plugins exist for all the major cloud providers; you can easily decorate your Azure activity logs and GCP audit logs.
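For instance, the equivalents of the AWS account lookup might look like the two queries below. This is only a sketch: it assumes the Azure and GCP plugins are installed and configured, and that their `azure_subscription` and `gcp_project` tables expose the columns shown. Check the table documentation on the Steampipe Hub before relying on them. The `owner` label key is hypothetical.

```sql
-- Azure: map subscription IDs to display names (sketch; verify column
-- names against the Azure plugin's table documentation).
select
  subscription_id,
  display_name,
  state
from
  azure_subscription;

-- GCP: map project IDs to project names (sketch; the 'owner' label key
-- is a hypothetical example of a label you might maintain).
select
  project_id,
  name,
  lifecycle_state,
  labels ->> 'owner' as owner
from
  gcp_project;
```

Run each query separately with `--output csv` to get one lookup file per cloud.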

The sample queries here were only examples of the power of Steampipe to enrich logging data. If you try this technique, [let us know](https://steampipe.io/community/join). We love hearing how practitioners use Steampipe.
