Shift Left Join

Shift Left Join: Where are those IP addresses coming from?

Enrich VPC FLow Logs with geographic locations from ipstack.

Jon Udell
5 min. read - May 31, 2022
Enrich VPC FLow Logs with geographic locations from ipstack.

In this episode of #shiftleftjoin you'll learn how to enrich the IP addresses in your logs with geographic locations using the ipstack plugin. There are other ways: the Net plugin does reverse DNS lookups, the AbuseIPDB plugin looks for malicious activity associated with IP addresses, and we've elsewhere shown how the Shodan plugin scans for exploitable vulnerabilities. Here we'll focus on geographic locations; the same technique applies to the other plugins, or indeed to any plugin that maps an API that can be usefully correlated with IP addresses.

Prerequisites

We assume in this example that you have:

Step 1: List recently-seen log events

First let's take a look at recent VPC Flow Log events.

-- query q1
select
src_addr,
timestamp
from
aws_vpc_flow_log_event
where
log_group_name = 'my_cloudwatch_log_group'
and timestamp > now() - interval '1 hour'
order by
timestamp desc
+-----------------+---------------------------+
| src_addr | timestamp |
+-----------------+---------------------------+
| 45.143.203.18 | 2022-05-23T08:17:41-07:00 |
| 167.94.145.31 | 2022-05-23T08:17:41-07:00 |
| 162.142.125.179 | 2022-05-23T08:17:41-07:00 |
| 49.233.11.129 | 2022-05-23T08:16:39-07:00 |
| 193.163.125.158 | 2022-05-23T08:16:39-07:00 |
| 2.56.57.173 | 2022-05-23T08:16:39-07:00 |

Step 2. Map addresses to locations

To perform location lookups on that set of addresses, join the data with the ipstack_ip table.

-- query q2
with addrs as (
select
src_addr
from
aws_vpc_flow_log_event
where
log_group_name = 'my_cloudwatch_log_group'
and timestamp > now() - interval '1 hour'
order by
timestamp desc
)
select
a.src_addr,
i.continent_name,
i.country_name
from
addrs a
join
ipstack_ip i
on
a.src_addr = i.ip
limit
50
+-----------------+----------------+----------------+
| src_addr | continent_name | country_name |
+-----------------+----------------+----------------+
| 58.216.180.210 | Asia | China |
| 43.129.33.99 | Asia | Indonesia |
| 45.145.66.212 | Asia | Russia |
| 185.191.34.200 | Asia | Russia |
| 45.93.201.96 | Asia | Russia |
| 91.240.118.75 | Asia | Russia |
| 122.32.143.249 | Asia | South Korea |
| 78.128.113.94 | Europe | Bulgaria |
| 89.248.165.151 | Europe | Netherlands |
| 89.248.165.199 | Europe | Netherlands |

Steampipe will cache this data for 5 minutes. If you're going to be exploring it for longer than that, you might want to increase the cache TTL, or persist the data as a table or materialized view. Another reason to persist the data: ipstack lookups are metered, and you might want to conserve your lookups. Here's one way to save the results.

Step 3. Persist the data for analysis

create table ipstack_places as
-- insert query q2

The data in the ipstack_places table will, of course, soon be stale. You can drop and recreate the table to explore the latest log entries. Or you can use a Postgres materialized view.

drop table ipstack_places;
create materialized view ipstack_places as
-- insert query q2

A materialized view replays a SQL query, as does a normal view, but unlike a normal view it also persists the data. Now log_ips is effectively a read-only table. You can't alter its records but you can update the table with the command refresh materialized view ipstack_places.

Step 4: Chart the locations

To chart this data using Steampipe's dashboard subsystem, create a folder, visit it, and add this mod.sp file.

mod "ip_lookups" {
title = "IP lookups"
}
query "ipstack_continents" {
sql = <<EOQ
select
i.continent_name,
count(*)
from
ipstack_places i
group by
continent_name
}
query "ipstack_countries" {
sql = <<EOQ
select
i.country_name,
count(*)
from
ipstack_places i
group by
country_name
}
}
dashboard "ips_by_location" {
container {
chart {
width = 4
title = "IPs by continent"
type = "donut"
query = query.ipstack_continents
}
chart {
width = 4
title = "IPs by country"
type = "donut"
query = query.ipstack_countries
}
}
}

Now start the dashboard server and visit localhost:9194.

steampipe dashboard

Problem solved! Now you can see where those IP addresses are coming from. You can further enrich addresses in your logs with the AbuseIPDB, Net, and Shodan plugins.