# Why build an HTTP client into a database? So you can ingest web data directly!

> When there isn't a Steampipe plugin to meet your need, the Net plugin's net_http_request table can save the day.

By Jon Udell
Published: 2022-12-08


When a plugin provides a Steampipe table, it typically populates the table by making an API call, then mapping the JSON response to a Go struct which the Steampipe foreign data wrapper then maps to a Postgres table. But not all tables work this way. The [CSV plugin](https://hub.steampipe.io/plugins/turbot/csv)  builds Postgres tables from one or more CSV files. The [Code](https://hub.steampipe.io/plugins/turbot/csv) plugin provides one table, [code_secret](https://hub.steampipe.io/plugins/turbot/code/tables/code_secret), that you join with a column in any other table to scan that column for secrets. The [Trivy](https://hub.steampipe.io/plugins/turbot/trivy) plugin embeds the [Trivy scanner](https://github.com/aquasecurity/trivy) and runs it to populate tables. 

Steampipe tables can do all sorts of surprising things! Did you know, for example, that the [Net](https://hub.steampipe.io/plugins/turbot/net) plugin's [net_http_request](https://hub.steampipe.io/plugins/turbot/net/tables/net_http_request) encapsulates a full-blown HTTP client?


```sql
  select
    response_status_code,
    response_headers
  from
    net_http_request
  where
    url = 'https://steampipe.io'
```


```
+----------------------+-----------------------------------------------------------------------
| response_status_code | response_headers                                                                                                >
+----------------------+-----------------------------------------------------------------------
| 200                  | {"Access-Control-Allow-Origin":["*"],"Age":["102041"],"Cache-Control":
+----------------------+-----------------------------------------------------------------------
```

Here are three ways we've been using `net_http_request`.

## 1. Summarize the News and Reviews page

Our [News and Reviews](https://steampipe.io/blog/news-and-reviews) page lists articles about Steampipe published elsewhere. We wanted to chart them by month. The source of the data is the News and Reviews web page. Here's the query that produces data for the chart. 

```sql
-- generate a series of months
with months as (
  select 
    to_char(
      generate_series (
        date('2021-01-01'),
        now()::date,
        '1 month'::interval
      ),
      'YYYY-MM'
    ) as month
),
-- read in the article and match publication dates
article_months as (
  select
    (regexp_matches(response_body, '>(\d{4,4}\-\d{2,2})-\d{2,2}<', 'g'))[1] as article_month,
    count(*)
  from
    net.net_http_request
  where 
    url = 'https://steampipe.io/blog/news-and-reviews'
  group by
    article_month
  order by
    article_month
)
-- join with series of months, report month and count
select 
  month,
  coalesce(count, 0) as count
from 
  months 
left join
  article_months
on
  month = article_month
```

And here's that data displayed as a chart on a Steampipe dashboard.

<div style={{"width":"80%", "marginTop":"2em", "marginBottom":"2em", "borderStyle":"solid", "borderWidth":"1px", "borderColor":"#f3f1ef"}}>
  <img src="/images/blog/2022-11-net-http-request/articles-by-month.png" />
</div>

We find the dates by matching the HTML returned by `net_http_request` against a regular expression. The pattern we're looking for is easy to find that way.


## 2. Check links

 Links are another pattern that's easy to match, so we made a link checker that only relies on `net_http_request` and Postgres' `regexp_match` function. It embeds [this query](https://github.com/turbot/steampipe-samples/blob/main/all/linkcheck/query.sp) in a [dashboard](https://github.com/turbot/steampipe-samples/blob/main/all/linkcheck/links.sp) that enables you to choose a starting URL, check its links, and then click through to check any of those links. 

The result is a serviceable interactive link checker made with 100 lines of HCL and SQL!

<div style={{"marginTop":"2em", "marginBottom":"2em", "borderStyle":"solid", "borderWidth":"1px", "borderColor":"#f3f1ef"}}>
  <img src="/images/blog/2022-11-net-http-request/linkchecker.png" />
</div>

## 3. List ClickUp tasks

We just started using ClickUp to manage this blog, so of course we want to query our tasks and also correlate them with preview links available by way of the [Vercel](https://github.com/turbot/steampipe-plugin-html) plugin. 

Good news: [Graza Andersson](https://steampipe.io/blog/graza-andersson) has built a ClickUp plugin.

Bad news: A bug in the upstream Go SDK stopped us from using the plugin to list all our tasks.

Good news: `net_http_request` to the rescue!

We expect the SDK bug to be resolved soon, but meanwhile here's a query that gets the job done.

```
-- call the api
with response_body as (
  select 
    response_body::jsonb
  from  
  net_http_request 
  where
    url = 'https://api.clickup.com/api/v2/team/367...759/task'
    and request_headers = jsonb_object( array ['authorization', 'pk_44...VR4'] ) -- redacted secret
),
-- extract tasks
tasks as (
  select
    jsonb_array_elements( response_body -> 'tasks' ) as task
  from 
    response_body
)
-- report tasks
select
  task ->> 'name' as name,
  task -> 'status' ->> 'status' as status,
  task -> 'creator' ->> 'username' as creator,
  task -> 'assignees' -> 0 ->> 'username' as assignee,
  to_char( to_timestamp( (task ->> 'date_created')::numeric / 1000 ), 'YYYY-MM-DD' ) as date_created,
  to_char( to_timestamp( (task ->> 'date_closed')::numeric / 1000 ),  'YYYY-MM-DD') as date_closed,
  task ->> 'url' as url
from
  tasks
order by
  task ->> 'date_updated' desc
```

In this case it's a temporary solution, we'll soon have the much richer capabilities of the plugin. But what if there's no plugin for an API you'd like to use with Steampipe? The combined power of `net_http_request` and Postgres' JSON features might take you farther than you'd guess. 

## Two-way traffic

The examples we've seen here default to HTTP GET, but you can also specify `method = 'POST'` in a `net_http_request` query. And along with request headers, you can send a request body. We haven't yet needed to write a query that posts to an endpoint, but the possibilities are intriguing. Suppose, for example, your query returns data that you want to send as a notification to Slack. The conventional way would be to embed Steampipe [as a component](https://steampipe.io/blog/steampipe-as-component) in a program that runs the query, retrieves the result, and posts it to Slack. With a full-blown HTTP client built into Steampipe you can consider a more unified solution: Steampipe as both query engine and HTTP client. 

If you try using Steampipe this way, [let us know](https://steampipe.io/community/join) how it goes!

