Analysis of Cloud Provider Market Share – 2021

Using developer interest in infrastructure as code tools to gauge popularity of cloud providers.

David Boeke
9 min. read - March 8, 2021

Steampipe is an open source tool that allows you to query your cloud infrastructure with SQL. While we built it for ourselves, and our primary use case was AWS, we decided to build a robust plugin capability and to support multiple cloud providers from day one.

As we neared the launch date we had to make difficult trade-off decisions on what would be in-scope for the release. This included deciding which plugins we would focus on first. Gauging the size of community around each of the cloud platforms helped us make that a data-based decision, and we thought publishing what we found might be insightful to others.

Infrastructure as Code Market Share

If you are building cloud infrastructure in 2021, you are building infrastructure as code using a declarative templating language. Let’s see what we can learn by following the developers and the development community around these tools.

First Party Tools from the Cloud Vendors:

Each of the cloud service providers has a GitHub repo with example templates to serve as starting guides for your development.

It is curious that AWS does not open source all of its templates; a few hundred more are available only in its docs.

Azure has done well to create an open source community hub around Resource Manager, while Amazon has allowed a long tail of smaller repositories. I prefer Azure’s approach here, as it is more likely (especially if you are new to the platform) that you will be able to join, contribute and have people find your work.

Google doesn’t seem to have a strategy to promote Deployment Manager, or a community that cares much about it (there isn’t even a GitHub tag available to search on). Most of the devs doing infrastructure as code on GCP must be using Terraform.

Oracle has done a significant amount of first party work to create infrastructure templates and publish them in advance of building the larger community.

DigitalOcean actually has a thriving open source community with 140 repositories that have some type of example, but their concept of infrastructure as code is focused primarily on the operating system vs broader IaaS configuration.

Search Statistics

We can get an initial feel for the relative popularity of these platforms using Google Search Trends:

There is a huge drop-off in interest across all platforms in Q4 2020.

CloudFormation's popularity eclipses that of other native tools.

CloudFormation has a huge advantage over the other platforms' native tools. Let's see if Terraform helps level the playing field.

Terraform Usage as a Surrogate Metric

When building infrastructure as code, Terraform is the 800 lb gorilla in the room. It operates across all cloud providers, and its open source repos and developer communities dwarf those of the first party clients. Terraform’s tooling works across all the cloud providers thanks to its plugin architecture, and we can learn quite a bit from stargazing the various plugin repositories.

Terraform Providers by the Numbers

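+----------------------------+-------------+---------+----------------+
| Repo                       | # Downloads | # Stars | # Contributors |
+----------------------------+-------------+---------+----------------+
| terraform-provider-aws     | 268.2M      | 5.5k    | 1,744          |
| terraform-provider-azurerm | 41.0M       | 2.3k    | 800            |
| terraform-provider-google  | 38.4M       | 1.3k    | 431            |
+----------------------------+-------------+---------+----------------+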

AWS is 6.5x larger on Terraform

(and that is on top of their CloudFormation numbers.)
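As a quick sanity check, the 6.5x figure can be recomputed from the download counts in the table above. This is only a minimal sketch: the dictionary transcribes the table by hand, `ratio_to_aws` is a hypothetical helper, and it assumes (the text does not say so explicitly) that the headline compares AWS downloads with Azure, the next-largest provider.

```python
# Download counts (in millions) transcribed from the Terraform providers table.
downloads_m = {
    "terraform-provider-aws": 268.2,
    "terraform-provider-azurerm": 41.0,
    "terraform-provider-google": 38.4,
}


def ratio_to_aws(repo: str) -> float:
    """How many times larger AWS downloads are than the given provider's."""
    return downloads_m["terraform-provider-aws"] / downloads_m[repo]


# Assumption: the headline ratio is based on downloads, not stars or contributors.
for repo in ("terraform-provider-azurerm", "terraform-provider-google"):
    print(f"AWS vs {repo}: {ratio_to_aws(repo):.1f}x")
```

Under that assumption the AWS-to-Azure ratio comes out at about 6.5x, and the AWS-to-Google ratio at about 7.0x.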

It is clear from these metrics that all three hypercloud companies have massive user bases and healthy growth curves in terms of usage of infrastructure as code tools. Microsoft recovered from an early slow start and has been on a higher growth trajectory since Q1 2019, but Amazon’s lead is real and it continues to grow.

Alternative Clouds

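+---------------------------------+-------------+---------+----------------+
| Repo                            | # Downloads | # Stars | # Contributors |
+---------------------------------+-------------+---------+----------------+
| terraform-provider-oci          | 1.4M        | 396     | 69             |
| terraform-provider-digitalocean | 202K        | 294     | 119            |
| terraform-provider-alicloud     | 174K        | 334     | 98             |
+---------------------------------+-------------+---------+----------------+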

It is great to see healthy, active communities around each of these platforms, but combined, the alternative platforms see barely 1/20th of the usage of even GCP at this stage.
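The "barely 1/20th" comparison can be checked roughly from the download counts in the two provider tables. The sketch below transcribes those counts by hand and assumes that "usage" here means Terraform provider downloads; the variable names are illustrative only.

```python
# Download counts transcribed from the two Terraform provider tables.
alternative_downloads = {
    "terraform-provider-oci": 1_400_000,
    "terraform-provider-digitalocean": 202_000,
    "terraform-provider-alicloud": 174_000,
}
gcp_downloads = 38_400_000  # terraform-provider-google

# Assumption: "usage" is measured by provider downloads.
combined = sum(alternative_downloads.values())
share_of_gcp = combined / gcp_downloads

print(f"Combined alternative downloads: {combined:,}")
print(f"Share of GCP downloads: {share_of_gcp:.1%}")
```

On these numbers the three alternative providers together account for a little under 5% of the Google provider's downloads, which is in line with the 1/20th estimate.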

What can your questions tell us?

When you are working with a technical platform, you are going to have questions, and the number of questions generally correlates with the number of developers and the level of usage of the platform. We will use data from Stack Overflow in this section, specifically looking at the number of questions that are tagged with specific categories.

Stack Overflow categorizes questions based on a tagging system. Here are results for questions tagged with each cloud service provider's name:


The Azure numbers being on par with AWS seemed surprising given we didn’t see that in other places, but it makes more sense when you realize that there are things like Azure DevOps and Azure AD, so Azure represents more than just PaaS and IaaS. Let's see if we can narrow down to our target audience by looking specifically at questions related to infrastructure as code tools:

These numbers align more closely with the relative size of the platforms we have seen in other communities.

Karma Counts

The last area we considered in our research was the size of the fan base for each cloud platform. Both Twitter and Reddit give us an easy way to measure the size of the social graph for these companies and the cloud platforms themselves:

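+------------------+-----------+
| Subreddit        | # Members |
+------------------+-----------+
| r/aws            | 161,000   |
| r/azure          | 66,600    |
| r/googlecloud    | 20,900    |
| r/terraform      | 15,100    |
| r/cloudcomputing | 14,700    |
| r/cloud          | 10,200    |
| r/digital_ocean  | 2,100     |
| r/oraclecloud    | 1,100     |
| r/AlibabaCloud   | 191       |
+------------------+-----------+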

A subreddit is a community of people on Reddit dedicated to sharing information and news on a given topic. The size of the subreddit indicates the number of people who are members of that community.

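+----------------+-----------+
| Twitter        | Followers |
+----------------+-----------+
| @awscloud      | 1,800,000 |
| @azure         | 799,500   |
| @googlecloud   | 290,900   |
| @digitalocean  | 205,500   |
| @oraclecloud   | 81,200    |
| @hashicorp     | 66,000    |
| @alibaba_cloud | 61,800    |
+----------------+-----------+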

Kudos to the @digitalocean Twitter team, who are hitting way above expectations given their relative size.

Conclusions

Our analysis of this data gave our development team the confidence to deep dive on AWS and Azure first. For the broader set of cloud providers, we made sure we have coverage but left a lot of room for the communities around these tools to jump in and fill the remaining gaps. One of the brilliant parts of open source is that our community can contribute and extend where they have passion.

Using Steampipe we can query our embedded PostgreSQL database to see current coverage across cloud providers:

> select
    split_part(table_name, '_', 1) as cloud,
    count(*) as tables
  from
    information_schema.tables
  where
    split_part(table_name, '_', 1) in ('aws','azure','digitalocean','alicloud','gcp')
  group by 1
  order by 2 desc;

+--------------+--------+
| cloud        | tables |
+--------------+--------+
| aws          | 85     |
| gcp          | 42     |
| azure        | 38     |
| alicloud     | 22     |
| digitalocean | 14     |
+--------------+--------+
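
To put those counts in relative terms, here is a small sketch that turns the query output into each provider's share of the total. The counts are copied by hand from the result above; the script does not query Steampipe, and the variable names are illustrative.

```python
# Table counts transcribed from the query output above.
table_counts = {
    "aws": 85,
    "gcp": 42,
    "azure": 38,
    "alicloud": 22,
    "digitalocean": 14,
}

total = sum(table_counts.values())
shares = {cloud: count / total for cloud, count in table_counts.items()}

# Print providers from largest to smallest share of tables.
for cloud, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cloud:<12} {share:.1%}")
```

At the time of this snapshot, AWS accounts for roughly 42% of the 201 tables across these five plugins.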
    

What cloud are you building on?

Regardless of your cloud platform choice, Steampipe has you covered with our own multi-cloud plugins. We hope it is both delightful and a huge time saver for you in your day-to-day cloud work.

If you’d like to help expand the Steampipe universe, or even dive into the CLI code, the whole project is open source (https://github.com/turbot/steampipe) and we’d love to collaborate!

Download, install, and get cloud work done with Steampipe.