Adopters

This document tracks people and use cases for the Prometheus Operator in production. By creating a list of production use cases, we hope to build a community of advisors that we can reach out to, with experience using the Prometheus Operator across various applications, operation environments, and cluster sizes. The Prometheus Operator development team may reach out periodically to check in on how the Prometheus Operator is working in the field and update this list.

Go ahead and add your organization to the list.

AuthZed

authzed.com

Environments: AWS, Azure, Google Cloud

Uses kube-prometheus: Yes

Details (optional):

  • Every environment (internal and customer) leverages the Prometheus Operator for deploying metrics. kube-prometheus is used for cluster metrics, which are managed by an HA Prometheus StatefulSet that runs the Thanos sidecar. Thanos is used to aggregate and query across multi-region/cluster environments. Alertmanager is used to page on-call SREs.

CERN

European Laboratory for Particle Physics

Environments: On-premises

Prometheus is used extensively as part of the CERN Kubernetes infrastructure, both managed and unmanaged. Metrics deployment is managed by the community-owned kube-prometheus-stack Helm chart. Be sure to check our blog.

Details:

  • 400+ Kubernetes clusters, with cluster sizes ranging from a few nodes to ~100s of nodes

Significant usage also exists outside Kubernetes for generic service and infrastructure monitoring.

Clyso

clyso.com

Environments: Bare Metal, OpenNebula

Uses kube-prometheus: Yes

Details:

  • Multiple K8s clusters with Prometheus deployed through prom-operator
  • Several of our own Ceph clusters providing metrics via the ceph mgr prometheus module
  • Several customer Ceph clusters pushing metrics via an external Pushgateway to our central monitoring instances
  • Thanos receiver connected to our own S3 storage

Coralogix

coralogix.com

Environments: AWS

Uses kube-prometheus: Yes

Details:

  • Operator installed on each Kubernetes cluster, with Thanos aggregating metrics from a central query endpoint
  • Two Prometheus instances per cluster
  • Loose coupling between Kubernetes cluster administrators who manage alerting sinks and service owners who define alerts for their services
  • 800K samples/s
  • 30M active series

DACHS IT

dachs-it.de

Environments: AWS, Azure, Bare Metal

Uses kube-prometheus: No

Details (optional):

  • HA Pair of Prometheus
  • 25k samples/s
  • 750k active series

Deckhouse

deckhouse.io

Environments: AWS, Azure, Google Cloud, Bare Metal

Uses kube-prometheus: Yes

Deckhouse is a Kubernetes platform. Its clusters, running on any infrastructure, are provided with a monitoring system based on highly available Prometheus and the Prometheus Operator. Essential metrics are preconfigured out of the box to ensure monitoring at all levels, from hardware and Kubernetes internals to the functionality of the platform’s modules. The monitoring-custom module simplifies adding custom metrics for user applications. Deckhouse also hosts a dedicated Prometheus instance in each cluster to store downsampled metric series for longer periods.

Deezer

deezer.com

Environments: Bare Metal

Uses kube-prometheus: Yes

Details (optional):

  • HA Pair of Prometheus
  • 340000 samples/s
  • 14.3M active series

Giant Swarm

giantswarm.io

Environments: AWS, Azure, Bare Metal

Uses kube-prometheus: Yes (with additional tight Giant Swarm integrations)

Details:

  • One Prometheus Operator per management cluster and one Prometheus instance per workload cluster
  • Customers can also install kube-prometheus for their workload using our App Platform
  • 760000 samples/s
  • 35M active series

Gitpod

gitpod.io

Environments: Google Cloud

Uses kube-prometheus: Yes (with additional Gitpod mixins)

Details:

  • One Prometheus instance per cluster (8 so far)
  • 20000 samples/s
  • 1M active series

iFlytek

https://www.iflytek.com/

Environments: iFlytek Cloud, etc.

Uses kube-prometheus: Yes

Details (optional):

  • One Prometheus Operator per management cluster and one Prometheus instance per workload cluster
  • 700000 samples/s
  • 30M active series

Innovaccer

https://innovaccer.com/

Environments: AWS, Azure

Uses kube-prometheus: Yes

Details (optional):

  • Multiple remote K8s clusters in which we have Prometheus deployed through prom-operator.
  • These remote Prometheus instances push cluster metrics to a central Thanos receiver, which is connected to S3 storage.
  • On top of Thanos, we have Grafana for dashboarding and visualisation.

Kinvolk Lokomotive Kubernetes

https://kinvolk.io/lokomotive-kubernetes/

Environments: AKS, AWS, Bare Metal, Equinix Metal

Uses kube-prometheus: Yes

Details:

  • Self-hosted (control plane runs as pods inside the cluster)
  • Deploys full K8s stack (as a distro) or managed Kubernetes (currently only AKS supported)
  • Deployed by Kinvolk for its own hosted infrastructure (including Flatcar Container Linux update server), as well as by Kinvolk customers and community users

Lunar

lunar.app

Environments: AWS

Uses kube-prometheus: Yes

Details:

  • One Prometheus Operator in our platform cluster and one Prometheus instance per workload cluster
  • 17k samples/s
  • 841k active series

Mattermost

mattermost.com

Environments: AWS

Uses kube-prometheus: Yes

Details:

  • All Mattermost clusters use the Prometheus Operator with the Thanos sidecar for cluster monitoring and a central Thanos query component to gather all data.
  • 977k samples/s
  • 29.4M active series

Nozzle

nozzle.io

Environment: Google Cloud

Uses kube-prometheus: Yes

Details:

  • 100k samples/s
  • 1M active series

OpenShift

openshift.com

Environments: AWS, Azure, Google Cloud, Bare Metal

Uses kube-prometheus: Yes (with additional tight OpenShift integrations)

This is a meta user; please feel free to document specific OpenShift users!

All OpenShift clusters use the Prometheus Operator to manage the cluster monitoring stack as well as user workload monitoring. This means the Prometheus Operator’s users include all OpenShift customers.

Opstrace

https://opstrace.com

Environments: AWS, Google Cloud

Uses kube-prometheus: No

Opstrace installations use the Prometheus Operator internally to collect metrics and to alert. Opstrace users also often use the Prometheus Operator to scrape their own applications and remote_write those metrics to Opstrace.

Polar Signals

polarsignals.com

Environment: Google Cloud

Uses kube-prometheus: Yes

Details:

  • HA Pair of Prometheus
  • 4000 samples/s
  • 100k active series

Robusta

Robusta docs

Environments: EKS, GKE, AKS, and self-hosted Kubernetes

Uses kube-prometheus: Yes

We’re an open source project that builds upon the awesome Prometheus Operator. We run automated playbooks in response to Prometheus alerts and other events in your cluster. For example, you can automatically fetch logs and send them to Slack when a Prometheus alert occurs. All it takes is this YAML:

triggers:
  - on_prometheus_alert:
      alert_name: KubePodCrashLooping
actions:
  - logs_enricher: {}
sinks:
  - slack

Skyscanner

skyscanner.net

Environment: AWS

Uses kube-prometheus: Yes

Details (optional):

  • HA Pairs of Prometheus
  • 25000 samples/s
  • 1.2M active series

SUSE Rancher

suse.com/products/suse-rancher

Environments: RKE, RKE2, K3s, Windows, AWS, Azure, Google Cloud, Bare Metal, etc.

Uses kube-prometheus: Yes

Rancher Monitoring supports use cases for the Prometheus Operator across various cluster types and setups that are managed via the Rancher product. All Rancher users that install Monitoring V2 deploy this chart.

For more information, please see how Rancher monitoring works.

The open-source rancher-monitoring Helm chart (based on kube-prometheus-stack) can be found at rancher/charts.

Trendyol

trendyol.com

Environments: OpenStack, VMware vCloud

Uses kube-prometheus: Yes

Details:

  • All Kubernetes clusters use one Prometheus Operator instance with remote write enabled
  • Prometheus instances push metrics to central H/A VirtualMetric, which gathers all data from clusters in 3 different data centers
  • Grafana is used for dashboarding and visualization
  • 7.50M samples/s
  • 190M active series

Veepee

veepee.com

Environments: Bare Metal

Uses kube-prometheus: Yes

Details (optional):

  • HA Pair of Prometheus
  • 786000 samples/s
  • 23.6M active series

VSHN AG

vshn.ch

Environments: AWS, Azure, Google Cloud, cloudscale.ch, Exoscale, Swisscom

Uses kube-prometheus: Yes

Details (optional):

  • A huge fleet of OpenShift and Kubernetes clusters, each using Prometheus Operator
  • All managed by Project Syn, leveraging Commodore Components like component-rancher-monitoring which re-uses Prometheus Operator

WarpBuild

warpbuild.com

Environments: AWS, Google Cloud

Uses kube-prometheus: Yes

Details (optional):

  • Prometheus Operator provides real-time monitoring of all our runners. With Alertmanager, we promptly receive notifications regarding any cluster issues, allowing for swift resolution before they have the chance to escalate into outages. Grafana allows us to quickly visualize our cluster’s health and performance metrics.
  • ~6k samples/s
  • ~500k active series

Wise

wise.com

Environments: Kubernetes, AWS (via some EC2)

Uses kube-prometheus: No

Details (optional):

  • About 30 HA pairs of sharded Promethei across 10 environments, wired together with Thanos
  • The Operator also helps us seamlessly manage anywhere between 600 and 1500 short-lived Prometheus instances for our “integration” Kubernetes cluster.
  • ~15M samples/s
  • ~200M active series

<Insert Company/Organization Name>

https://our-link.com/

Environments: AWS, Azure, Google Cloud, Bare Metal, etc

Uses kube-prometheus: Yes | No

Details (optional):

  • HA Pair of Prometheus
  • 1000 samples/s (query: rate(prometheus_tsdb_head_samples_appended_total[5m]))
  • 10k active series (query: prometheus_tsdb_head_series)
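
To fill in the two numbers in the template, the queries above can be run against the Prometheus HTTP API. The following is only a minimal sketch, not an official tool: it assumes a Prometheus instance reachable at a base URL such as http://localhost:9090 (for example through a port-forward), and the helper names build_query_url, sum_vector, and fetch_total are hypothetical. It sums the value over every series returned, so an HA pair that scrapes both replicas would be counted twice.

```python
import json
import urllib.parse
import urllib.request

# The two queries suggested in the template entry.
SAMPLES_PER_SECOND = "rate(prometheus_tsdb_head_samples_appended_total[5m])"
ACTIVE_SERIES = "prometheus_tsdb_head_series"


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL (/api/v1/query) for a PromQL expression."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def sum_vector(payload: dict) -> float:
    """Sum the sample values of an instant-vector response.

    Each element of data.result carries value = [timestamp, "<number as string>"].
    """
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error", "unknown error")))
    return sum(float(sample["value"][1]) for sample in payload["data"]["result"])


def fetch_total(base_url: str, promql: str) -> float:
    """Run the query against a live Prometheus and return the summed value."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=10) as resp:
        return sum_vector(json.load(resp))
```

Calling fetch_total("http://localhost:9090", SAMPLES_PER_SECOND) and fetch_total("http://localhost:9090", ACTIVE_SERIES) would then give rough figures to round and paste into an entry; the base URL is an assumption and depends on how your Prometheus is exposed.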