Prometheus is the de facto open-source monitoring standard for cloud-native environments. If you work with AWS ECS, Kubernetes, or any distributed system, you will encounter it. In this post, we tear down every component of the Prometheus architecture, visually and with real-world ECS context, so you walk away understanding how the pieces fit together.
You will learn the following:
- What is Prometheus Architecture?
- Prometheus Server
- Time-Series Database (TSDB)
- Prometheus Targets
- Prometheus Exporters
- Prometheus Service Discovery
- Prometheus Pushgateway
- Prometheus Client Libraries
- Prometheus AlertManager
- PromQL
Prometheus Architecture: ECS Flow
The diagram below shows how all Prometheus components interact in a typical AWS ECS environment. Watch the data flows between components.
1. What is Prometheus?
Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud, now a graduated CNCF project. It is written in Go and follows a pull-based metrics collection model: instead of agents pushing data to a central server, Prometheus actively scrapes (HTTP GET) the /metrics endpoint of each target at a regular interval.
Key idea: Prometheus stores everything as time-series data, a stream of timestamped values identified by a metric name and key-value labels. For example:
http_requests_total{method="GET", status="200"}
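To make the "metric name plus labels" identity concrete, here is a minimal sketch in Python. The `parse_series` helper is hypothetical (it is not part of any Prometheus library) and handles only simple label values without escaped quotes. It shows that label order does not matter, while any change in a label value yields a different series.

```python
import re

# Matches one label pair such as method="GET" (simplified: no escaped quotes).
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_series(selector: str) -> tuple[str, frozenset[tuple[str, str]]]:
    """Split 'name{k="v", ...}' into a metric name and an order-independent label set."""
    name, _, rest = selector.partition("{")
    labels = frozenset(_LABEL_RE.findall(rest))
    return name.strip(), labels


a = parse_series('http_requests_total{method="GET", status="200"}')
b = parse_series('http_requests_total{status="200", method="GET"}')
c = parse_series('http_requests_total{method="POST", status="200"}')

print(a == b)  # True: same name and same labels, so the same series
print(a == c)  # False: one label value differs, so a different series
```

This is also why high-cardinality labels (user IDs, request IDs) are costly: every distinct label combination becomes its own series.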
2. Prometheus Server
The Prometheus server is the heart of the system. It has three internal sub-components that work together:
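- Retrieval: scrapes the /metrics endpoint of each target on the configured interval.
- TSDB: stores the scraped samples on local disk.
- HTTP server: serves the web UI and the query API on port :9090. Grafana and other tools use this endpoint to run queries and render dashboards.
3. Time-Series Database (TSDB)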
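Prometheus's built-in TSDB is purpose-built for time-series workloads. Unlike a relational database, it groups samples into 2-hour blocks, each with its own chunk files, index, and metadata. Incoming samples are first recorded in a write-ahead log (WAL), and older blocks are compacted into larger ones over time to save disk space.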

    # /prometheus/data/
    .
    ├── 01H8G3...              # 2-hour block (immutable)
    │   ├── chunks/
    │   │   └── 000001
    │   ├── index
    │   └── meta.json
    ├── 01H8G4...              # another block
    ├── wal/                   # Write-Ahead Log (current)
    │   ├── 00000001
    │   └── checkpoint.000005/
    └── lock

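As a rough illustration of how samples map onto 2-hour blocks, the sketch below computes the window a timestamp falls into. It assumes windows are aligned to multiples of two hours since the Unix epoch; the function name and the sample timestamps are made up for this example, and it models only the time range, not the on-disk format.

```python
BLOCK_RANGE_MS = 2 * 60 * 60 * 1000  # two hours in milliseconds


def block_window(timestamp_ms: int, block_range_ms: int = BLOCK_RANGE_MS) -> tuple[int, int]:
    """Return the [start, end) window, aligned to the block range, that holds a sample."""
    start = (timestamp_ms // block_range_ms) * block_range_ms
    return start, start + block_range_ms


t0 = 1_700_000_000_000       # hypothetical sample timestamp (ms since epoch)
t1 = t0 + 90 * 60 * 1000     # a second sample, 90 minutes later

print(block_window(t0))
print(block_window(t1))
print(block_window(t0) == block_window(t1))  # whether both samples share one window
```

Because windows are fixed, two samples 90 minutes apart may share a block or straddle a boundary, depending only on where the first one falls.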
4. Prometheus Targets
A target is any endpoint that exposes metrics in the Prometheus text format at a /metrics HTTP path. In an ECS environment, targets can be:
- ECS Tasks: your microservices instrumented with a client library
- Node Exporter: running as a sidecar or daemon on the EC2 host
- cAdvisor: container-level CPU, memory, and network metrics
- Custom exporters: e.g., the ECS task metadata endpoint wrapped as a Prometheus target
- AWS services: via yet-another-cloudwatch-exporter (YACE)
5. Prometheus Exporters
Exporters are adapter processes that translate metrics from systems that don't natively speak the Prometheus format into something Prometheus can scrape. They run alongside the target and expose a /metrics endpoint.
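The adapter idea can be sketched with nothing but the Python standard library. Everything named here is hypothetical: `read_queue_stats` stands in for whatever system lacks a native /metrics endpoint, `myqueue_` is an invented metric prefix, and port 9200 is arbitrary. A production exporter would normally use a client library rather than hand-rendering the text format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_queue_stats() -> dict[str, float]:
    """Hypothetical stand-in for querying a system with no native /metrics endpoint."""
    return {"depth": 17, "consumers": 3}


def render_metrics(stats: dict[str, float]) -> str:
    """Translate the raw stats into the Prometheus text exposition format."""
    lines = []
    for key, value in sorted(stats.items()):
        name = f"myqueue_{key}"  # hypothetical metric prefix
        lines.append(f"# HELP {name} Value of {key} reported by the queue")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        # Stats are read at scrape time, so each scrape sees fresh values.
        body = render_metrics(read_queue_stats()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9200) -> None:
    """Start the exporter; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()


# Preview what Prometheus would receive; call serve() to expose it over HTTP.
print(render_metrics(read_queue_stats()))
```

Reading the source system inside `do_GET` rather than on a timer keeps the exporter stateless: the scrape interval configured in Prometheus decides how fresh the data is.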
6. Prometheus Service Discovery
Hard-coding scrape targets is not scalable. Service Discovery (SD) lets Prometheus automatically find and track targets as they come and go, which is critical in dynamic ECS environments where tasks start and stop frequently.

    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    scrape_configs:
      # File-based SD: ECS task IPs written by deployment script
      - job_name: 'ecs-tasks'
        file_sd_configs:
          - files: ['/etc/prometheus/targets/ecs-*.json']
            refresh_interval: 30s

      # DNS SD for ECS Service Connect / Cloud Map
      - job_name: 'ecs-service-connect'
        dns_sd_configs:
          - names: ['_metrics._tcp.my-svc.local']
            type: SRV

      # Static targets: node_exporter on EC2 hosts
      - job_name: 'node'
        static_configs:
          - targets: ['10.0.1.10:9100', '10.0.1.11:9100']

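The file-based job above expects JSON files matching ecs-*.json, each holding a list of target groups with "targets" and optional "labels". Below is a minimal sketch of the writer side, assuming the task addresses have already been looked up elsewhere. The addresses, the `service` label, and the file name are invented for illustration, and the ECS API call itself is omitted.

```python
import json
import os
import tempfile


def write_file_sd(path: str, task_addresses: list[str], labels: dict[str, str]) -> None:
    """Write one file_sd target group atomically so a partial file is never read."""
    groups = [{"targets": sorted(task_addresses), "labels": labels}]
    directory = os.path.dirname(path) or "."
    # The .tmp suffix keeps the temporary file outside the ecs-*.json glob.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(groups, handle, indent=2)
    os.chmod(tmp_path, 0o644)   # mkstemp creates the file as 0600
    os.replace(tmp_path, path)  # atomic rename on the same filesystem


# Hypothetical task addresses; a real script would obtain them from the ECS API.
out_dir = tempfile.mkdtemp()
target_file = os.path.join(out_dir, "ecs-orders.json")
write_file_sd(target_file, ["10.0.2.31:8000", "10.0.2.17:8000"], {"service": "orders"})

with open(target_file) as handle:
    print(handle.read())
```

Writing to a temporary file and renaming it avoids the window in which Prometheus could re-read a half-written targets file.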
7. Prometheus Pushgateway
The Pushgateway solves a specific problem: short-lived jobs (ECS batch tasks, cron jobs, one-off tasks) that finish before Prometheus gets a chance to scrape them. These jobs push their metrics to the Pushgateway, which Prometheus then scrapes like any other target.
When to use: ECS Fargate batch jobs, nightly ETL pipelines, data export tasks. Do NOT use it as a general-purpose proxy: it breaks stale data detection and the pull model.

    #!/bin/bash
    PUSHGATEWAY="http://pushgateway:9091"
    JOB="nightly_etl"

    cat <<EOF | curl --data-binary @- "${PUSHGATEWAY}/metrics/job/${JOB}"
    # HELP etl_duration_seconds Time taken for the ETL job
    # TYPE etl_duration_seconds gauge
    etl_duration_seconds 142
    # HELP etl_records_processed Total records processed
    # TYPE etl_records_processed counter
    etl_records_processed 48503
    EOF

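If the batch job is itself written in Python, the same push can be built with the standard library. This is a sketch rather than the recommended client (the official Python client library has its own push helpers). `build_push_request` is a hypothetical helper, and it uses PUT, which replaces every metric previously pushed under the job's grouping key, whereas the curl call above sends a POST.

```python
import urllib.request


def build_push_request(
    gateway: str, job: str, metrics: list[tuple[str, str, float]]
) -> urllib.request.Request:
    """Build (but do not send) a push of (name, type, value) metrics for one job."""
    lines = []
    for name, metric_type, value in metrics:
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    body = ("\n".join(lines) + "\n").encode("utf-8")  # body must end with a newline
    return urllib.request.Request(
        url=f"{gateway}/metrics/job/{job}",
        data=body,
        method="PUT",
    )


request = build_push_request(
    "http://pushgateway:9091",  # same address as the shell example
    "nightly_etl",
    [
        ("etl_duration_seconds", "gauge", 142),
        ("etl_records_processed", "counter", 48503),
    ],
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it at the end of the batch task:
# urllib.request.urlopen(request, timeout=5)
```

Building the request separately from sending it makes the payload easy to inspect in a unit test without a running Pushgateway.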
8. Prometheus Client Libraries
Client libraries let you instrument your application code directly, exposing custom business metrics alongside default runtime metrics.
- Go: github.com/prometheus/client_golang, the reference implementation. Best-in-class for Go microservices in ECS.
- Python: prometheus_client, great for ML model servers, Flask APIs, and Django apps running in ECS Fargate.
- Java (Spring Boot): exposes /actuator/prometheus out-of-the-box.
- Node.js: prom-client, widely used with Express.js. Automatically exposes default Node.js metrics (event loop lag, GC, heap).

A Python example using prometheus_client:

    from prometheus_client import Counter, Histogram, start_http_server
    import time

    # Counter: total requests, partitioned by method, endpoint, and status
    REQUEST_COUNT = Counter(
        'http_requests_total',
        'Total HTTP requests',
        ['method', 'endpoint', 'status']
    )

    # Histogram: request latency in seconds, with explicit bucket boundaries
    REQUEST_LATENCY = Histogram(
        'http_request_duration_seconds',
        'Request latency',
        ['endpoint'],
        buckets=[0.01, 0.05, 0.1, 0.5, 1.0]
    )

    def handle_request(method, endpoint):
        start = time.time()
        # ... process request ...
        REQUEST_COUNT.labels(method=method, endpoint=endpoint, status="200").inc()
        REQUEST_LATENCY.labels(endpoint=endpoint).observe(time.time() - start)

    if __name__ == "__main__":
        start_http_server(8000)  # exposes /metrics on :8000
        # The metrics server runs in a background thread, so keep the process alive.
        while True:
            handle_request("GET", "/orders")  # example endpoint
            time.sleep(1)

9. Prometheus AlertManager
AlertManager handles alerts sent by the Prometheus server. It is responsible for deduplication, grouping, routing, silencing, and inhibition before dispatching notifications.
Alerting rules are defined in a rules file such as rules.yml. Prometheus evaluates these every evaluation_interval. When an expression stays true for longer than the rule's for duration, the alert fires.

    groups:
      - name: ecs-alerts
        rules:
          # Alert when ECS container CPU > 90% for 5 minutes
          - alert: ECSHighCPU
            expr: rate(container_cpu_usage_seconds_total[5m]) * 100 > 90
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: "ECS task {{ $labels.container_name }} high CPU"
              description: "CPU at {{ $value }}%"

          # Alert when error rate > 5%
          - alert: HighErrorRate
            expr: |
              sum(rate(http_requests_total{status=~"5.."}[5m]))
              / sum(rate(http_requests_total[5m])) * 100 > 5
            for: 2m
            labels:
              severity: warning
            annotations:
              summary: "HTTP error rate above 5%: {{ $value | printf \"%.1f\" }}%"

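The interaction between evaluation_interval and for is easier to see as a small state machine. The sketch below is a simplified model, not Prometheus's actual implementation: the class name is invented, and it uses a 30-second for with 15-second evaluations instead of the 5m and 2m values in the rules above, to keep the timeline short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    for_seconds: float
    active_since: Optional[float] = None  # when the expression first became true

    def evaluate(self, now: float, expr_is_true: bool) -> str:
        """Advance the state by one rule evaluation and return the resulting state."""
        if not expr_is_true:
            self.active_since = None  # any false evaluation resets the timer
            return "inactive"
        if self.active_since is None:
            self.active_since = now
        if now - self.active_since >= self.for_seconds:
            return "firing"
        return "pending"


# Evaluate every 15s with for: 30s; the expression is false once, at t=30.
alert = AlertState(for_seconds=30)
timeline = [(0, True), (15, True), (30, False), (45, True), (60, True), (75, True)]
states = [alert.evaluate(t, is_true) for t, is_true in timeline]
print(states)
# ['pending', 'pending', 'inactive', 'pending', 'pending', 'firing']
```

The single false evaluation at t=30 restarts the clock, which is why a brief dip below the threshold delays the alert instead of merely pausing it.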
10. PromQL: Prometheus Query Language
PromQL is a functional query language for selecting and aggregating time-series data. Every Grafana panel backed by a Prometheus data source, and every alerting rule expression, is written in PromQL.
The Four Metric Types
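- Counter: a value that only ever increases, e.g. http_requests_total. Always use with rate(); never read raw.
- Gauge: a value that can go up or down, e.g. memory_usage_bytes, active_connections. Read the raw value directly.
- Histogram: observations counted into buckets; query quantiles with histogram_quantile().
- Summary: quantiles calculated on the client side.
Essential PromQL Cheat Sheet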
| Use Case | PromQL Expression |
|---|---|
| HTTP request rate | rate(http_requests_total[5m]) |
| 5xx error rate (%) | sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100 |
| p99 request latency | histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) |
| ECS container CPU % | rate(container_cpu_usage_seconds_total[5m]) * 100 |
| ECS container memory % | container_memory_usage_bytes / container_spec_memory_limit_bytes * 100 |
| Avg latency by service | avg by (job) (rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])) |
| Active ECS targets | count(up{job="ecs-tasks"} == 1) by (job) |
| Targets currently down | up == 0 |
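
The p99 row above relies on histogram_quantile(), which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The following is a simplified model of that idea in Python, with invented bucket counts. It skips the edge cases the real function handles and should not be read as its exact algorithm.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets ending in +Inf."""
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # cannot interpolate into the +Inf bucket
            if count == lower_count:
                return upper_bound  # empty bucket: nothing to interpolate
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


# Hypothetical cumulative counts for the `le` buckets 0.01 ... 1.0 plus +Inf.
observed = [(0.01, 10), (0.05, 40), (0.1, 70), (0.5, 95), (1.0, 99), (math.inf, 100)]

print(histogram_quantile(0.5, observed))   # falls inside the (0.05, 0.1] bucket
print(histogram_quantile(0.99, observed))  # lands on the 1.0 bucket boundary
```

The estimate can never be more precise than the bucket boundaries, so buckets should be chosen around the latencies you care about, such as an SLO threshold.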
PromQL Operators Quick Reference
- rate(metric[5m]): per-second rate of a counter over 5 minutes
- irate(metric[5m]): instantaneous rate (last 2 samples); more responsive but spiky
- increase(metric[1h]): total increase of a counter over 1 hour
- sum by (label): aggregate, grouped by a label
- avg without (instance): average, dropping the instance label
- topk(5, metric): top 5 time-series by value
- predict_linear(metric[1h], 3600): predict value in 1 hour via linear regression
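To show why counters are queried through rate() rather than read raw, here is a simplified Python model of increase, rate, and irate over a window of (timestamp, value) samples. It handles counter resets but, unlike Prometheus, does not extrapolate to the window boundaries, so real query results will differ slightly. The sample values are invented.

```python
def increase(samples: list[tuple[float, float]]) -> float:
    """Total counter increase across (timestamp, value) samples, tolerating resets."""
    total = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        # A drop means the process restarted and the counter began again from zero.
        total += current if current < previous else current - previous
    return total


def rate(samples: list[tuple[float, float]]) -> float:
    """Average per-second increase between the first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    elapsed = samples[-1][0] - samples[0][0]
    return increase(samples) / elapsed


def irate(samples: list[tuple[float, float]]) -> float:
    """Per-second increase using only the last two samples."""
    return rate(samples[-2:])


# Scraped every 15s; the counter resets between t=30 and t=45 (task restart).
window = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]

print(increase(window))  # 100.0
print(rate(window))      # 100 / 60, about 1.67 per second
print(irate(window))     # 2.0
```

Reading the raw value would show a drop from 160 to 10 at the restart, whereas the reset-aware increase keeps counting.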
Architecture Summary: How it all flows
Your services either expose /metrics themselves, or you run exporters (node_exporter, cAdvisor) alongside them.
Prometheus + Grafana + AlertManager is the golden observability stack for ECS environments. Master these three tools and you have visibility into every layer of your infrastructure, from EC2 host metrics all the way up to business-level request rates and SLOs.