Skill v1.0.0
currentAutomated scan100/100name: deploying-monitoring-stacks description: | Monitor use when deploying monitoring stacks including Prometheus, Grafana, and Datadog. Trigger with phrases like "deploy monitoring stack", "setup prometheus", "configure grafana", or "install datadog agent". Generates production-ready configurations with metric collection, visualization dashboards, and alerting rules. allowed-tools: Read, Write, Edit, Grep, Glob, Bash(docker:), Bash(kubectl:) version: 1.0.0 author: Jeremy Longshore <jeremy@intentsolutions.io> license: MIT
Monitoring Stack Deployer
This skill provides automated assistance for monitoring stack deployer tasks.
Overview
Deploys monitoring stacks (Prometheus/Grafana/Datadog) including collectors, scraping config, dashboards, and alerting rules for production systems.
Prerequisites
Before using this skill, ensure:
- Target infrastructure is identified (Kubernetes, Docker, bare metal)
- Metric endpoints are accessible from monitoring platform
- Storage backend is configured for time-series data
- Alert notification channels are defined (email, Slack, PagerDuty)
- Resource requirements are calculated based on scale
Instructions
- Select Platform: Choose Prometheus/Grafana, Datadog, or hybrid approach
- Deploy Collectors: Install exporters and agents on monitored systems
- Configure Scraping: Define metric collection endpoints and intervals
- Set Up Storage: Configure retention policies and data compaction
- Create Dashboards: Build visualization panels for key metrics
- Define Alerts: Create alerting rules with appropriate thresholds
- Test Monitoring: Verify metrics flow and alert triggering
Output
Prometheus + Grafana (Kubernetes):
# {baseDir}/monitoring/prometheus.yaml## OverviewThis skill provides automated assistance for the described functionality.## ExamplesExample usage patterns will be demonstrated in context.apiVersion: v1kind: ConfigMapmetadata:name: prometheus-configdata:prometheus.yml: |global:scrape_interval: 15sevaluation_interval: 15sscrape_configs:- job_name: 'kubernetes-pods'kubernetes_sd_configs:- role: podrelabel_configs:- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]action: keepregex: true---apiVersion: apps/v1kind: Deploymentmetadata:name: prometheusspec:replicas: 1template:spec:containers:- name: prometheusimage: prom/prometheus:latestargs:- '--config.file=/etc/prometheus/prometheus.yml'- '--storage.tsdb.retention.time=30d'ports:- containerPort: 9090
Grafana Dashboard Configuration:
{"dashboard": {"title": "Application Metrics","panels": [{"title": "CPU Usage","type": "graph","targets": [{"expr": "rate(container_cpu_usage_seconds_total[5m])"}]}]}}
Error Handling
Metrics Not Appearing
- Error: "No data points"
- Solution: Verify scrape targets are accessible and returning metrics
High Cardinality
- Error: "Too many time series"
- Solution: Reduce label combinations or increase Prometheus resources
Alert Not Firing
- Error: "Alert condition met but no notification"
- Solution: Check Alertmanager configuration and notification channels
Dashboard Load Failure
- Error: "Failed to load dashboard"
- Solution: Verify Grafana datasource configuration and permissions
Examples
- "Deploy Prometheus + Grafana on Kubernetes and add alerts for high error rate and latency."
- "Install Datadog agents across hosts and configure a dashboard for CPU/memory saturation."
Resources
- Prometheus documentation: https://prometheus.io/docs/
- Grafana documentation: https://grafana.com/docs/
- Example dashboards in {baseDir}/monitoring-examples/