Setting Up Alert Policies, Contact Points & Notification Policies in Grafana
- 1. How Grafana Alerting Works
- 2. Prerequisites & Enabling Unified Alerting
- 3. Setting Up Contact Points
- 4. Configuring Email Contact Points
- 5. Configuring Webhook Contact Points
- 6. Other Contact Point Types
- 7. Creating Alert Rules (Alert Policies)
- 8. Setting Up Notification Policies
- 9. Mute Timings & Silences
- 10. Testing & Troubleshooting
- 11. Best Practices
1. How Grafana Alerting Works
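Grafana Unified Alerting has three core building blocks that work together:
- Alert rules evaluate a query and fire when its condition is met.
- Contact points define where notifications are delivered.
- Notification policies route alerts to contact points based on labels such as severity, team, or env.
The complete flow:
Alert Rule fires with its labels attached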
↓
Notification Policy matches labels → routes to Contact Point
↓
Email / Webhook / Slack / PagerDuty receives the alert
2. Prerequisites & Enabling Unified Alerting
Make sure Grafana's unified alerting is enabled and legacy alerting is off. Edit your grafana.ini file:
# /etc/grafana/grafana.ini
[unified_alerting]
enabled = true

[alerting]
# disable the old legacy alerting engine
enabled = false
For Docker users, pass these as environment variables:
docker run -d \
  -e GF_UNIFIED_ALERTING_ENABLED=true \
  -e GF_ALERTING_ENABLED=false \
  grafana/grafana
Restart Grafana after saving. You should now see Alerting in the left sidebar with sub-menus for Alert rules, Contact points, and Notification policies.
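The Docker variables above follow Grafana's naming convention for overriding grafana.ini settings: GF_, then the section name, then the key, all upper-cased, with dots and dashes turned into underscores. A minimal sketch of that mapping, where the helper name toGrafanaEnvVar is made up for this example:

```javascript
// Sketch: derive the environment-variable name for a grafana.ini setting.
// Convention: GF_<SECTION>_<KEY>, upper-cased, "." and "-" replaced by "_".
// toGrafanaEnvVar is a hypothetical helper, not part of Grafana.
function toGrafanaEnvVar(section, key) {
  const normalize = (part) => part.toUpperCase().replace(/[.-]/g, '_');
  return `GF_${normalize(section)}_${normalize(key)}`;
}

console.log(toGrafanaEnvVar('unified_alerting', 'enabled')); // GF_UNIFIED_ALERTING_ENABLED
console.log(toGrafanaEnvVar('alerting', 'enabled'));         // GF_ALERTING_ENABLED
```

The same pattern should apply to any other setting you would otherwise put in grafana.ini.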
3. Setting Up Contact Points — Overview
A contact point is a named destination where Grafana sends alert notifications. One contact point can bundle multiple integrations — for example, one contact point that sends an email AND posts to Slack at the same time.
To create a contact point:
- In Grafana, go to Alerting → Contact points in the left sidebar.
- Click Add contact point (top right corner).
- Give it a clear name, e.g. ops-team-email or critical-webhook.
- Click Add contact point integration and choose a type from the dropdown.
- Fill in the type-specific settings (covered in sections below).
- Click Test to send a test notification — always verify before saving.
- Click Save contact point.
4. Configuring Email Contact Points
Email is the most common contact point. First, configure your SMTP server in grafana.ini:
[smtp]
enabled = true
host = smtp.gmail.com:587
user = you@gmail.com
password = your_app_password_here
from_address = grafana-alerts@yourdomain.com
from_name = Grafana Alerts
startTLS_policy = MandatoryStartTLS

# For SSL on port 465 instead:
# host = smtp.gmail.com:465
# Note: skip_verify disables TLS certificate verification; enable it only for self-signed certificates.
# skip_verify = true
Then in the Grafana UI, fill in the email integration fields:
| Field | Value / Notes |
|---|---|
| Addresses | One or more recipients separated by semicolons: ops@company.com;cto@company.com |
| Single email | Toggle ON to send one combined email instead of separate emails per recipient |
| Message | Optional custom body template. Leave blank to use Grafana's default template. |
| Subject | Optional. Default includes alert name and current status automatically. |
Custom email template (optional)
# Alerting → Contact points → email → Message field
{{ define "custom.email.subject" }}
[{{ .Status | toUpper }}] {{ .CommonLabels.alertname }} on {{ .CommonLabels.instance }}
{{ end }}
{{ define "custom.email.body" }}
Alert : {{ .CommonLabels.alertname }}
Status : {{ .Status }}
Summary : {{ .CommonAnnotations.summary }}
Firing : {{ .Alerts.Firing | len }} alert(s)
Resolved: {{ .Alerts.Resolved | len }} alert(s)
{{ end }}
5. Configuring Webhook Contact Points
Webhooks send a JSON POST request to any HTTP endpoint. Use them for custom integrations, Discord bots, Telegram bots, internal ticketing systems, or any service that accepts HTTP calls.
| Field | Value / Notes |
|---|---|
| URL | Your HTTPS endpoint: https://yourdomain.com/alerts/grafana |
| HTTP method | POST (default, recommended) |
| Username / Password | Basic auth credentials if your endpoint requires authentication |
| Authorization header | Bearer token e.g. Bearer eyJhbGci... |
| Max alerts | Cap the number of alerts per call to prevent flooding (e.g. 10) |
| Custom headers | Any extra HTTP headers your endpoint needs |
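On the receiving side, your endpoint has to check whatever credentials you enter in these fields. The following is a rough sketch of such a check, assuming the receiver gets the raw Authorization header value. The function names and the shape of the expected object are invented for this example.

```javascript
const crypto = require('crypto');

// Constant-time comparison, so the check does not reveal how many
// leading characters of the secret matched.
function safeEqual(a, b) {
  const bufA = Buffer.from(String(a));
  const bufB = Buffer.from(String(b));
  return bufA.length === bufB.length && crypto.timingSafeEqual(bufA, bufB);
}

// Returns true when the Authorization header carries the expected credentials.
// `expected` is a hypothetical config object: { token } for a Bearer token,
// or { username, password } for Basic auth.
function isAuthorized(authorizationHeader, expected) {
  if (typeof authorizationHeader !== 'string') return false;
  const [scheme, value = ''] = authorizationHeader.split(' ');
  if (scheme === 'Bearer' && expected.token) {
    return safeEqual(value, expected.token);
  }
  if (scheme === 'Basic' && expected.username) {
    // Basic auth sends base64("username:password").
    const decoded = Buffer.from(value, 'base64').toString('utf8');
    return safeEqual(decoded, `${expected.username}:${expected.password}`);
  }
  return false;
}
```

In an Express handler this could be called as isAuthorized(req.headers.authorization, config), responding with 401 when it returns false.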
Grafana webhook JSON payload
{
"receiver": "my-webhook",
"status": "firing", // "firing" or "resolved"
"orgId": 1,
"alerts": [
{
"status": "firing",
"labels": {
"alertname": "HighCPU",
"instance": "web-server-01",
"severity": "critical"
},
"annotations": {
"summary": "CPU above 90% on web-server-01",
"description": "CPU critically high for over 5 minutes"
},
"startsAt": "2026-05-23T10:15:00Z",
"generatorURL": "http://grafana:3000/alerting/...",
"fingerprint": "abc123def456"
}
],
"title": "[FIRING:1] HighCPU",
"message": "CPU above 90% on web-server-01"
}
Simple Node.js receiver example
const express = require('express');

const app = express();
app.use(express.json()); // parse JSON request bodies

app.post('/alerts/grafana', (req, res) => {
  const { status, alerts } = req.body;
  // Log one entry per alert in the notification
  alerts.forEach(alert => {
    console.log(`[${status.toUpperCase()}] ${alert.labels.alertname}`);
    console.log(` Instance : ${alert.labels.instance}`);
    console.log(` Summary : ${alert.annotations.summary}`);
  });
  // Acknowledge receipt so Grafana treats the delivery as successful
  res.status(200).send('OK');
});

app.listen(3001, () => console.log('Webhook receiver running on :3001'));
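The receiver above assumes every field is present. As a slightly more defensive variation, here is a sketch of a pure helper that turns the payload into one line per alert and falls back to placeholders when a field is missing. The name summarizePayload and the placeholder strings are choices made for this example.

```javascript
// Turn a Grafana webhook payload into one summary line per alert.
// Missing fields fall back to placeholders instead of throwing.
function summarizePayload(payload) {
  const alerts = Array.isArray(payload && payload.alerts) ? payload.alerts : [];
  return alerts.map((alert) => {
    const labels = alert.labels || {};
    const annotations = alert.annotations || {};
    const status = String(alert.status || payload.status || 'unknown').toUpperCase();
    const name = labels.alertname || '(no alertname)';
    const instance = labels.instance || '(no instance)';
    const summary = annotations.summary || '(no summary)';
    return `[${status}] ${name} on ${instance}: ${summary}`;
  });
}

// Trimmed-down version of the sample payload shown earlier.
const samplePayload = {
  status: 'firing',
  alerts: [
    {
      status: 'firing',
      labels: { alertname: 'HighCPU', instance: 'web-server-01', severity: 'critical' },
      annotations: { summary: 'CPU above 90% on web-server-01' },
    },
  ],
};

console.log(summarizePayload(samplePayload).join('\n'));
```

Keeping the formatting in a pure function like this also makes it easy to unit test without starting the HTTP server.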
6. Other Contact Point Types
| Type | Key Field(s) | Best for |
|---|---|---|
| Slack | Incoming Webhook URL + channel name | Team chat notifications |
| Microsoft Teams | Incoming Webhook URL from Teams connector | Corporate Teams workspaces |
| PagerDuty | Integration key from PagerDuty service settings | On-call escalation & scheduling |
| OpsGenie | API key + region | Incident management |
| Telegram | Bot token + chat ID | Mobile push via Telegram bot |
| Discord | Webhook URL from Discord server settings | Dev/gaming team channels |
| Google Chat | Space webhook URL | Google Workspace teams |
| VictorOps | API key + routing key | On-call scheduling |
Tip: a single contact point can hold two integrations, for example one named critical-ops. When a critical alert fires, both channels receive the notification simultaneously from a single routing rule.
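To picture how one contact point fans out to several channels, here is a small model in plain JavaScript. It only illustrates the idea and is not how Grafana implements delivery. The email and Slack integrations and their send functions are stand-ins invented for this sketch.

```javascript
// A contact point modeled as a name plus a list of integrations.
// Each integration's `send` is a hypothetical stand-in for a real delivery call.
function notifyContactPoint(contactPoint, message) {
  return contactPoint.integrations.map((integration) => {
    try {
      integration.send(message);
      return { type: integration.type, ok: true };
    } catch (err) {
      // In this sketch, one failing channel does not stop the others.
      return { type: integration.type, ok: false, error: err.message };
    }
  });
}

const delivered = [];
const criticalOps = {
  name: 'critical-ops',
  integrations: [
    { type: 'email', send: (msg) => delivered.push(`email: ${msg}`) },
    { type: 'slack', send: (msg) => delivered.push(`slack: ${msg}`) },
  ],
};

const results = notifyContactPoint(criticalOps, '[FIRING:1] HighCPU');
console.log(results);
console.log(delivered);
```

A notification policy only needs to reference critical-ops once, and every integration in it gets the message.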
7. Creating Alert Rules (Alert Policies)
Go to Alerting → Alert rules → New alert rule. The editor walks through five steps.
Step 1: Define the query and condition
Choose your data source and write the query. Common examples:
Server / instance down (Prometheus)
up{job="node"} == 0
High CPU usage above 90%
100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
Memory usage above 85%
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 85
Disk space below 10% free
(node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
HTTP endpoint down (Blackbox Exporter)
probe_success{job="blackbox"} == 0
Network packet drops
rate(node_network_receive_drop_total[5m]) > 100
Error log spike (Loki)
count_over_time({app="my-app"} |= "ERROR" [5m]) > 10
Step 2: Set evaluation behaviour
| Setting | Recommended value | Purpose |
|---|---|---|
| Evaluation group | infrastructure | Group related rules for shared evaluation interval |
| Evaluate every | 1m | How often Grafana checks the condition |
| For (pending period) | 3m – 5m | Condition must hold this long before the alert fires — prevents flapping on short spikes |
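To see how the evaluation interval and the pending period interact, it can help to simulate them. The sketch below is a simplified model and not Grafana's actual scheduler. It assumes an alert becomes Pending the first time the condition is true, and Firing once the condition has held for the full pending period.

```javascript
// Simplified model of alert state transitions for a single rule.
// conditionSeries[i] says whether the condition was true at evaluation i.
function simulateStates(conditionSeries, evalIntervalSec, forSec) {
  let pendingSince = null; // time at which the condition first became true
  return conditionSeries.map((conditionMet, i) => {
    const now = i * evalIntervalSec;
    if (!conditionMet) {
      pendingSince = null; // condition cleared, so the pending timer resets
      return 'Normal';
    }
    if (pendingSince === null) pendingSince = now;
    return now - pendingSince >= forSec ? 'Firing' : 'Pending';
  });
}

// Evaluate every 1m, For 3m: a sustained problem versus a short spike.
console.log(simulateStates([false, true, true, true, true, false], 60, 180));
console.log(simulateStates([true, true, false, false], 60, 180));
```

In this model the two-minute spike never leaves Pending, which is exactly the flapping protection the pending period is meant to give.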
Step 3: Add labels and annotations
# Labels: used by notification policies for routing
severity = critical
team = backend
env = production

# Annotations: shown inside the notification message
summary = "CPU above 90% on {{ $labels.instance }}"
description = "CPU has been critically high for 5+ minutes. Check process list."
runbook_url = "https://wiki.company.com/runbooks/high-cpu"
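The {{ $labels.instance }} placeholder in the summary is filled in from the alert's labels when the alert fires. Grafana does this with Go templates. The snippet below is only a tiny JavaScript stand-in that handles the plain $labels.name form, to show what the expansion produces. Returning an empty string for a missing label is a choice made for this sketch.

```javascript
// Minimal stand-in for label expansion in annotations.
// Supports only the simple {{ $labels.<name> }} form, nothing else
// from the Go template language.
function expandLabels(template, labels) {
  return template.replace(
    /\{\{\s*\$labels\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (match, name) =>
      Object.prototype.hasOwnProperty.call(labels, name) ? String(labels[name]) : ''
  );
}

const labels = { severity: 'critical', team: 'backend', env: 'production', instance: 'web-server-01' };
console.log(expandLabels('CPU above 90% on {{ $labels.instance }}', labels));
// CPU above 90% on web-server-01
```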
Step 4: Set alert state for missing data
| Scenario | Recommended setting |
|---|---|
| Query returns no data | Set to Alerting if no data means the service is likely down |
| Query execution error | Set to Alerting to catch scrape failures |
| Metric gaps are normal | Use Keep last state to avoid false positives during gaps |
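The choices in this table boil down to a small decision function. The following sketch models it under simplifying assumptions: it ignores the pending period, and the setting names 'Alerting' and 'KeepLast' plus the evaluation shape are labels made up for this example rather than Grafana's internal values.

```javascript
// Simplified model of how missing data and query errors map to a state.
// evaluation.kind is 'ok', 'nodata', or 'error' (names are this sketch's own).
// settings.noData / settings.execError are 'Alerting', 'KeepLast', or unset.
function resolveState(evaluation, settings, lastState) {
  const apply = (setting, fallback) => {
    if (setting === 'Alerting') return 'Firing';
    if (setting === 'KeepLast') return lastState;
    return fallback;
  };
  switch (evaluation.kind) {
    case 'nodata':
      return apply(settings.noData, 'No Data');
    case 'error':
      return apply(settings.execError, 'Error');
    default:
      return evaluation.conditionMet ? 'Firing' : 'Normal';
  }
}

// A service whose metrics vanish is treated as down:
console.log(resolveState({ kind: 'nodata' }, { noData: 'Alerting' }, 'Normal'));
// A metric with normal gaps keeps whatever state it had:
console.log(resolveState({ kind: 'nodata' }, { noData: 'KeepLast' }, 'Normal'));
```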
Step 5: Save the rule
Click Save rule and exit. The rule now appears in the alert list with a live status indicator: Normal, Pending, Firing, or No Data.
8. Setting Up Notification Policies
Notification policies are Grafana's routing engine. They inspect an alert's labels and decide which contact point receives it. Think of it as a nested decision tree of "if label matches, send to this contact point" rules.
Go to Alerting → Notification policies.
The root (default) policy
Every Grafana instance has one root policy that acts as a catch-all. Alerts that do not match any nested policy fall through to the root. Always configure the root policy first.
# Recommended root policy settings
Contact point   : ops-team-email
Group by        : alertname, instance
Group wait      : 30s   # delay before sending first notification in a group
Group interval  : 5m    # delay before notifying about new alerts in same group
Repeat interval : 4h    # re-notify if alert is still firing after this time
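The three timing settings are easier to reason about with a timeline. The sketch below models a single alert group under simplifying assumptions: alerts keep firing once they arrive, the group is first flushed after Group wait and then once per Group interval, and a flush sends a notification only if there are new alerts or the Repeat interval has passed. It approximates Alertmanager-style grouping and is not Grafana's actual code.

```javascript
// Simplified timing model for ONE alert group.
// arrivals: seconds at which alerts join the group, sorted ascending.
// opts: { groupWait, groupInterval, repeatInterval } in seconds.
function notificationTimes(arrivals, opts, horizonSec) {
  const { groupWait, groupInterval, repeatInterval } = opts;
  const times = [];
  if (arrivals.length === 0) return times;
  let lastNotified = null; // time of the most recent notification
  let seen = 0;            // number of alerts included in that notification
  // First flush happens after groupWait, then once every groupInterval.
  for (let t = arrivals[0] + groupWait; t <= horizonSec; t += groupInterval) {
    const present = arrivals.filter((a) => a <= t).length;
    const hasNew = present > seen;
    const repeatDue = lastNotified !== null && t - lastNotified >= repeatInterval;
    if (lastNotified === null || hasNew || repeatDue) {
      const reason = lastNotified === null ? 'initial' : hasNew ? 'new alerts' : 'repeat';
      times.push({ at: t, reason });
      lastNotified = t;
      seen = present;
    }
  }
  return times;
}

// Root policy values from above: 30s / 5m / 4h, expressed in seconds.
const rootTiming = { groupWait: 30, groupInterval: 300, repeatInterval: 14400 };
console.log(notificationTimes([0, 100], rootTiming, 15000));
```

With these values, a second alert arriving 100 seconds after the first is held until the next group flush rather than sent immediately, which is how grouping cuts down on notification noise.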
Adding nested (specific) policies
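Click Add nested policy under the root. Each policy uses label matchers to select which alerts it handles. Sibling policies are evaluated from top to bottom, and an alert stops at the first policy that matches unless you enable Continue matching subsequent sibling nodes on that policy, so order the most specific policies first: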
| Alert scenario | Label matcher | Route to contact point |
|---|---|---|
| Critical alerts → on-call | severity = critical | pagerduty-oncall |
| Warning alerts → Slack | severity = warning | slack-warnings |
| Production alerts → CTO email | env = production | cto-email |
| Backend team alerts | team = backend | backend-slack |
| Database alerts | alertname =~ ".*DB.*" | dbteam-webhook |
| Network alerts → NOC | alertname =~ ".*Network.*" | noc-email |
Label matcher operators
| Operator | Meaning | Example |
|---|---|---|
| = | Exact match | severity = critical |
| != | Not equal | env != staging |
| =~ | Regex match | alertname =~ ".*CPU.*" |
| !~ | Regex not match | team !~ "frontend\|design" |
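The four operators and the routing table can be modeled in a few lines. This is an illustrative sketch and not Grafana's implementation. It assumes regex matchers are anchored to the whole label value (which is why the examples use ".*CPU.*" rather than "CPU"), that the first matching nested policy wins, and that JavaScript's RegExp is a close enough stand-in for the regex syntax Grafana uses. The policy objects are a made-up data shape.

```javascript
// Evaluate one label matcher against an alert's labels.
// A missing label is treated as an empty string.
function matches(matcher, labels) {
  const actual = labels[matcher.label] ?? '';
  switch (matcher.op) {
    case '=':
      return actual === matcher.value;
    case '!=':
      return actual !== matcher.value;
    case '=~':
      return new RegExp(`^(?:${matcher.value})$`).test(actual);
    case '!~':
      return !new RegExp(`^(?:${matcher.value})$`).test(actual);
    default:
      throw new Error(`Unknown operator: ${matcher.op}`);
  }
}

// Return the contact point of the first nested policy whose matchers all
// match; fall back to the root policy's contact point otherwise.
function route(labels, nestedPolicies, rootContactPoint) {
  const hit = nestedPolicies.find((p) => p.matchers.every((m) => matches(m, labels)));
  return hit ? hit.contactPoint : rootContactPoint;
}

const nestedPolicies = [
  { matchers: [{ label: 'severity', op: '=', value: 'critical' }], contactPoint: 'pagerduty-oncall' },
  { matchers: [{ label: 'severity', op: '=', value: 'warning' }], contactPoint: 'slack-warnings' },
  { matchers: [{ label: 'alertname', op: '=~', value: '.*DB.*' }], contactPoint: 'dbteam-webhook' },
];

console.log(route({ alertname: 'HighCPU', severity: 'critical' }, nestedPolicies, 'ops-team-email'));
console.log(route({ alertname: 'PostgresDBDown', severity: 'info' }, nestedPolicies, 'ops-team-email'));
console.log(route({ alertname: 'DiskFull' }, nestedPolicies, 'ops-team-email'));
```

Running a handful of representative label sets through a model like this is a cheap way to sanity-check a routing design before building it in the UI.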
9. Mute Timings & Silences
Mute timings — recurring suppression
Mute timings suppress notifications during planned recurring windows — weekly maintenance, overnight quiet hours, or weekends. Go to Alerting → Mute timings → Add mute timing.
# Suppress alerts every Sunday 2am–4am (weekly maintenance window)
Name : weekly-maintenance
Time ranges : 02:00 – 04:00
Days of week : Sunday
Months : (leave blank = applies every month)
Attach a mute timing to any notification policy by editing the policy and selecting it from the mute timing dropdown. The alert rule still evaluates; only the notification delivery is suppressed.
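The weekly-maintenance window above can be expressed as a simple check. The sketch below assumes times are interpreted in UTC and that the end of a time range is exclusive. Both are assumptions you should verify against your own Grafana time zone settings. The object shape and function names are invented for this example.

```javascript
// The weekly-maintenance mute timing as plain data.
const weeklyMaintenance = {
  name: 'weekly-maintenance',
  daysOfWeek: [0], // 0 = Sunday, matching Date.prototype.getUTCDay()
  timeRanges: [{ start: '02:00', end: '04:00' }],
};

// Convert "HH:MM" to minutes since midnight.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(':').map(Number);
  return h * 60 + m;
}

// True when `date` falls inside the mute timing (UTC, end exclusive).
function isMuted(date, timing) {
  if (!timing.daysOfWeek.includes(date.getUTCDay())) return false;
  const minutes = date.getUTCHours() * 60 + date.getUTCMinutes();
  return timing.timeRanges.some(
    (r) => minutes >= toMinutes(r.start) && minutes < toMinutes(r.end)
  );
}

// 2026-05-24 is a Sunday.
console.log(isMuted(new Date('2026-05-24T03:00:00Z'), weeklyMaintenance)); // true
console.log(isMuted(new Date('2026-05-24T05:00:00Z'), weeklyMaintenance)); // false
```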
Silences — immediate one-off suppression
Silences are instant suppressions for unplanned situations — a live deployment, a known flapping metric, or an incident you are already investigating. Go to Alerting → Silences → Add silence.
Label matchers : instance = web-server-01
Start          : 2026-05-23 14:00
End            : 2026-05-23 16:00
Comment        : Silencing during blue/green deploy on web-server-01
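A silence is effectively a set of label matchers plus a time window. As a rough sketch, the check looks like this, assuming exact-match matchers only and treating the times above as UTC (the original does not state a time zone). The names deploySilence and isSilenced are made up for this example.

```javascript
// The silence above as plain data (times assumed to be UTC).
const deploySilence = {
  matchers: { instance: 'web-server-01' },
  start: new Date('2026-05-23T14:00:00Z'),
  end: new Date('2026-05-23T16:00:00Z'),
  comment: 'Silencing during blue/green deploy on web-server-01',
};

// True when the alert's labels match every matcher and `now` is
// inside the silence window.
function isSilenced(labels, silence, now) {
  const active = now >= silence.start && now < silence.end;
  const labelsMatch = Object.entries(silence.matchers).every(
    ([name, value]) => labels[name] === value
  );
  return active && labelsMatch;
}

const during = new Date('2026-05-23T15:00:00Z');
console.log(isSilenced({ alertname: 'HighCPU', instance: 'web-server-01' }, deploySilence, during)); // true
console.log(isSilenced({ alertname: 'HighCPU', instance: 'web-server-02' }, deploySilence, during)); // false
```

Alerts from other instances are unaffected, which is why narrow matchers are usually safer than silencing by alertname alone.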
10. Testing & Troubleshooting
Test contact points
Every contact point in Alerting → Contact points has a Test button (paper plane icon). Use it to send a live test notification immediately — always do this after creating or changing a contact point.
Test notification policy routing
In Alerting → Notification policies, click Test routing and enter a set of labels. Grafana shows exactly which policy would match and which contact point would receive the alert — without actually sending anything.
Common issues and fixes
| Problem | Likely cause | Fix |
|---|---|---|
| Email not received | SMTP misconfigured or wrong app password | Check grafana.ini SMTP block; use Test button; check Grafana server logs |
| Webhook returns 401 | Missing or incorrect auth header | Add correct Bearer token or Basic auth credentials in contact point settings |
| Alert stuck in Pending | The For duration has not elapsed yet | This is normal — wait for the pending period to pass, or reduce it temporarily for testing |
| Alert fires but no notification sent | No matching notification policy | Use Test routing to verify labels match; confirm root policy has a contact point assigned |
| Alert shows "No data" | Query returning empty results | Set No data behaviour in the rule to Alerting or Keep last state |
| Too many repeated alerts | Repeat interval too short | Increase Repeat interval in the notification policy (e.g. 4h or 8h instead of 1h) |
| Webhook not reachable | Grafana server cannot reach the URL | Confirm the URL is accessible from the Grafana server; check firewall rules and TLS certificate |
11. Best Practices
- Use consistent labels such as severity, team, env, and service across all rules. Consistent labels make notification policies simple and predictable.
- Add a runbook_url annotation to every alert. When a page fires at 2am, the on-call engineer knows what to do immediately without searching through docs.
Final thoughts
A solid Grafana alerting setup needs all three pieces working together — well-crafted alert rules that fire on meaningful signals, contact points that reliably reach the right people, and notification policies that route intelligently based on labels. Start simple: one email contact point, one root policy, and your five most critical alert rules. Then layer in Slack, PagerDuty, webhook integrations, and fine-grained routing as your team's needs grow.