March 23

Qrator Metrics (Qrator Exporter)

I had to set up metric collection from Qrator once again. My previous note on the subject lived as an Issue in the StupidScience/qrator-exporter repository (that project uses deprecated methods), but the author decided to delete it, so I am writing it down here to make sure it does not get lost.

Source: qratorlabs.medium.com

The data will be collected by Telegraf, which will also expose it as metrics in the Prometheus format.

First, you need an API token to fetch data from Qrator: go to the keys section of your personal account and issue a token.

Next, open the list of domains and note their identifiers, since the API methods are addressed by domain ID. In the examples below, 11111 and 11222 stand in for those identifiers.

Now describe the Telegraf configuration:

[[inputs.http]]
	name_prefix = "qrator_blocks_"
	method = "POST"
	urls = [
		"https://api.qrator.net/request/domain/11111",
		"https://api.qrator.net/request/domain/11222",
	]
	headers = {"X-Qrator-Auth" = "${QRATOR_API_KEY}", "Content-Type" = "application/json"}
	body = '{"method":"statistics_current_blocks"}'
	data_format = "json"
	timeout = "30s"

[[inputs.http]]
	name_prefix = "qrator_http_"
	method = "POST"
	urls = [
		"https://api.qrator.net/request/domain/11111",
		"https://api.qrator.net/request/domain/11222",
	]
	headers = {"X-Qrator-Auth" = "${QRATOR_API_KEY}", "Content-Type" = "application/json"}
	body = '{"method":"statistics_current_http"}'
	data_format = "json"
	timeout = "30s"

[[inputs.http]]
	name_prefix = "qrator_ip_"
	method = "POST"
	urls = [
		"https://api.qrator.net/request/domain/11111",
		"https://api.qrator.net/request/domain/11222",
	]
	headers = {"X-Qrator-Auth" = "${QRATOR_API_KEY}", "Content-Type" = "application/json"}
	body = '{"method":"statistics_current_ip"}'
	data_format = "json"
	timeout = "30s"

[[inputs.http]]
	name_prefix = "qrator_locations_"
	method = "POST"
	urls = [
		"https://api.qrator.net/request/domain/11111",
		"https://api.qrator.net/request/domain/11222",
	]
	headers = {"X-Qrator-Auth" = "${QRATOR_API_KEY}", "Content-Type" = "application/json"}
	body = '{"method":"statistics_current_locations"}'
	data_format = "json"
	timeout = "30s"

[[outputs.prometheus_client]]
	listen = ":9273"

The urls field takes an array of resource URLs (each one includes a domain identifier), the body field carries the API method, and the API key is passed through the QRATOR_API_KEY environment variable. That variable has to be passed to Telegraf separately so that the key is not stored directly in the configuration.
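
To check a token and a domain ID by hand before wiring up Telegraf, the same request can be reproduced outside of it. The sketch below is only an illustration that mirrors one [[inputs.http]] block: the URL, headers, and body are taken from the configuration above, while the helper name build_qrator_request is hypothetical. It builds the request without sending it and makes no assumption about the shape of the response.

```python
import json
import os
import urllib.request

# Same base URL as in the urls list of telegraf.conf.
API_BASE = "https://api.qrator.net/request/domain"


def build_qrator_request(domain_id, method, api_key):
    """Build the POST request that one [[inputs.http]] block sends (hypothetical helper)."""
    body = json.dumps({"method": method}, separators=(",", ":")).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{domain_id}",
        data=body,
        method="POST",
        headers={
            "X-Qrator-Auth": api_key,
            "Content-Type": "application/json",
        },
    )


# The key is read from the environment, like ${QRATOR_API_KEY} in the config;
# "CHANGE_ME" is only a placeholder so the sketch runs without a real key.
request = build_qrator_request(
    11111,
    "statistics_current_http",
    os.environ.get("QRATOR_API_KEY", "CHANGE_ME"),
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen(request, timeout=30) would actually send it, with the same 30 second timeout as in the Telegraf configuration.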

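All that is left is to run it. A minimal Deployment for kustomize might look like this (the selector and pod labels are omitted here because kustomize fills them in from commonLabels in kustomization.yaml):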

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qrator-exporter
spec:
  template:
    spec:
      containers:
        - name: telegraf
          image: telegraf:1.21.4
          ports:
            - name: metrics
              containerPort: 9273
          env:
            - name: QRATOR_API_KEY
              value: CHANGE_ME
          securityContext:
            runAsUser: 1001
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: true
            runAsNonRoot: true
          volumeMounts:
            - name: config
              mountPath: "/etc/telegraf"
              readOnly: true
            - name: cache
              mountPath: "/.cache"
      volumes:
        - name: config
          secret:
            secretName: qrator-exporter
        - name: cache
          emptyDir: {}

The qrator-exporter secret itself is described in kustomization.yaml, for example:

secretGenerator:
  - name: qrator-exporter
    files:
      - config/telegraf.conf

Do not forget to describe the Service and the ServiceMonitor:

---
apiVersion: v1
kind: Service
metadata:
  name: qrator-exporter
spec:
  type: ClusterIP
  ports:
    - name: metrics
      port: 9273
      targetPort: 9273
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: qrator-exporter
spec:
  endpoints:
    - interval: 30s
      path: /metrics
      port: metrics
  selector:
    matchLabels:
      app.kubernetes.io/name: qrator-exporter
      app.kubernetes.io/component: service
      app.kubernetes.io/part-of: monitoring
  namespaceSelector:
    any: true

The selector matches the labels set in kustomization.yaml:

---
commonLabels:
  app.kubernetes.io/name: qrator-exporter
  app.kubernetes.io/component: service
  app.kubernetes.io/part-of: monitoring

After this the metrics start coming in. However, the url label on them contains an opaque Qrator resource address, so we add relabeling:

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: qrator-exporter
spec:
  endpoints:
    - interval: 30s
      path: /metrics
      port: metrics
      metricRelabelings:
        - sourceLabels: ["url"]
          regex: https://api.qrator.net/request/domain/(.+)
          replacement: $1
          targetLabel: domain_id
          action: replace
        - sourceLabels: ["url"]
          regex: https://api.qrator.net/request/domain/11111
          replacement: domain.ru
          targetLabel: domain_name
          action: replace
        - sourceLabels: ["url"]
          regex: https://api.qrator.net/request/domain/11222
          replacement: super-domain.ru
          targetLabel: domain_name
          action: replace
  selector:
    matchLabels:
      app.kubernetes.io/name: qrator-exporter
      app.kubernetes.io/component: service
      app.kubernetes.io/part-of: monitoring
  namespaceSelector:
    any: true

Now domain_name holds a readable value that can be used in Grafana selectors or in alerts.
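
How the three metricRelabelings rules interact is easy to misread, so here is a small Python model of the replace action. It is a sketch under the documented Prometheus semantics (the regex is anchored on both ends, and a rule that does not match leaves the labels untouched); the functions relabel_replace and apply_rules are hypothetical and not part of any library.

```python
import re

# (source label, regex, replacement, target label), copied from the ServiceMonitor.
RULES = [
    ("url", r"https://api.qrator.net/request/domain/(.+)", "$1", "domain_id"),
    ("url", r"https://api.qrator.net/request/domain/11111", "domain.ru", "domain_name"),
    ("url", r"https://api.qrator.net/request/domain/11222", "super-domain.ru", "domain_name"),
]


def relabel_replace(labels, source_label, regex, replacement, target_label):
    """Model of the Prometheus replace action for a single source label."""
    value = labels.get(source_label, "")
    match = re.fullmatch(regex, value)  # the whole value must match
    if match is None:
        return labels  # no match: the label set is left unchanged
    # Translate Prometheus-style $1 references into Python's \g<1> form.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return {**labels, target_label: match.expand(template)}


def apply_rules(labels):
    """Apply all rules in order, as Prometheus does for each scraped sample."""
    for source_label, regex, replacement, target_label in RULES:
        labels = relabel_replace(labels, source_label, regex, replacement, target_label)
    return labels


print(apply_rules({"url": "https://api.qrator.net/request/domain/11222"}))
```

One consequence worth noting: a domain whose ID is not listed in the rules still gets domain_id but no domain_name, so a dashboard filtered by domain_name will silently leave it out.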

Example alerts for Prometheus Operator:

---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: qrator-exporter
  labels:
    app: prometheus-operator
    release: "monitoring"
spec:
  groups:
    - name: QratorExporter
      rules:
        - alert: QratorHighBandwidthInput
          expr: qrator_ip_http_result_bandwidth_input > 5000000
          for: 5m
          labels:
            severity: warning
            domain: "{{ $labels.domain_name }}"
          annotations:
            summary: High inbound traffic on {{ $labels.domain_name }}
            description: Qrator reports elevated inbound traffic on domain {{ $labels.domain_name }}, more than 5 Mbit/s
        - alert: QratorHighBandwidthOutput
          expr: qrator_ip_http_result_bandwidth_output > 5000000
          for: 5m
          labels:
            severity: warning
            domain: "{{ $labels.domain_name }}"
          annotations:
            summary: High outbound traffic on {{ $labels.domain_name }}
            description: Qrator reports elevated outbound traffic on domain {{ $labels.domain_name }}, more than 5 Mbit/s
        - alert: QratorHigh5xxRate
          expr: qrator_http_http_result_errors_total >= 0.1
          for: 5m
          labels:
            severity: critical
            domain: "{{ $labels.domain_name }}"
          annotations:
            summary: Qrator reports a growing number of errors on {{ $labels.domain_name }}
            description: Qrator has been reporting a growing number of 5xx errors on domain {{ $labels.domain_name }} for 5 minutes
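
The metric names used in these rules are not declared anywhere; Telegraf assembles them. Assuming its default behaviour, the measurement name is name_prefix plus the plugin name (http), the JSON parser flattens nested keys with an underscore and keeps only numeric values, and the Prometheus output joins measurement and field with another underscore. The sketch below models that naming. The functions are hypothetical, and the sample response is invented purely for illustration; it is not real Qrator output.

```python
def flatten_numeric(value, prefix=""):
    """Flatten nested dicts into {"a_b_c": number}, dropping non-numeric leaves."""
    fields = {}
    if isinstance(value, dict):
        for key, child in value.items():
            child_prefix = f"{prefix}_{key}" if prefix else key
            fields.update(flatten_numeric(child, child_prefix))
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        fields[prefix] = float(value)
    # Strings, booleans, and nulls are skipped, as the JSON parser does by default.
    return fields


def prometheus_names(name_prefix, response, plugin_name="http"):
    """Compose metric names as <name_prefix><plugin>_<flattened field>."""
    measurement = f"{name_prefix}{plugin_name}"
    return {
        f"{measurement}_{field}": value
        for field, value in flatten_numeric(response).items()
    }


# Invented response shape, used only to show how names are composed.
sample = {
    "result": {"bandwidth": {"input": 1200.0, "output": 5400.0}},
    "error": None,
    "id": 1,
}

for name, value in prometheus_names("qrator_ip_", sample).items():
    print(name, value)
```

This is why every metric carries a doubled segment such as qrator_http_http_: the first part is the name_prefix from the configuration, the second is the input plugin name.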

Before adding alerts, the standard recommendation applies: collect metrics for a while to work out the threshold values that suit you. This is most convenient to watch in Grafana, so you can take this dashboard as a starting point:

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "fiscalYearStartMonth": 0,
  "gnetId": null,
  "graphTooltip": 1,
  "id": 106,
  "iteration": 1647973063127,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "datasource": null,
      "description": "Alerts:\n\n* QratorHighBandwidthInput\n",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "bits"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 0,
        "y": 0
      },
      "id": 2,
      "options": {
        "legend": {
          "calcs": [
            "max"
          ],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "multi"
        }
      },
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(qrator_ip_http_result_bandwidth_input{domain_name=\"$domain\"})",
          "interval": "",
          "legendFormat": "input",
          "refId": "A"
        },
        {
          "exemplar": true,
          "expr": "sum(qrator_ip_http_result_bandwidth_output{domain_name=\"$domain\"})",
          "hide": false,
          "interval": "",
          "legendFormat": "output",
          "refId": "B"
        }
      ],
      "title": "Traffic",
      "type": "timeseries"
    },
    {
      "datasource": null,
      "description": "Alerts:\n\n* QratorHigh5xxRate",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "decimals": 2,
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "reqps"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 12,
        "y": 0
      },
      "id": 4,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "multi"
        }
      },
      "targets": [
        {
          "exemplar": true,
          "expr": "sum({__name__=~\"qrator_http_http_result_errors_.+\", domain_name=\"$domain\"})by(__name__)",
          "interval": "",
          "legendFormat": "{{ __name__ }}",
          "refId": "A"
        }
      ],
      "title": "Errors",
      "transformations": [
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_errors_(.*)",
            "renamePattern": "$1"
          }
        }
      ],
      "type": "timeseries"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "reqps"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 0,
        "y": 9
      },
      "id": 7,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(qrator_http_http_result_requests{domain_name=\"$domain\"})",
          "interval": "",
          "legendFormat": "total",
          "refId": "A"
        }
      ],
      "title": "Requests",
      "transformations": [
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0000_0(.*)",
            "renamePattern": "Less $1 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0(.*)_0(.*)",
            "renamePattern": "$1 - $2 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0(.*)_(.*)",
            "renamePattern": "$1 - $2 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_1000_1500",
            "renamePattern": "1 - 1.5 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_1500_2000",
            "renamePattern": "1.5 - 2 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_2000_5000",
            "renamePattern": "2 - 5 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_5000_inf",
            "renamePattern": "More 5 s"
          }
        }
      ],
      "type": "timeseries"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "reqps"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 12,
        "y": 9
      },
      "id": 10,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "targets": [
        {
          "exemplar": true,
          "expr": "sum({__name__=~\"qrator_http_http_result_responses_.+\", domain_name=\"$domain\"})by(__name__)",
          "interval": "",
          "legendFormat": "{{ __name__ }}",
          "refId": "A"
        }
      ],
      "title": "Requests by response time",
      "transformations": [
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0000_0(.*)",
            "renamePattern": "Less $1 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0(.*)_0(.*)",
            "renamePattern": "$1 - $2 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0(.*)_(.*)",
            "renamePattern": "$1 - $2 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_1000_1500",
            "renamePattern": "1 - 1.5 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_1500_2000",
            "renamePattern": "1.5 - 2 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_2000_5000",
            "renamePattern": "2 - 5 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_5000_inf",
            "renamePattern": "More 5 s"
          }
        }
      ],
      "type": "timeseries"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "pps"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 0,
        "y": 18
      },
      "id": 11,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "multi"
        }
      },
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(qrator_ip_http_result_packets_input{domain_name=\"$domain\"})",
          "interval": "",
          "legendFormat": "input",
          "refId": "A"
        },
        {
          "exemplar": true,
          "expr": "sum(qrator_ip_http_result_packets_output{domain_name=\"$domain\"})",
          "hide": false,
          "interval": "",
          "legendFormat": "output",
          "refId": "B"
        }
      ],
      "title": "Packets",
      "transformations": [
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0000_0(.*)",
            "renamePattern": "Less $1 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0(.*)_0(.*)",
            "renamePattern": "$1 - $2 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_0(.*)_(.*)",
            "renamePattern": "$1 - $2 ms"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_1000_1500",
            "renamePattern": "1 - 1.5 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_1500_2000",
            "renamePattern": "1.5 - 2 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_2000_5000",
            "renamePattern": "2 - 5 s"
          }
        },
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_http_http_result_responses_5000_inf",
            "renamePattern": "More 5 s"
          }
        }
      ],
      "type": "timeseries"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 12,
        "y": 18
      },
      "id": 5,
      "options": {
        "legend": {
          "calcs": [
            "max",
            "last"
          ],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "targets": [
        {
          "exemplar": true,
          "expr": "sum({__name__=~\"qrator_locations_http_result_locations_.+\", domain_name=\"$domain\"}>0)by(__name__)",
          "interval": "",
          "legendFormat": "{{ __name__ }}",
          "refId": "A"
        }
      ],
      "title": "Black list",
      "transformations": [
        {
          "id": "renameByRegex",
          "options": {
            "regex": "qrator_locations_http_result_locations_(.*)",
            "renamePattern": "$1"
          }
        }
      ],
      "type": "timeseries"
    }
  ],
  "schemaVersion": 32,
  "style": "dark",
  "tags": [
    "WIP"
  ],
  "templating": {
    "list": [
      {
        "allValue": null,
        "current": {
          "selected": false,
          "text": "qlean.ru",
          "value": "qlean.ru"
        },
        "datasource": null,
        "definition": "label_values(qrator_http_http_id, domain_name)",
        "description": null,
        "error": null,
        "hide": 0,
        "includeAll": false,
        "label": "Domain",
        "multi": false,
        "name": "domain",
        "options": [],
        "query": {
          "query": "label_values(qrator_http_http_id, domain_name)",
          "refId": "StandardVariableQuery"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 1,
        "type": "query"
      }
    ]
  },
  "time": {
    "from": "now-12h",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "",
  "title": "Qrator",
  "uid": "gM2arMHnk",
  "version": 23
}

TODO: Move the dashboard to https://grafana.com/grafana/dashboards/.