9 Commits

Author SHA1 Message Date
Gregor Michels c9843a4cdd inventory: use /tmp as the temporary dir on openwrt devices
* increases speed (in theory)
* conserves write cycles on the flash
2022-09-14 03:27:20 +02:00
Gregor Michels f0115625f6 monitoring: add end-to-end tests to monitor internet reachability
via icmp (blackbox exporter)

There are two exporters.
One lives inside `monitoring01` and uses the "normal" route into the
internet without a vpn (job: `e2e_default_v4`).

The other one lives inside `mon-e2e-clients01` and routes into the
internet via the vpn (job: `e2e_clients_v4`).
2022-09-14 03:12:22 +02:00
Gregor Michels 60e57af853 hypervisor: create new container "mon-e2e-clients01"
* lives inside the public network
* configured static lease on `gw-core01` for `mon-e2e-clients01`
* because of the policy-based routing, `mon-e2e-clients01` is not able to
  route into any network other than the internet/wan. Jump via `gw-core01`
  if you want to reach this container
2022-09-14 03:11:05 +02:00
Gregor Michels bbfc548e23 rename playbook_provision_hyper01 -> playbook_provision_hypervisor 2022-09-14 03:01:41 +02:00
Gregor Michels 10d8e0133e monitoring: roll out node exporters on new inventory group "container"
Fixes: e350445a4b
2022-09-14 02:59:48 +02:00
Gregor Michels e539d6c36f pass: move container credentials into own folder 2022-09-14 02:58:42 +02:00
Gregor Michels e350445a4b playbook_provision_hyper01: generify playbook
now we read the containers to create dynamically from the inventory
2022-09-14 02:56:05 +02:00
Gregor Michels 24a31603ef monitoring: move node exporter installation into single task 2022-09-14 02:26:27 +02:00
Gregor Michels 6623cc0e09 monitoring: alert on node reboots 2022-09-14 02:16:15 +02:00
12 changed files with 208 additions and 58 deletions

@ -7,6 +7,9 @@ ap-2bbf ip=10.84.1.30 channel_2g=11 channel_5g=149 # Tent 4
ap-1a38 ip=10.84.1.35 channel_2g=6 channel_5g=153 # Tent 5
ap-8f39 ip=10.84.1.37 channel_2g=1 channel_5g=157 # Tent 5
[accesspoints:vars]
ansible_remote_tmp=/tmp
[switches]
sw-access01 ip=10.84.1.11
sw-access02 ip=10.84.1.12
@ -14,9 +17,18 @@ sw-access02 ip=10.84.1.12
[gateways]
gw-core01 ip=10.84.1.1
[gateways:vars]
ansible_remote_tmp=/tmp
[server]
hyper01 ip=10.84.1.21
[vms]
eae-adp-jump01 ip=162.55.53.85 monitoring_ip=10.84.254.0 ansible_python_interpreter=/usr/local/bin/python3
monitoring01 ip=10.84.1.51
[container]
monitoring01 ip=10.84.1.51 cpus=2 disk=50 memory=1024 net='{"net0":"name=eth0,ip=10.84.1.51/24,gw=10.84.1.1,bridge=vmbr0"}'
mon-e2e-clients01 ip=10.84.7.30 cpus=1 disk=10 memory=256 net='{"net0":"name=eth0,ip=dhcp,bridge=vmbr1"}'
[container:vars]
ostemplate=local:vztmpl/debian-11-standard_11.3-1_amd64.tar.zst
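The `net` host variable in the `[container]` group holds a JSON object whose values are comma-separated `key=value` strings. As a rough illustration of that shape only (this is not code from the repository, and the helper name `parse_net` is hypothetical), the value can be unpacked like this:

```python
import json


def parse_net(net_var: str) -> dict:
    """Turn the inventory `net` string into {interface: {key: value}}."""
    interfaces = json.loads(net_var)
    return {
        iface: dict(pair.split("=", 1) for pair in spec.split(","))
        for iface, spec in interfaces.items()
    }


# Value copied from the monitoring01 inventory line.
monitoring01_net = '{"net0":"name=eth0,ip=10.84.1.51/24,gw=10.84.1.1,bridge=vmbr0"}'
parsed = parse_net(monitoring01_net)
print(parsed["net0"]["bridge"])
```

Keeping the whole interface definition in one string means the playbook can hand it to the `netif` parameter unchanged, at the cost of the inventory line being harder to read.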

@ -9,21 +9,22 @@ Diagram:
IPAM / Device Overview:
-----------------------
| Name | Location | MGMT IPv4 | MAC | Device | Notes |
| --------------- | --------- | ------------- | ------------------- | -------------------- | ------------------------------------------------- |
| `gigacube-E950` | Büro | `192.168.0.1` | `c8:ea:f8:b6:e9:50` | ZTE MF289F/Gigacube | property of Saxonia Catering/rental from Vodafone |
| `gw-core01` | Büro | `10.84.1.1` | `78:8a:20:bd:b6:ae` | Ubiquiti EdgeRouterX | |
| `sw-access01` | Büro | `10.84.1.11` | `bc:cf:4f:e3:bb:8d` | Zyxel GS1800-8HP | |
| `sw-access02` | Zelt 5 | `10.84.1.12` | `bc:cf:4f:e3:ac:39` | Zyxel GS1800-8HP | |
| `hyper01` | Büro | `10.84.1.21` | `00:23:24:54:f0:fe` | Lenovo ThinkCentre ? | |
| `monitoring01` | `hyper01` | `10.84.1.51` | `16:b9:13:c3:10:5e` | Proxmox VM | |
| `ap-2bbf` | Zelt 4 | `10.84.1.30` | `24:de:c6:cc:2b:bf` | Aruba AP-105 | |
| `ap-1a38` | Zelt 5 | `10.84.1.35` | `24:de:c6:c3:ac:7c` | Aruba AP-105 | |
| `ap-0b99` | Zelt 2 | `10.84.1.32` | `6c:f3:7f:c9:0b:99` | Aruba AP-105 | |
| `ap-c5d1` | Büro | `10.84.1.33` | `ac:a3:1e:cf:c5:d1` | Aruba AP-105 | |
| `ap-c495` | Zelt 3 | `10.84.1.34` | `ac:a3:1e:cf:c4:95` | Aruba AP-105 | |
| `ap-8f42` | Zelt 1 | `10.84.1.36` | `d8:c7:c8:c2:8f:42` | Aruba AP-105 | |
| `ap-8f39` | Zelt 5 | `10.84.1.37` | `??:??:??:??:??:??` | Aruba AP-105 | |
| Name | Location | MGMT IPv4 | MAC | Device | Notes |
| ------------------- | --------- | ------------- | ------------------- | -------------------- | ------------------------------------------------- |
| `gigacube-E950` | Büro | `192.168.0.1` | `c8:ea:f8:b6:e9:50` | ZTE MF289F/Gigacube | property of Saxonia Catering/rental from Vodafone |
| `gw-core01` | Büro | `10.84.1.1` | `78:8a:20:bd:b6:ae` | Ubiquiti EdgeRouterX | |
| `sw-access01` | Büro | `10.84.1.11` | `bc:cf:4f:e3:bb:8d` | Zyxel GS1800-8HP | |
| `sw-access02` | Zelt 5 | `10.84.1.12` | `bc:cf:4f:e3:ac:39` | Zyxel GS1800-8HP | |
| `hyper01` | Büro | `10.84.1.21` | `00:23:24:54:f0:fe` | Lenovo ThinkCentre ? | |
| `monitoring01` | `hyper01` | `10.84.1.51` | `16:b9:13:c3:10:5e` | Proxmox Container | |
| `mon-e2e-clients01` | `hyper01` | `10.84.7.30` | `ca:ac:5a:d0:b6:02` | Proxmox Container | used for end to end monitoring of the public net |
| `ap-2bbf` | Zelt 4 | `10.84.1.30` | `24:de:c6:cc:2b:bf` | Aruba AP-105 | |
| `ap-1a38` | Zelt 5 | `10.84.1.35` | `24:de:c6:c3:ac:7c` | Aruba AP-105 | |
| `ap-0b99` | Zelt 2 | `10.84.1.32` | `6c:f3:7f:c9:0b:99` | Aruba AP-105 | |
| `ap-c5d1` | Büro | `10.84.1.33` | `ac:a3:1e:cf:c5:d1` | Aruba AP-105 | |
| `ap-c495` | Zelt 3 | `10.84.1.34` | `ac:a3:1e:cf:c4:95` | Aruba AP-105 | |
| `ap-8f42` | Zelt 1 | `10.84.1.36` | `d8:c7:c8:c2:8f:42` | Aruba AP-105 | |
| `ap-8f39` | Zelt 5 | `10.84.1.37` | `??:??:??:??:??:??` | Aruba AP-105 | |
Upstream Connectivity:

@ -10,3 +10,12 @@ groups:
annotations:
summary: Prometheus target missing (instance {{ $labels.instance }})
description: "A Prometheus target has disappeared. An exporter might be crashed.\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
- alert: NodeRebooted
expr: changes(node_boot_time_seconds[2h]) > 0
for: 0m
labels:
severity: critical
annotations:
summary: A node rebooted in the last 2 hours (instance {{ $labels.instance }})
description: "The boot time of a node changed in the last two hours.\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
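The `NodeRebooted` expression relies on PromQL's `changes()`, which counts how often a series' value differs between consecutive samples inside the range. A minimal Python sketch of that idea, assuming made-up boot-time samples (the function names and timestamps are illustrative, not taken from the repository):

```python
def changes(samples: list[float]) -> int:
    """Count how often consecutive samples differ, like PromQL's changes()."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur != prev)


def node_rebooted(boot_times: list[float]) -> bool:
    """Mirror the rule expression: changes(node_boot_time_seconds[2h]) > 0."""
    return changes(boot_times) > 0


# Hypothetical boot-time samples (Unix timestamps) scraped within a 2h window.
steady = [1663110000.0, 1663110000.0, 1663110000.0]
rebooted = [1663110000.0, 1663110000.0, 1663116300.0]

print(node_rebooted(steady), node_rebooted(rebooted))
```

Because the rule uses `for: 0m`, the alert fires as soon as a changed sample is seen, and it stays active until the change drops out of the two-hour window.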

files/blackbox.yml (new file, 36 lines)

@ -0,0 +1,36 @@
modules:
http_2xx:
prober: http
http_post_2xx:
prober: http
http:
method: POST
tcp_connect:
prober: tcp
pop3s_banner:
prober: tcp
tcp:
query_response:
- expect: "^+OK"
tls: true
tls_config:
insecure_skip_verify: false
ssh_banner:
prober: tcp
tcp:
query_response:
- expect: "^SSH-2.0-"
irc_banner:
prober: tcp
tcp:
query_response:
- send: "NICK prober"
- send: "USER prober prober prober :prober"
- expect: "PING :([^ ]+)"
send: "PONG ${1}"
- expect: "^:[^ ]+ 001"
icmp_v4:
prober: icmp
icmp:
preferred_ip_protocol: ip4
ip_protocol_fallback: false
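The `expect` entries in the TCP modules are regular expressions matched against what the remote side sends. The blackbox exporter evaluates them with Go's regexp engine; the sketch below uses Python's `re` purely for illustration, on the assumption that this simple pattern behaves the same in both. The sample banners are invented.

```python
import re

# Pattern copied from the ssh_banner module.
SSH_BANNER = re.compile(r"^SSH-2.0-")

print(bool(SSH_BANNER.search("SSH-2.0-OpenSSH_8.4p1 Debian-5")))
print(bool(SSH_BANNER.search("220 mail.example.org ESMTP")))
# The dot is unescaped, so it matches any character, not only a literal ".".
print(bool(SSH_BANNER.search("SSH-2x0-odd")))
```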

Binary file not shown.

@ -27,16 +27,6 @@
mode: 0600
create: yes
- name: install node_exporter
package:
name: node_exporter
- name: enable node_exporter
service:
name: node_exporter
state: started
enabled: yes
handlers:
- name: reload firewall
command: pfctl -vf /etc/pf.conf

@ -1,32 +0,0 @@
---
- name: provision hyper01
hosts: hyper01
tasks:
- name: install node-exporter
package:
name: prometheus-node-exporter
- name: create vms/container
hosts: 127.0.0.1
connection: local
gather_facts: no
tasks:
- name: create monitoring01
proxmox:
api_user: root@pam
api_password: "{{ lookup('passwordstore', 'server/hyper01') }}"
api_host: "{{ hostvars['hyper01']['ip'] }}"
node: hyper01
hostname: monitoring01
onboot: yes
cpus: 2
disk: 50
memory: 1024
storage: 'local-zfs'
ostemplate: 'local:vztmpl/debian-11-standard_11.3-1_amd64.tar.zst'
password: "{{ lookup('passwordstore', 'vms/monitoring01/root') }}"
pubkey: "{{ lookup('ansible.builtin.file', 'files/authorized_keys') }}"
netif: '{"net0":"name=eth0,ip=10.84.1.51/24,gw=10.84.1.1,bridge=vmbr0"}'
unprivileged: yes
features:
- nesting=1

@ -0,0 +1,38 @@
---
- name: provision containers
hosts: 127.0.0.1
connection: local
gather_facts: no
vars:
proxmox_host: "hyper01"
tasks:
- name: create containers
proxmox:
api_user: root@pam
api_password: "{{ lookup('passwordstore', 'server/' ~ proxmox_host) }}"
api_host: "{{ hostvars[proxmox_host]['ip'] }}"
node: "{{ proxmox_host }}"
hostname: "{{ item }}"
onboot: yes
cpus: "{{ hostvars[item]['cpus'] }}"
disk: "{{ hostvars[item]['disk'] }}"
memory: "{{ hostvars[item]['memory'] }}"
storage: 'local-zfs'
ostemplate: "{{ hostvars[item]['ostemplate'] }}"
password: "{{ lookup('passwordstore', 'container/' ~ item ~ '/root') }}"
pubkey: "{{ lookup('ansible.builtin.file', 'files/authorized_keys') }}"
netif: "{{ hostvars[item]['net'] }}"
unprivileged: yes
features:
- nesting=1
with_items: "{{ groups['container'] }}"
- name: start containers
proxmox:
api_user: root@pam
api_password: "{{ lookup('passwordstore', 'server/' ~ proxmox_host) }}"
api_host: "{{ hostvars[proxmox_host]['ip'] }}"
node: "{{ proxmox_host }}"
hostname: "{{ item }}"
state: started
with_items: "{{ groups['container'] }}"
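The generified playbook no longer hard-codes one container; it loops over `groups['container']` and pulls each container's settings from `hostvars`. The following Python sketch imitates that expansion with plain dictionaries. The dictionaries stand in for Ansible's inventory, and `container_args` and `password_entry` are hypothetical names used only here.

```python
# Hypothetical stand-in for Ansible's inventory data; values copied from the [container] group.
OSTEMPLATE = "local:vztmpl/debian-11-standard_11.3-1_amd64.tar.zst"
hostvars = {
    "monitoring01": {"cpus": 2, "disk": 50, "memory": 1024, "ostemplate": OSTEMPLATE},
    "mon-e2e-clients01": {"cpus": 1, "disk": 10, "memory": 256, "ostemplate": OSTEMPLATE},
}
groups = {"container": list(hostvars)}


def container_args(name: str) -> dict:
    """Build the per-container arguments that the `create containers` task templates."""
    hv = hostvars[name]
    return {
        "hostname": name,
        "cpus": hv["cpus"],
        "disk": hv["disk"],
        "memory": hv["memory"],
        "ostemplate": hv["ostemplate"],
        # Entry in the password store, following the 'container/<name>/root' layout.
        "password_entry": f"container/{name}/root",
    }


tasks = [container_args(name) for name in groups["container"]]
for t in tasks:
    print(t["hostname"], t["cpus"], t["memory"], t["password_entry"])
```

Adding another container then only requires a new inventory line and a matching password store entry, with no change to the playbook.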

@ -1,4 +1,57 @@
---
- name: provision node exporters
hosts:
- server
- vms
- container
vars:
package_names:
OpenBSD: node_exporter
Debian: prometheus-node-exporter
tasks:
- name: install node exporter
package:
name: "{{ package_names[ansible_distribution] }}"
- name: start and enable node_exporter
service:
name: "{{ package_names[ansible_distribution] }}"
state: started
enabled: yes
- name: provision blackbox exporters
hosts:
- mon-e2e-clients01
- monitoring01
tasks:
- name: install blackbox exporter
package:
name: prometheus-blackbox-exporter
- name: add net raw capability to blackbox exporter
capabilities:
path: /usr/bin/prometheus-blackbox-exporter
capability: cap_net_raw+ep
notify:
- restart blackbox-exporter
- name: configure blackbox-exporter
copy:
src: files/blackbox.yml
dest: /etc/prometheus/blackbox.yml
owner: root
group: root
mode: 0644
validate: "prometheus-blackbox-exporter --config.file='%s' --config.check"
notify:
- restart blackbox-exporter
handlers:
- name: restart blackbox-exporter
service:
name: prometheus-blackbox-exporter
state: restarted
- name: provision monitoring
hosts:
- monitoring01
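The node exporter play picks the package and service name by indexing `package_names` with `ansible_distribution`. A small Python sketch of that lookup, assuming the mapping from the play; the function name and the explicit error for unknown distributions are illustrative additions, not behaviour of the playbook itself:

```python
# Mapping copied from the play's `package_names` variable.
package_names = {
    "OpenBSD": "node_exporter",
    "Debian": "prometheus-node-exporter",
}


def node_exporter_package(distribution: str) -> str:
    """Resolve the package/service name for a host's ansible_distribution."""
    try:
        return package_names[distribution]
    except KeyError:
        raise ValueError(f"no node exporter package known for {distribution!r}") from None


print(node_exporter_package("Debian"))
print(node_exporter_package("OpenBSD"))
```

A host running a distribution that is missing from the mapping has no entry to resolve, so the mapping needs a new key before such a host joins one of the targeted groups.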

@ -28,3 +28,46 @@ scrape_configs:
{% endfor %}
{% endfor %}
- job_name: 'blackbox'
static_configs:
- targets:
- {{ hostvars['mon-e2e-clients01']['ip'] }}:9115
- {{ hostvars['monitoring01']['ip'] }}:9115
- job_name: 'e2e_clients_v4'
metrics_path: /probe
params:
module: [icmp_v4]
static_configs:
- targets:
- freifunk-leipzig.de
- harald.brainpeach.de
- 195.201.165.118 # freifunk-leipzig.de without dns query
- 88.198.195.242 # harald.brainpeach.de without dns query
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- source_labels: [__param_target]
target_label: instance
- target_label: __address__
replacement: {{ hostvars['mon-e2e-clients01']['ip'] }}:9115
- job_name: 'e2e_default_v4'
metrics_path: /probe
params:
module: [icmp_v4]
static_configs:
- targets:
- 192.168.0.1 # gigacube
- freifunk-leipzig.de
- harald.brainpeach.de
- 195.201.165.118 # freifunk-leipzig.de without dns query
- 88.198.195.242 # harald.brainpeach.de without dns query
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- source_labels: [__param_target]
target_label: instance
- target_label: __address__
replacement: {{ hostvars['monitoring01']['ip'] }}:9115
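The three `relabel_configs` steps in both e2e jobs turn each listed target into a query parameter and point the actual scrape at the blackbox exporter. A Python sketch of that label shuffling for a single target; the function name `probe_url` is hypothetical, and the exporter address is the `mon-e2e-clients01` IP from the inventory:

```python
from urllib.parse import urlencode


def probe_url(target: str, exporter: str, module: str = "icmp_v4") -> tuple[str, str]:
    """Return (instance label, scrape URL) after the three relabel steps."""
    labels = {"__address__": target, "__param_module": module}
    labels["__param_target"] = labels["__address__"]  # step 1: target becomes ?target=
    labels["instance"] = labels["__param_target"]     # step 2: keep target as instance label
    labels["__address__"] = exporter                  # step 3: scrape the exporter instead
    query = urlencode({"module": labels["__param_module"], "target": labels["__param_target"]})
    return labels["instance"], f"http://{labels['__address__']}/probe?{query}"


instance, url = probe_url("freifunk-leipzig.de", "10.84.7.30:9115")
print(instance)
print(url)
```

Only the final `replacement` differs between `e2e_clients_v4` and `e2e_default_v4`, which is what sends the same probes through two different network paths.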