Hello selfhosters.
We all have bare-metal servers, VPSes, containers and other things running. Some of them may be exposed openly to the internet, which is populated by autonomous malicious actors, and some may reside on a closed-off network since they contain sensitive data.
And there are a lot of solutions for monitoring your servers, since none of us want our resources to be part of a botnet, or mine bitcoins for APTs, or simply have confidential data fall into the wrong hands.
Some of the tools I’ve looked at for this task are check_mk, netmonitor and monit: all of these monitor metrics such as CPU, RAM and network activity. Other tools such as Snort or Falco are designed specifically to detect suspicious activity. And there are also solutions that are cobbled together, like fail2ban actions combined with pushover to get notified of intrusion attempts.
So my question to you is: how do you monitor your servers, and with what tools? I need some inspiration to know what tooling to settle on to be able to detect unwanted external activity on my resources.
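For anyone curious what the cobbled-together fail2ban plus pushover route might look like, here is a minimal sketch in Python. It assumes you have a Pushover application token and user key, and that a fail2ban action calls the script with the banned IP and jail name; the function names and the message wording are made up for illustration, not taken from fail2ban or any existing integration.

```python
import urllib.parse
import urllib.request

# Pushover's message endpoint; token and user key come from your own account.
PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_pushover_request(token, user, message, title="fail2ban"):
    """Build (but do not send) a POST request for the Pushover message endpoint."""
    payload = urllib.parse.urlencode(
        {"token": token, "user": user, "title": title, "message": message}
    ).encode("utf-8")
    return urllib.request.Request(PUSHOVER_URL, data=payload, method="POST")


def notify_ban(token, user, ip, jail):
    """Send a ban notification; returns True if Pushover answered with HTTP 200.

    Hypothetical helper: how fail2ban passes ip and jail in is up to your action config.
    """
    req = build_pushover_request(token, user, f"Banned {ip} in jail {jail}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status == 200
```

Splitting the request building from the sending keeps the network call in one place, so the rest can be checked without actually hitting Pushover.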
I’m a network guy, so everything in my labs uses SNMP because it works with everything. Things that don’t support SNMP are usually replaced and yeeted off the nearest bridge.
For that I use librenms. Simple, open source, and I find it easy to use, for the most part. I put it on a different system than what I’m monitoring because if it shares fate with everything else, it’s not going to be very useful or give me any alerts if there’s a full outage of my main homelab cluster.
Of course, access to it from the internet is forbidden, and any SNMP is filtered by my firewall. Nothing really gets through to it, so I’m unconcerned about it becoming a target. For the rest of my systems, security mostly relies on a small set of reverse proxies and firewall rules.
I use a couple of VPN systems to access the servers remotely, all running on odd ports (if they need port forwards at all). I have multiple to provide redundancy to my remote access, so if one VPN isn’t working due to a crash or something, I have others that should get me some measure of access.
I’m pretty old school, but as I only have 1 server, I just use ssh, df, du and top.
Not even htop? That is old school.
Not even btop? That’s middle school.
Not even bottom? That’s elementary school.
Emoji reactions are missing 😂😂😂
Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:
Fewer Letters: More Letters
DNS: Domain Name Service/System
SSL: Secure Sockets Layer, for transparent encryption
VPN: Virtual Private Network
VPS: Virtual Private Server (opposed to shared hosting)
3 acronyms in this thread; the most compressed thread commented on today has 7 acronyms.
[Thread #421 for this sub, first seen 10th Jan 2024, 14:55]
Prometheus.
It’s open source, it’s easy to set up, and its agents are available for nearly anything, including OpenWrt. It can serve the simplest use case of “is it down” as well as much more complicated ones that stem from its ability to collect data over time.
Personally I’m monitoring:
- Is it up?
- Is the storage array healthy?
- Are the services I care about running?
I used to run it ephemerally, wiping data on restart. Recently I started persisting its data so I can see trends over the longer run.
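As a rough illustration of how checks like the three above could be fed to Prometheus, here is a minimal sketch of a hand-rolled exporter using only the Python standard library. The metric names (homelab_host_up and so on) and the hard-coded values are hypothetical placeholders; in practice you would more likely use node exporter or an official client library than write this yourself.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_gauges(samples):
    """Render {name: (help_text, value)} in the Prometheus text exposition format."""
    lines = []
    for name, (help_text, value) in samples.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect():
    """Placeholder values; a real exporter would run the actual checks here."""
    return {
        "homelab_host_up": ("1 if the host is reachable", 1),
        "homelab_array_healthy": ("1 if the storage array is healthy", 1),
        "homelab_service_running": ("1 if the service I care about is running", 1),
    }


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_gauges(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Serve /metrics until interrupted; the port number is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling serve() and pointing a Prometheus scrape job at that port would be enough to start collecting these gauges over time.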
I’m running checkmk for monitoring but that won’t help you with detection of unwanted logins. For security I’m running crowded.
What’s crowded? I am having trouble searching for it because of its name
Prometheus for metrics
Loki for logs
Grafana for dashboards.
I use node exporter for host metrics (Proxmox/VMs/SFFs/RaspPis/Router) and a number of other *exporters:
- exportarr
- plex-exporter
- unifi-exporter
- bitcoin node exporter
I use the OpenTelemetry collector to collect some of the above metrics, rather than Prometheus itself, as well as docker logs and other log files before shipping them to Prometheus/Loki.
Oh, I also scrape metrics from my Traefik containers using OTEL.
Have you tried the proxmox exporter? I have tried it briefly for a grafana lab and it seemed pretty good.
I haven’t, but it looks like I’ve got another exporter to install and dashboard to create 😁
UptimeKuma is great. I use it for the simple “are my services up?” check, and it is what I pay most attention to.
I still use zabbix for finer-grained monitors though, like checking raid status, smartctl, disk space, temperatures, etc.
I’ve been trying out librenms with more custom snmp checks too, and am considering going that route instead of zabbix in the future.
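The simple “are my services up?” idea boils down to something like the sketch below: try to open a TCP connection and report whether it worked. This is not how UptimeKuma is implemented, just a bare-bones illustration; the function names and the service list in the usage note are hypothetical.

```python
import socket


def is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_all(services, timeout=3.0):
    """Map each service name to its up/down state; services is {name: (host, port)}."""
    return {name: is_up(host, port, timeout) for name, (host, port) in services.items()}
```

For example, check_all({"web": ("192.0.2.10", 443), "ssh": ("192.0.2.10", 22)}) would return a dict of booleans you could alert on. A successful TCP connect only shows the port is open, not that the service behind it is healthy, which is where the finer-grained checks come in.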
I don’t do much in the way of monitoring. I guess I should do that.