Right, I just spent 10 minutes looking for documentation that doesn't involve shitty expensive SaaS/PaaS and couldn't find anything. That disqualifies it for me as well, sorry for wasting your time.
I'll keep watching this thread, as it's relevant to my interests as well. At work we let Ansible (in pull mode) handle the Linux fleet; for Android we don't have enough devices to bother, and we are looking towards Jamf for Macs. But I'd love to find a FOSS solution too, since our requirements are simple enough (as you said: install/remove stuff, change basic settings).
My prod and testing environments are 2 libvirt VMs on the same hypervisor. They run the same services, deployed and managed by Ansible. The testing VM just gets fewer disk/CPU/RAM resources, and is powered off most of the time. Simple config changes? Straight to prod. New feature, risky change? Testing first.
Data loss is not a problem specific to self-hosting.
Whenever you administer a system that contains valuable data (a self-hosted network service/application, your personal computer, phone...), think about a backup and recovery strategy for common (and less common) data loss cases:
1. you delete a valuable file by accident
2. a bad actor deletes or encrypts the data (ransomware)
3. the device gets stolen, or destroyed (hardware failure, power surge, fire, flood, hosting provider closing your account)
4. anything you can think of
For each of these scenarios, try to find a working backup/restore strategy. Mine goes like this:
1. Automatic, daily local backups (anything on my server gets backed up once a day to a backups directory using rsnapshot). Note that file sync like Nextcloud won't protect you against this risk: if you delete a file on the Nextcloud client, it's also gone on the Nextcloud server (though there is a recycle bin). Local backups are quick and easy to restore after a simple mistake like this. They won't protect you against 2 and 3.
2. Assuming an attacker gains access to your machine, they will also destroy or encrypt your local backups. My strategy against this is to pull a copy of the latest local backup, weekly, to a USB drive, through another computer, using rsync/rsnapshot. Then I unplug the USB drive, store it somewhere safe outside my home, and plug in a second USB drive. I rotate the drives every week (or every 2 weeks when I'm lazy - I have set up a notification to nag me to rotate the drive every Saturday, but I sometimes ignore it).
3. The USB strategy also protects me against 3. If both my server and main computer burn down, the second drive is still out there, safely encrypted. It's the worst-case scenario: I'd probably spend quite some time setting everything up again (though most of the setup is automated), and at this point I'd have bigger problems like, you know, a burned-down house. But I'd still have my data.
There are other strategies and tools, but this one works for me. It's cheap (the USB drives are a one-time investment), and the only manual step is to rotate the drives every week or so.
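To make the weekly pull and the rotation nag a bit more concrete, here is a minimal Python sketch. It is not my actual setup: the host name, the paths and the `daily.0` snapshot directory are hypothetical placeholders, and the 7-day threshold is just the weekly rhythm described above. The only rsync options used are `-a` (archive mode) and `--delete` (mirror deletions).

```python
from datetime import date, timedelta


def build_pull_command(source: str, destination: str) -> list[str]:
    """Return an rsync command that mirrors `source` into `destination`.

    -a preserves permissions, timestamps and symlinks; --delete removes
    files from the destination that no longer exist in the source.
    """
    return ["rsync", "-a", "--delete", source, destination]


def rotation_overdue(last_rotation: date, today: date, max_days: int = 7) -> bool:
    """True when the plugged-in drive has not been swapped for more than `max_days`."""
    return today - last_rotation > timedelta(days=max_days)


if __name__ == "__main__":
    # Hypothetical host and paths, adjust to your own setup.
    cmd = build_pull_command(
        "backup@server.example:/var/backups/rsnapshot/daily.0/",
        "/mnt/usb-backup/latest/",
    )
    # Only printed here; pass `cmd` to subprocess.run(cmd, check=True) to run it.
    print(" ".join(cmd))
    if rotation_overdue(date(2024, 1, 6), date.today()):
        print("Time to swap the USB drives")
```

Running the pull from a second computer, as described above, means the server never holds credentials for the USB drive, so a compromised server cannot reach the offline copy.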
If you're interested, I wrote a quick HOWTO to migrate TT-RSS data from MySQL to Postgres a while ago. Ctrl+F search for "Migrating tt-rss data to Postgresql from a MySQL-based installation" here.
I still use that same migrated database 4 years later.
third-party software: subscribe to the releases RSS feed (in tt-rss or rss2email), read release notes, bump version number in my Ansible playbook, run playbook, done.
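If you would rather have a script tell you when a bump is due, something like the following could work. It's a rough sketch, assuming the project publishes an Atom releases feed (GitHub, for instance, serves one at `/releases.atom` for each repository) and that entry titles are the version tags. The sample feed and version strings are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample feed standing in for a real releases feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Release notes from example</title>
  <entry><title>v1.4.0</title><updated>2024-03-02T10:00:00Z</updated></entry>
  <entry><title>v1.3.2</title><updated>2024-01-15T09:00:00Z</updated></entry>
</feed>
"""


def latest_release(feed_xml: str) -> Optional[str]:
    """Return the title of the most recently updated entry, or None if there is none."""
    root = ET.fromstring(feed_xml)
    entries = root.findall("atom:entry", ATOM_NS)
    if not entries:
        return None
    # Timestamps in the same UTC ISO 8601 format sort correctly as plain strings.
    newest = max(entries, key=lambda e: e.findtext("atom:updated", "", ATOM_NS))
    return newest.findtext("atom:title", None, ATOM_NS)


def needs_bump(pinned_version: str, feed_xml: str) -> bool:
    """True when the feed advertises a release different from the pinned one."""
    latest = latest_release(feed_xml)
    return latest is not None and latest != pinned_version


if __name__ == "__main__":
    print(latest_release(SAMPLE_FEED))
    print(needs_bump("v1.3.2", SAMPLE_FEED))
```

This only flags that something changed; reading the release notes before bumping the version in the playbook is still the important part.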
step 2: stop your containers or just wait for them to crash/stop unnoticed for some reason
step 3: run docker system prune --all --volumes, as one should do periodically to clean up the garbage Docker leaves on your system. Lose all your data (the --volumes flag also deletes volumes that are not in use by a running container; on Docker releases before 23.0 this includes named volumes)
step 4: never use named or anonymous volumes again, use bind mounts
The fact that you absolutely need to run docker system prune --all regularly to get rid of GBs of unused layers, test containers, etc., combined with the fact that adding --volumes deletes even explicitly named volumes on Docker releases before 23.0, makes them too unsafe for my taste. Just use bind mounts.
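To audit an existing setup, a small helper can tell which `volumes:` entries in a Compose file are bind mounts and which are Docker-managed volumes. This is a minimal sketch, assuming the Compose short syntax `[SOURCE:]TARGET[:MODE]` with Unix-style paths; the function names and example entries are hypothetical, and it ignores the long (mapping) syntax and Windows paths.

```python
def is_bind_mount(volume_spec: str) -> bool:
    """Classify a Compose short-syntax volume entry.

    A source that looks like a host path (starts with "/", "." or "~")
    is a bind mount. A bare name is a named volume, and an entry with
    only a container path is an anonymous volume.
    """
    parts = volume_spec.split(":")
    if len(parts) < 2:
        return False  # container path only: anonymous volume
    source = parts[0]
    return source.startswith(("/", ".", "~"))


def non_bind_volumes(volume_specs: list[str]) -> list[str]:
    """Return the entries that rely on Docker-managed (named/anonymous) volumes."""
    return [spec for spec in volume_specs if not is_bind_mount(spec)]


if __name__ == "__main__":
    # Hypothetical entries as they might appear under a service's `volumes:` key.
    specs = [
        "./data:/var/lib/app",
        "/srv/app/config:/etc/app:ro",
        "appdata:/var/lib/app/cache",
        "/var/lib/app/tmp",
    ]
    for spec in non_bind_volumes(specs):
        print(f"Docker-managed volume, consider a bind mount: {spec}")
```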
One has a total powered-on time of 51534 hours, and the other 49499 hours. As for their actual age (manufacturing date), the only way to know is to look at the sticker on the drive or find the invoice, so I can't tell you right now.
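For a sense of scale, those hour counts convert to years of continuous operation as follows. This is plain arithmetic using an average year of 365.25 days; it says nothing about the manufacturing date.

```python
HOURS_PER_YEAR = 24 * 365.25  # average year length, leap years included


def power_on_years(hours: int) -> float:
    """Convert a powered-on hour count to years of continuous operation."""
    return hours / HOURS_PER_YEAR


if __name__ == "__main__":
    for hours in (51534, 49499):
        print(f"{hours} h = {power_on_years(hours):.1f} years powered on")
    # prints:
    # 51534 h = 5.9 years powered on
    # 49499 h = 5.6 years powered on
```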
simple: rsyslog: all local logs to a central syslog file (using the imfile module), all syslogs from all servers to a central rsyslog server (over TCP/SSL, example here). Use lnav or something similar to consume the logs.
more complex, resource-heavy: Graylog Open as a replacement for the central rsyslog server, set up pipelines/alerts/whatever... Currently considering replacing my Graylog instance with Wazuh, but I don't know yet if it will be able to replace it completely for me.
With containers, software maintainers also need to keep their image up to date with the latest security fixes (most of them don't), whereas in a VM these are usually handled by unattended-upgrades or similar. They then have to put out a new release and expect users to upgrade ASAP, or rebuild and encourage redeploying the latest image every day or so, which is bad for other reasons (no warning for breaking changes, and the software must be tested thoroughly after every commit to master).
In short this adds the burden of proper OS/image maintenance for developers, something usually handled by distro maintainers.
trivy is helpful in assessing the maintenance/vulnerability level of OCI images.
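As a rough illustration of what you can do with its output, the sketch below tallies findings by severity from a JSON report. It assumes the report was produced with something like `trivy image --format json <image>` and follows the `Results[].Vulnerabilities[].Severity` layout; check this against the trivy version you run. The sample report, its target name and its IDs are invented placeholders.

```python
import json
from collections import Counter

# Invented sample data mimicking the assumed report layout.
SAMPLE_REPORT = json.loads("""
{
  "Results": [
    {"Target": "example-image (debian)", "Vulnerabilities": [
      {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH"},
      {"VulnerabilityID": "CVE-0000-0002", "Severity": "CRITICAL"}
    ]},
    {"Target": "app/requirements.txt", "Vulnerabilities": [
      {"VulnerabilityID": "CVE-0000-0003", "Severity": "HIGH"}
    ]},
    {"Target": "clean-layer"}
  ]
}
""")


def count_by_severity(report: dict) -> Counter:
    """Tally vulnerabilities per severity across all scanned targets."""
    counts: Counter = Counter()
    # Targets without findings may omit the key or set it to null.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


if __name__ == "__main__":
    for severity, count in count_by_severity(SAMPLE_REPORT).most_common():
        print(f"{severity}: {count}")
```

Comparing these counts between the image a project publishes and a freshly rebuilt one gives a quick idea of how far behind the published image is.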
There is a pinned post for this: https://lemmy.world/post/60585