Homelab Configuration Documentation
Overview
This homelab configuration system provides a unified way to manage services across multiple nodes with automatic aggregation of monitoring, logging, backup, and reverse proxy configurations. The system is built on NixOS and follows a modular architecture with both local and global configuration scopes.
Core Homelab Options
Basic Configuration (homelab.*)
homelab = {
enable = true; # Enable homelab fleet configuration
hostname = "node-01"; # Hostname for this system
domain = "lab"; # Base domain for the homelab (default: "lab")
externalDomain = "procopius.dk"; # External (public) domain for the homelab
environment = "production"; # Environment type: "production" | "staging" | "development"
location = "homelab"; # Physical location identifier
tags = ["web" "database"]; # Tags for this system
};
Monitoring System (homelab.monitoring.*)
homelab.monitoring = {
enable = true; # Enable monitoring system
# Node exporter (automatically enabled)
nodeExporter = {
enable = true; # Enable node exporter (default: true)
port = 9100; # Node exporter port (default: 9100)
};
# Manual metrics endpoints
metrics = [
{
name = "custom-app"; # Metric endpoint name
host = "localhost"; # Host (default: homelab.hostname)
port = 8080; # Port for metrics endpoint
path = "/metrics"; # Metrics path (default: "/metrics")
jobName = "custom"; # Prometheus job name
scrapeInterval = "30s"; # Scrape interval (default: "30s")
labels = { # Additional labels
component = "web";
};
}
];
# Manual health checks
healthChecks = [
{
name = "web-service"; # Health check name
host = "localhost"; # Host (default: homelab.hostname)
port = 80; # Port (nullable)
path = "/health"; # Health check path (default: "/")
protocol = "http"; # Protocol: "http" | "https" | "tcp" | "icmp"
method = "GET"; # HTTP method (default: "GET")
interval = "30s"; # Check interval (default: "30s")
timeout = "10s"; # Timeout (default: "10s")
conditions = [ # Check conditions
"[STATUS] == 200"
];
group = "web"; # Group name (default: "manual")
labels = {}; # Additional labels
enabled = true; # Enable check (default: true)
}
];
# Read-only aggregated data (automatically populated)
allMetrics = [...]; # All metrics from this node
allHealthChecks = [...]; # All health checks from this node
global = { # Global aggregation from all nodes
allMetrics = [...]; # All metrics from entire fleet
allHealthChecks = [...]; # All health checks from entire fleet
metricsByJobName = {...}; # Grouped by job name
healthChecksByGroup = {...}; # Grouped by group
summary = {
totalMetrics = 42;
totalHealthChecks = 15;
nodesCovered = ["node-01" "node-02"];
};
};
};
Logging System (homelab.logging.*)
homelab.logging = {
enable = true; # Enable logging system
# Promtail configuration
promtail = {
enable = true; # Enable Promtail (default: true)
port = 9080; # Promtail port (default: 9080)
clients = [ # Loki clients
{
url = "http://monitor.lab:3100/loki/api/v1/push";
tenant_id = null; # Optional tenant ID
}
];
};
# Log sources
sources = [
{
name = "app-logs"; # Source name
type = "file"; # Type: "journal" | "file" | "syslog" | "docker"
files = {
paths = ["/var/log/app.log"]; # File paths
multiline = { # Optional multiline config
firstLineRegex = "^\\d{4}-\\d{2}-\\d{2}";
maxWaitTime = "3s";
};
};
journal = { # Journal config (for type="journal")
path = "/var/log/journal";
};
labels = { # Additional labels
application = "myapp";
};
pipelineStages = []; # Promtail pipeline stages
enabled = true; # Enable source (default: true)
}
];
defaultLabels = { # Default labels for all sources
hostname = "node-01";
environment = "production";
location = "homelab";
};
# Read-only aggregated data
allSources = [...]; # All sources from this node
global = { # Global aggregation
allSources = [...]; # All sources from entire fleet
sourcesByType = {...}; # Grouped by type
summary = {
total = 25;
byType = {...};
byNode = {...};
};
};
};
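As a rough illustration of what a `file` source could turn into, the sketch below maps one source entry onto a Promtail scrape config. The mapping is an assumption, not the module's confirmed behavior: the merge order of labels and the extra `job` label are choices made for this example. The `__path__` label and the `multiline` stage keys (`firstline`, `max_wait_time`) follow Promtail's own configuration format.

```nix
let
  # Values copied from the example above; defaultLabels is trimmed for brevity.
  defaultLabels = { hostname = "node-01"; environment = "production"; };
  source = {
    name = "app-logs";
    type = "file";
    files = {
      paths = [ "/var/log/app.log" ];
      multiline = { firstLineRegex = "^\\d{4}-\\d{2}-\\d{2}"; maxWaitTime = "3s"; };
    };
    labels = { application = "myapp"; };
    pipelineStages = [ ];
  };

  # Promtail's multiline stage, built from the source's multiline options.
  multilineStage = {
    multiline = {
      firstline = source.files.multiline.firstLineRegex;
      max_wait_time = source.files.multiline.maxWaitTime;
    };
  };
in
{
  job_name = source.name;
  # One static config per file path; source labels override default labels.
  static_configs = map (path: {
    targets = [ "localhost" ];
    labels = defaultLabels // source.labels // { job = source.name; __path__ = path; };
  }) source.files.paths;
  pipeline_stages = [ multilineStage ] ++ source.pipelineStages;
}
```

Saved to a file, the expression can be inspected with `nix-instantiate --eval --strict`.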
Backup System (homelab.backups.*)
homelab.backups = {
enable = true; # Enable backup system
# Backup jobs
jobs = [
{
name = "database-backup"; # Job name
backend = "restic-s3"; # Backend name (must exist in backends)
backendOptions = { # Backend-specific overrides
repository = "custom-repo";
};
labels = { # Additional labels
type = "database";
};
}
];
# Backend configurations (defined by imported modules)
backends = {
restic-s3 = {...}; # Defined in restic.nix
};
defaultLabels = { # Default labels for all jobs
hostname = "node-01";
environment = "production";
location = "homelab";
};
monitoring = true; # Enable backup monitoring (default: true)
# Read-only aggregated data
allJobs = [...]; # All jobs from this node
allBackends = [...]; # All backend names from this node
global = { # Global aggregation
allJobs = [...]; # All jobs from entire fleet
allBackends = [...]; # All backends from entire fleet
jobsByBackend = {...}; # Grouped by backend
summary = {
total = 15;
byBackend = {...};
byNode = {...};
uniqueBackends = ["restic-s3" "borgbackup"];
};
};
};
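The requirement that a job's `backend` must exist in `backends` can be checked at evaluation time. The following is a minimal sketch of such a check, assuming the result is fed into the standard NixOS `assertions` list; the `media-backup` job and the message wording are hypothetical and only serve to show a failing case.

```nix
let
  # Backends declared on this node (contents elided, as in the example above).
  backends = { restic-s3 = { }; };

  jobs = [
    { name = "database-backup"; backend = "restic-s3"; }
    { name = "media-backup"; backend = "borgbackup"; } # hypothetical: backend not declared
  ];
in
{
  # Each element has the { assertion, message } shape that NixOS expects.
  assertions = map (job: {
    assertion = builtins.hasAttr job.backend backends;
    message = "homelab.backups: job '${job.name}' uses unknown backend '${job.backend}'";
  }) jobs;
}
```

With these inputs the first assertion holds and the second one fails, which would abort the build with the message for `media-backup`.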
Reverse Proxy System (homelab.reverseProxy.*)
homelab.reverseProxy = {
enable = true; # Enable reverse proxy system
# Proxy entries
entries = [
{
subdomain = "app"; # Subdomain
host = "localhost"; # Backend host (default: homelab.hostname)
port = 8080; # Backend port
path = "/"; # Backend path (default: "/")
enableAuth = false; # Enable authentication (default: false)
enableSSL = true; # Enable SSL (default: true)
}
];
# Read-only aggregated data
allEntries = [...]; # All entries from this node
global = { # Global aggregation
allEntries = [...]; # All entries from entire fleet
entriesBySubdomain = {...}; # Grouped by subdomain
entriesWithAuth = [...]; # Entries with authentication
entriesWithoutAuth = [...]; # Entries without authentication
summary = {
total = 12;
byNode = {...};
withAuth = 5;
withoutAuth = 7;
};
};
};
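The derived views (`entriesWithAuth`, `entriesWithoutAuth`, `entriesBySubdomain`, `summary`) are simple functions of `allEntries`. The sketch below shows one way to compute them with Nix builtins. The `publicHosts` attribute and its `<subdomain>.<externalDomain>` naming scheme are assumptions added for illustration; they are not options documented above.

```nix
let
  externalDomain = "procopius.dk";

  # Illustrative sample of homelab.reverseProxy.global.allEntries.
  allEntries = [
    { subdomain = "prometheus"; host = "monitor"; port = 9090; enableAuth = true; }
    { subdomain = "status"; host = "monitor"; port = 8080; enableAuth = false; }
  ];

  # partition returns { right = <matching>; wrong = <non-matching>; }.
  byAuth = builtins.partition (entry: entry.enableAuth) allEntries;
in
{
  entriesWithAuth = byAuth.right;
  entriesWithoutAuth = byAuth.wrong;
  entriesBySubdomain = builtins.groupBy (entry: entry.subdomain) allEntries;

  # Assumed naming scheme, not a documented option.
  publicHosts = map (entry: "${entry.subdomain}.${externalDomain}") allEntries;

  summary = {
    total = builtins.length allEntries;
    withAuth = builtins.length byAuth.right;
    withoutAuth = builtins.length byAuth.wrong;
  };
}
```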
Service Configuration Pattern
All services follow a consistent pattern with automatic monitoring, logging, and proxy integration.
Generic Service Structure (homelab.services.${serviceName}.*)
homelab.services.myservice = {
enable = true; # Enable the service
port = 8080; # Main service port
description = "My Service"; # Service description
# Monitoring integration (automatic when enabled)
monitoring = {
enable = true; # Enable monitoring (default: true when service enabled)
metrics = {
enable = true; # Enable metrics endpoint (default: true)
path = "/metrics"; # Metrics path (default: "/metrics")
extraEndpoints = [ # Additional metric endpoints
{
name = "admin-metrics";
port = 8081;
path = "/admin/metrics";
jobName = "myservice-admin";
}
];
};
healthCheck = {
enable = true; # Enable health check (default: true)
path = "/health"; # Health check path (default: "/health")
conditions = [ # Check conditions
"[STATUS] == 200"
];
extraChecks = [ # Additional health checks
{
name = "myservice-api";
port = 8080;
path = "/api/health";
conditions = ["[STATUS] == 200" "[RESPONSE_TIME] < 500"];
}
];
};
extraLabels = { # Additional labels for all monitoring
tier = "application";
};
};
# Logging integration (automatic when enabled)
logging = {
enable = true; # Enable logging
files = [ # Log files to collect
"/var/log/myservice/app.log"
"/var/log/myservice/error.log"
];
parsing = {
regex = "^(?P<timestamp>\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}) (?P<level>\\w+) (?P<message>.*)";
extractFields = ["level"]; # Fields to extract as labels
};
multiline = { # Multiline log handling
firstLineRegex = "^\\d{4}-\\d{2}-\\d{2}";
maxWaitTime = "3s";
};
extraLabels = { # Additional labels
application = "myservice";
};
extraSources = [ # Additional log sources
{
name = "myservice-access";
type = "file";
files.paths = ["/var/log/myservice/access.log"];
}
];
};
# Reverse proxy integration (automatic when enabled)
proxy = {
enable = true; # Enable reverse proxy
subdomain = "myservice"; # Subdomain (default: service name)
enableAuth = false; # Enable authentication (default: false)
additionalSubdomains = [ # Additional proxy entries
{
subdomain = "myservice-api";
port = 8081;
path = "/api";
enableAuth = true;
}
];
};
# Service-specific options
customOption = "value"; # Service-specific configuration
};
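To make the "automatic integration" concrete, here is a hedged sketch of how a service module might forward its feature options into the system-level lists. It is not the actual feature template from `lib/features/`: the wiring, the reuse of the service name as `jobName`, and the module layout are assumptions. Only `lib.mkIf` and `lib.optional` are standard nixpkgs functions.

```nix
# Hypothetical module fragment; the option declarations are assumed to exist elsewhere.
{ config, lib, ... }:
let
  cfg = config.homelab.services.myservice;
in
{
  config = lib.mkIf cfg.enable {
    # Contributes one metrics endpoint when metrics are enabled, otherwise nothing.
    homelab.monitoring.metrics =
      lib.optional (cfg.monitoring.enable && cfg.monitoring.metrics.enable) {
        name = "myservice";
        port = cfg.port;
        path = cfg.monitoring.metrics.path;
        jobName = "myservice";
        labels = cfg.monitoring.extraLabels;
      };

    # Contributes one health check when health checks are enabled.
    homelab.monitoring.healthChecks =
      lib.optional (cfg.monitoring.enable && cfg.monitoring.healthCheck.enable) {
        name = "myservice";
        port = cfg.port;
        path = cfg.monitoring.healthCheck.path;
        conditions = cfg.monitoring.healthCheck.conditions;
        labels = cfg.monitoring.extraLabels;
      };

    # Contributes one reverse proxy entry when the proxy is enabled.
    homelab.reverseProxy.entries = lib.optional cfg.proxy.enable {
      inherit (cfg.proxy) subdomain enableAuth;
      port = cfg.port;
    };
  };
}
```

Because list options merge across modules, every enabled service can append its own entries this way without knowing about the others.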
Example Service Implementations
Prometheus Service
homelab.services.prometheus = {
enable = true;
port = 9090;
# Prometheus-specific options
retention = "15d"; # Data retention period
alertmanager = {
enable = true;
url = "alertmanager.lab:9093";
};
extraScrapeConfigs = []; # Additional scrape configs
extraAlertingRules = []; # Additional alerting rules
globalConfig = { # Prometheus global config
scrape_interval = "15s";
evaluation_interval = "15s";
};
extraFlags = []; # Additional command line flags
ruleFiles = []; # Additional rule files
# Automatic integrations
monitoring.enable = true; # Self-monitoring
logging.enable = true; # Log collection
proxy = {
enable = true;
subdomain = "prometheus";
enableAuth = true; # Admin interface needs protection
};
};
Gatus Service
homelab.services.gatus = {
enable = true;
port = 8080;
# Gatus-specific options
ui = {
title = "Homelab Status";
header = "Homelab Services Status";
link = "https://status.procopius.dk";
buttons = [
{ name = "Grafana"; link = "https://grafana.procopius.dk"; }
{ name = "Prometheus"; link = "https://prometheus.procopius.dk"; }
];
};
alerting = { # Discord/Slack/etc notifications
discord = {
webhook-url = "https://discord.com/api/webhooks/...";
default-alert = {
enabled = true;
failure-threshold = 3;
success-threshold = 2;
};
};
};
storage = { # Storage backend
type = "memory"; # or "postgres", "sqlite"
};
web.address = "0.0.0.0";
extraConfig = {}; # Additional Gatus configuration
# Automatic integrations
monitoring.enable = true;
logging.enable = true;
proxy = {
enable = true;
subdomain = "status";
enableAuth = false; # Status page should be public
};
};
Global Aggregation System
The homelab system automatically aggregates configuration from all nodes in your fleet, so any node can run centralized monitoring and management against the whole fleet.
How Global Aggregation Works
- Local Configuration: Each node defines its own services and configurations
- Automatic Collection: The system automatically collects data from all nodes using the base.nix aggregator
- Enhancement: Each collected item is enhanced with node context (_nodeName, _nodeConfig, etc.)
- Global Exposure: Aggregated data is exposed in *.global.* options
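The four steps above can be sketched in plain Nix. This is a minimal illustration and not the contents of `base.nix`: the `nodes` attribute set, the `collect` helper, and the way other nodes' configurations are obtained are all assumptions made for the example.

```nix
let
  # Step 1 (hypothetical input): node name -> that node's evaluated config.
  nodes = {
    monitor.homelab.monitoring.allMetrics = [
      { name = "prometheus-main"; host = "monitor"; port = 9090; path = "/metrics"; jobName = "prometheus"; }
    ];
    web-01.homelab.monitoring.allMetrics = [
      { name = "node-exporter"; host = "web-01"; port = 9100; path = "/metrics"; jobName = "node"; }
    ];
  };

  # Steps 2 and 3: collect items from every node and attach node context.
  collect = getItems:
    builtins.concatLists (builtins.attrValues (builtins.mapAttrs
      (nodeName: nodeConfig:
        map (item: item // {
          _nodeName = nodeName;
          _nodeConfig = nodeConfig;
          _fullAddress = "${item.host}:${toString item.port}";
        }) (getItems nodeConfig))
      nodes));

  allMetrics = collect (node: node.homelab.monitoring.allMetrics);
in
{
  # Step 4: expose the aggregate together with grouped views and a summary.
  inherit allMetrics;
  metricsByJobName = builtins.groupBy (metric: metric.jobName) allMetrics;
  summary = {
    totalMetrics = builtins.length allMetrics;
    nodesCovered = builtins.attrNames nodes;
  };
}
```

The same `collect` helper could be reused for log sources, backup jobs, and proxy entries by passing a different accessor function.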
Global Data Structure
# Available on every node with global data from entire fleet
homelab.monitoring.global = {
allMetrics = [ # All metrics from all nodes
{
name = "prometheus-main";
host = "monitor";
port = 9090;
# ... other fields
_nodeName = "monitor"; # Source node name
_nodeConfig = {...}; # Source node config
_fullAddress = "monitor:9090";
_metricsUrl = "http://monitor:9090/metrics";
}
# ... more metrics from other nodes
];
allHealthChecks = [...]; # All health checks from all nodes
metricsByJobName = { # Grouped by Prometheus job name
"prometheus" = [...];
"node" = [...];
};
healthChecksByGroup = { # Grouped by health check group
"services" = [...];
"infrastructure" = [...];
};
summary = {
totalMetrics = 42;
totalHealthChecks = 15;
nodesCovered = ["monitor" "web-01" "db-01"];
};
};
homelab.logging.global = {
allSources = [...]; # All log sources from all nodes
sourcesByType = {
"file" = [...];
"journal" = [...];
};
summary = {...};
};
homelab.backups.global = {
allJobs = [...]; # All backup jobs from all nodes
allBackends = [...]; # All backup backends from all nodes
jobsByBackend = {...};
summary = {...};
};
homelab.reverseProxy.global = {
allEntries = [...]; # All proxy entries from all nodes
entriesBySubdomain = {...};
entriesWithAuth = [...];
entriesWithoutAuth = [...];
summary = {...};
};
Using Global Data
Services like Prometheus and Gatus automatically use global data:
# Prometheus scrapes every metrics endpoint in the fleet
services.prometheus.scrapeConfigs = [...]; # Generated from homelab.monitoring.global.allMetrics
# Gatus monitors every health check in the fleet
services.gatus.settings.endpoints = [...]; # Generated from homelab.monitoring.global.allHealthChecks
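One plausible shape for that generation step is sketched below. The helper names `toScrapeConfig` and `toGatusEndpoint` are hypothetical. Taking the path and interval from the first endpoint of each job is an assumption of this sketch. It groups endpoints by job because Prometheus requires unique `job_name` values. The URL construction only covers `http` and `https` checks with a non-null port.

```nix
let
  # Illustrative sample of homelab.monitoring.global.allMetrics.
  allMetrics = [
    { jobName = "node"; host = "web-01"; port = 9100; path = "/metrics"; scrapeInterval = "30s"; labels = { }; }
    { jobName = "node"; host = "db-01"; port = 9100; path = "/metrics"; scrapeInterval = "30s"; labels = { }; }
    { jobName = "prometheus"; host = "monitor"; port = 9090; path = "/metrics"; scrapeInterval = "30s"; labels = { }; }
  ];

  # Illustrative sample of homelab.monitoring.global.allHealthChecks.
  allHealthChecks = [
    {
      name = "web-service"; group = "web"; protocol = "http"; method = "GET";
      host = "web-01"; port = 80; path = "/health";
      interval = "30s"; conditions = [ "[STATUS] == 200" ];
    }
  ];

  # One scrape config per job, with one static config per endpoint.
  toScrapeConfig = jobName: endpoints:
    let first = builtins.head endpoints; in
    {
      job_name = jobName;
      metrics_path = first.path;
      scrape_interval = first.scrapeInterval;
      static_configs = map (m: {
        targets = [ "${m.host}:${toString m.port}" ];
        labels = m.labels;
      }) endpoints;
    };

  # One Gatus endpoint per health check.
  toGatusEndpoint = check: {
    inherit (check) name group method interval conditions;
    url = "${check.protocol}://${check.host}:${toString check.port}${check.path}";
  };
in
{
  scrapeConfigs = builtins.attrValues
    (builtins.mapAttrs toScrapeConfig (builtins.groupBy (m: m.jobName) allMetrics));
  endpoints = map toGatusEndpoint allHealthChecks;
}
```

With the sample data this yields two scrape configs (`node` with two targets, `prometheus` with one) and a single Gatus endpoint.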
Integration Examples
Adding a New Service
- Create the service configuration:
homelab.services.myapp = {
enable = true;
port = 3000;
monitoring.enable = true;
logging.enable = true;
proxy = {
enable = true;
subdomain = "myapp";
};
};
- The system automatically:
  - Adds metrics endpoint to Prometheus (fleet-wide)
  - Adds health check to Gatus (fleet-wide)
  - Configures log collection to Loki
  - Sets up reverse proxy entry
  - Exposes the service globally for other nodes
Multi-Node Setup
# Node 1 (monitor.nix)
homelab = {
hostname = "monitor";
services.prometheus.enable = true;
services.gatus.enable = true;
};
# Node 2 (web.nix)
homelab = {
hostname = "web-01";
services.nginx.enable = true;
services.webapp.enable = true;
};
# Node 3 (database.nix)
homelab = {
hostname = "db-01";
services.postgresql.enable = true;
services.redis.enable = true;
};
Result: The monitor node automatically discovers and monitors all services across all three nodes, because Prometheus and Gatus on that node read the fleet-wide *.global.* data.
File Structure
homelab/
├── default.nix              # Main homelab options and imports
├── lib/
│   ├── systems/             # Core system modules
│   │   ├── monitoring.nix   # Monitoring aggregation
│   │   ├── logging.nix      # Logging aggregation
│   │   ├── backups.nix      # Backup aggregation
│   │   └── proxy.nix        # Reverse proxy aggregation
│   ├── features/            # Service feature modules
│   │   ├── monitoring.nix   # Service monitoring template
│   │   ├── logging.nix      # Service logging template
│   │   └── proxy.nix        # Service proxy template
│   └── aggregators/
│       └── base.nix         # Base aggregation functions
└── services/                # Individual service implementations
    ├── prometheus.nix
    ├── gatus.nix
    └── ...
This architecture provides a scalable, consistent way to manage a homelab fleet with automatic service discovery, monitoring, and management across all nodes.