Homelab Configuration Documentation

Overview

This homelab configuration system provides a unified way to manage services across multiple nodes with automatic aggregation of monitoring, logging, backup, and reverse proxy configurations. The system is built on NixOS and follows a modular architecture with both local and global configuration scopes.

Core Homelab Options

Basic Configuration (homelab.*)

homelab = {
  enable = true;                    # Enable homelab fleet configuration
  hostname = "node-01";             # Hostname for this system
  domain = "lab";                   # Base domain for the homelab (default: "lab")
  externalDomain = "procopius.dk";  # External (public) domain for the homelab
  environment = "production";       # Environment type: "production" | "staging" | "development"
  location = "homelab";             # Physical location identifier
  tags = ["web" "database"];       # Tags for this system
};
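
The comments above imply a type and, in some cases, a default for each option. As a rough sketch, such options might be declared as a NixOS module like the one below. The declarations are inferred from the comments only and are not the contents of default.nix; the default for environment in particular is an assumption.

```nix
# Hypothetical option declarations inferred from the comments above.
{ lib, ... }:
{
  options.homelab = {
    enable = lib.mkEnableOption "homelab fleet configuration";

    hostname = lib.mkOption {
      type = lib.types.str;
      description = "Hostname for this system";
    };

    domain = lib.mkOption {
      type = lib.types.str;
      default = "lab";
      description = "Base domain for the homelab";
    };

    environment = lib.mkOption {
      type = lib.types.enum [ "production" "staging" "development" ];
      default = "production"; # assumed default, not stated in the source
      description = "Environment type";
    };

    tags = lib.mkOption {
      type = lib.types.listOf lib.types.str;
      default = [ ];
      description = "Tags for this system";
    };
  };
}
```

Using an enum type means a misspelled environment value fails at evaluation time rather than silently producing wrong labels.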

Monitoring System (homelab.monitoring.*)

homelab.monitoring = {
  enable = true;                    # Enable monitoring system

  # Node exporter (automatically enabled)
  nodeExporter = {
    enable = true;                  # Enable node exporter (default: true)
    port = 9100;                    # Node exporter port (default: 9100)
  };

  # Manual metrics endpoints
  metrics = [
    {
      name = "custom-app";           # Metric endpoint name
      host = "localhost";            # Host (default: homelab.hostname)
      port = 8080;                   # Port for metrics endpoint
      path = "/metrics";             # Metrics path (default: "/metrics")
      jobName = "custom";            # Prometheus job name
      scrapeInterval = "30s";        # Scrape interval (default: "30s")
      labels = {                     # Additional labels
        component = "web";
      };
    }
  ];

  # Manual health checks
  healthChecks = [
    {
      name = "web-service";          # Health check name
      host = "localhost";            # Host (default: homelab.hostname)
      port = 80;                     # Port (nullable)
      path = "/health";              # Health check path (default: "/")
      protocol = "http";             # Protocol: "http" | "https" | "tcp" | "icmp"
      method = "GET";                # HTTP method (default: "GET")
      interval = "30s";              # Check interval (default: "30s")
      timeout = "10s";               # Timeout (default: "10s")
      conditions = [                 # Check conditions
        "[STATUS] == 200"
      ];
      group = "web";                 # Group name (default: "manual")
      labels = {};                   # Additional labels
      enabled = true;                # Enable check (default: true)
    }
  ];

  # Read-only aggregated data (automatically populated)
  allMetrics = [...];              # All metrics from this node
  allHealthChecks = [...];         # All health checks from this node
  global = {                       # Global aggregation from all nodes
    allMetrics = [...];            # All metrics from entire fleet
    allHealthChecks = [...];       # All health checks from entire fleet
    metricsByJobName = {...};      # Grouped by job name
    healthChecksByGroup = {...};   # Grouped by group
    summary = {
      totalMetrics = 42;
      totalHealthChecks = 15;
      nodesCovered = ["node-01" "node-02"];
    };
  };
};
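
The grouped and summary values could be derived from a flat list roughly as follows. This is a minimal sketch in plain Nix, assuming entries shaped like the metrics items above; the sample data is made up and the real module may compute these differently.

```nix
# Evaluate with: nix-instantiate --eval --strict group.nix
let
  # Hypothetical sample entries shaped like homelab.monitoring.metrics items.
  allMetrics = [
    { name = "node-exporter"; host = "node-01"; port = 9100; jobName = "node"; }
    { name = "node-exporter"; host = "node-02"; port = 9100; jobName = "node"; }
    { name = "custom-app"; host = "node-01"; port = 8080; jobName = "custom"; }
  ];

  # builtins.groupBy returns an attrset keyed by the string the function yields.
  metricsByJobName = builtins.groupBy (m: m.jobName) allMetrics;
in
{
  inherit metricsByJobName;

  summary = {
    totalMetrics = builtins.length allMetrics;
    # Grouping by host and taking the keys gives the unique host names.
    nodesCovered = builtins.attrNames (builtins.groupBy (m: m.host) allMetrics);
  };
}
```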

Logging System (homelab.logging.*)

homelab.logging = {
  enable = true;                    # Enable logging system

  # Promtail configuration
  promtail = {
    enable = true;                  # Enable Promtail (default: true)
    port = 9080;                    # Promtail port (default: 9080)
    clients = [                     # Loki clients
      {
        url = "http://monitor.lab:3100/loki/api/v1/push";
        tenant_id = null;           # Optional tenant ID
      }
    ];
  };

  # Log sources
  sources = [
    {
      name = "app-logs";            # Source name
      type = "file";                # Type: "journal" | "file" | "syslog" | "docker"
      files = {
        paths = ["/var/log/app.log"]; # File paths
        multiline = {               # Optional multiline config
          firstLineRegex = "^\\d{4}-\\d{2}-\\d{2}";
          maxWaitTime = "3s";
        };
      };
      journal = {                   # Journal config (for type="journal")
        path = "/var/log/journal";
      };
      labels = {                    # Additional labels
        application = "myapp";
      };
      pipelineStages = [];          # Promtail pipeline stages
      enabled = true;               # Enable source (default: true)
    }
  ];

  defaultLabels = {                 # Default labels for all sources
    hostname = "node-01";
    environment = "production";
    location = "homelab";
  };

  # Read-only aggregated data
  allSources = [...];              # All sources from this node
  global = {                       # Global aggregation
    allSources = [...];            # All sources from entire fleet
    sourcesByType = {...};         # Grouped by type
    summary = {
      total = 25;
      byType = {...};
      byNode = {...};
    };
  };
};
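
To illustrate how a file source might end up in Promtail's configuration, here is a hedged sketch that turns one source into a Promtail scrape config. The helper toScrapeConfig is hypothetical. The output uses Promtail's static_configs, the __path__ label, and the multiline stage (firstline, max_wait_time); how the actual module performs this mapping is an assumption.

```nix
let
  defaultLabels = {
    hostname = "node-01";
    environment = "production";
    location = "homelab";
  };

  # A source shaped like the homelab.logging.sources entry above.
  source = {
    name = "app-logs";
    type = "file";
    files = {
      paths = [ "/var/log/app.log" ];
      multiline = {
        firstLineRegex = "^\\d{4}-\\d{2}-\\d{2}";
        maxWaitTime = "3s";
      };
    };
    labels = { application = "myapp"; };
    pipelineStages = [ ];
  };

  # Hypothetical helper: one static_config per file path, because
  # __path__ holds a single path or glob.
  toScrapeConfig = src: {
    job_name = src.name;
    static_configs = map (path: {
      targets = [ "localhost" ];
      labels = defaultLabels // src.labels // { __path__ = path; };
    }) src.files.paths;
    # The multiline stage runs first, followed by any user-supplied stages.
    pipeline_stages =
      [
        {
          multiline = {
            firstline = src.files.multiline.firstLineRegex;
            max_wait_time = src.files.multiline.maxWaitTime;
          };
        }
      ]
      ++ src.pipelineStages;
  };
in
toScrapeConfig source
```

Source labels are merged after defaultLabels, so a source can override a default such as environment.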

Backup System (homelab.backups.*)

homelab.backups = {
  enable = true;                    # Enable backup system

  # Backup jobs
  jobs = [
    {
      name = "database-backup";     # Job name
      backend = "restic-s3";        # Backend name (must exist in backends)
      backendOptions = {            # Backend-specific overrides
        repository = "custom-repo";
      };
      labels = {                    # Additional labels
        type = "database";
      };
    }
  ];

  # Backend configurations (defined by imported modules)
  backends = {
    restic-s3 = {...};             # Defined in restic.nix
  };

  defaultLabels = {                 # Default labels for all jobs
    hostname = "node-01";
    environment = "production";
    location = "homelab";
  };

  monitoring = true;                # Enable backup monitoring (default: true)

  # Read-only aggregated data
  allJobs = [...];                 # All jobs from this node
  allBackends = [...];             # All backend names from this node
  global = {                       # Global aggregation
    allJobs = [...];               # All jobs from entire fleet
    allBackends = [...];           # All backends from entire fleet
    jobsByBackend = {...};         # Grouped by backend
    summary = {
      total = 15;
      byBackend = {...};
      byNode = {...};
      uniqueBackends = ["restic-s3" "borgbackup"];
    };
  };
};
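
Because each job's backend must exist in backends, a check along the following lines could catch typos at evaluation time. This is a sketch in plain Nix with made-up sample data; in a real module the resulting messages would more likely feed NixOS assertions, which is an assumption about the implementation.

```nix
let
  backends = {
    restic-s3 = { };
  };

  jobs = [
    { name = "database-backup"; backend = "restic-s3"; }
    { name = "media-backup"; backend = "borgbackup"; } # hypothetical job with no matching backend
  ];

  # Keep only the jobs whose backend name is not a key of `backends`.
  invalidJobs = builtins.filter (job: !(builtins.hasAttr job.backend backends)) jobs;
in
map (job: "backup job '${job.name}' references unknown backend '${job.backend}'") invalidJobs
```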

Reverse Proxy System (homelab.reverseProxy.*)

homelab.reverseProxy = {
  enable = true;                    # Enable reverse proxy system

  # Proxy entries
  entries = [
    {
      subdomain = "app";            # Subdomain
      host = "localhost";           # Backend host (default: homelab.hostname)
      port = 8080;                  # Backend port
      path = "/";                   # Backend path (default: "/")
      enableAuth = false;           # Enable authentication (default: false)
      enableSSL = true;             # Enable SSL (default: true)
    }
  ];

  # Read-only aggregated data
  allEntries = [...];              # All entries from this node
  global = {                       # Global aggregation
    allEntries = [...];            # All entries from entire fleet
    entriesBySubdomain = {...};    # Grouped by subdomain
    entriesWithAuth = [...];       # Entries with authentication
    entriesWithoutAuth = [...];    # Entries without authentication
    summary = {
      total = 12;
      byNode = {...};
      withAuth = 5;
      withoutAuth = 7;
    };
  };
};
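
The auth split and summary counts can be sketched with builtins.partition. The _fqdn and _upstream fields below are hypothetical names, and joining subdomain with externalDomain is an assumption suggested by URLs such as https://status.procopius.dk elsewhere in this document.

```nix
let
  externalDomain = "procopius.dk";

  entries = [
    { subdomain = "app"; host = "localhost"; port = 8080; enableAuth = false; }
    { subdomain = "prometheus"; host = "monitor"; port = 9090; enableAuth = true; }
  ];

  # Hypothetical enrichment: public name and backend address for each entry.
  withAddresses = e: e // {
    _fqdn = "${e.subdomain}.${externalDomain}";
    _upstream = "http://${e.host}:${toString e.port}";
  };

  # partition returns { right = <matching>; wrong = <non-matching>; }.
  split = builtins.partition (e: e.enableAuth) (map withAddresses entries);
in
{
  entriesWithAuth = split.right;
  entriesWithoutAuth = split.wrong;
  summary = {
    total = builtins.length entries;
    withAuth = builtins.length split.right;
    withoutAuth = builtins.length split.wrong;
  };
}
```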

Service Configuration Pattern

All services follow a consistent pattern with automatic monitoring, logging, and proxy integration.

Generic Service Structure (homelab.services.${serviceName}.*)

homelab.services.myservice = {
  enable = true;                    # Enable the service
  port = 8080;                      # Main service port
  description = "My Service";       # Service description

  # Monitoring integration (automatic when enabled)
  monitoring = {
    enable = true;                  # Enable monitoring (default: true when service enabled)

    metrics = {
      enable = true;                # Enable metrics endpoint (default: true)
      path = "/metrics";            # Metrics path (default: "/metrics")
      extraEndpoints = [            # Additional metric endpoints
        {
          name = "admin-metrics";
          port = 8081;
          path = "/admin/metrics";
          jobName = "myservice-admin";
        }
      ];
    };

    healthCheck = {
      enable = true;                # Enable health check (default: true)
      path = "/health";             # Health check path (default: "/health")
      conditions = [                # Check conditions
        "[STATUS] == 200"
      ];
      extraChecks = [               # Additional health checks
        {
          name = "myservice-api";
          port = 8080;
          path = "/api/health";
          conditions = ["[STATUS] == 200" "[RESPONSE_TIME] < 500"];
        }
      ];
    };

    extraLabels = {                 # Additional labels for all monitoring
      tier = "application";
    };
  };

  # Logging integration (automatic when enabled)
  logging = {
    enable = true;                  # Enable logging
    files = [                       # Log files to collect
      "/var/log/myservice/app.log"
      "/var/log/myservice/error.log"
    ];

    parsing = {
      regex = "^(?P<timestamp>\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}) (?P<level>\\w+) (?P<message>.*)";
      extractFields = ["level"];   # Fields to extract as labels
    };

    multiline = {                   # Multiline log handling
      firstLineRegex = "^\\d{4}-\\d{2}-\\d{2}";
      maxWaitTime = "3s";
    };

    extraLabels = {                 # Additional labels
      application = "myservice";
    };

    extraSources = [                # Additional log sources
      {
        name = "myservice-access";
        type = "file";
        files.paths = ["/var/log/myservice/access.log"];
      }
    ];
  };

  # Reverse proxy integration (automatic when enabled)
  proxy = {
    enable = true;                  # Enable reverse proxy
    subdomain = "myservice";        # Subdomain (default: service name)
    enableAuth = false;             # Enable authentication (default: false)

    additionalSubdomains = [        # Additional proxy entries
      {
        subdomain = "myservice-api";
        port = 8081;
        path = "/api";
        enableAuth = true;
      }
    ];
  };

  # Service-specific options
  customOption = "value";           # Service-specific configuration
};
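
The "automatic" integration presumably means each service's monitoring block is translated into entries of homelab.monitoring.metrics. A minimal sketch of that translation in plain Nix is shown below. The helper metricsFor and the "-metrics" naming are hypothetical and do not describe the actual feature module.

```nix
let
  hostname = "node-01";

  # Hypothetical helper: derive metrics entries from one service definition.
  metricsFor = name: svc:
    let
      mon = svc.monitoring;
      # Fields shared by the main endpoint and every extra endpoint.
      base = { host = hostname; labels = mon.extraLabels; };
    in
    if svc.enable && mon.enable && mon.metrics.enable then
      [ (base // { name = "${name}-metrics"; port = svc.port; path = mon.metrics.path; jobName = name; }) ]
      ++ map (extra: base // extra) mon.metrics.extraEndpoints
    else
      [ ];

  myservice = {
    enable = true;
    port = 8080;
    monitoring = {
      enable = true;
      extraLabels = { tier = "application"; };
      metrics = {
        enable = true;
        path = "/metrics";
        extraEndpoints = [
          { name = "admin-metrics"; port = 8081; path = "/admin/metrics"; jobName = "myservice-admin"; }
        ];
      };
    };
  };
in
metricsFor "myservice" myservice
```

A disabled service yields an empty list, so the per-node lists can be concatenated without extra filtering.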

Example Service Implementations

Prometheus Service

homelab.services.prometheus = {
  enable = true;
  port = 9090;

  # Prometheus-specific options
  retention = "15d";                # Data retention period
  alertmanager = {
    enable = true;
    url = "alertmanager.lab:9093";
  };
  extraScrapeConfigs = [];          # Additional scrape configs
  extraAlertingRules = [];          # Additional alerting rules
  globalConfig = {                  # Prometheus global config
    scrape_interval = "15s";
    evaluation_interval = "15s";
  };
  extraFlags = [];                  # Additional command line flags
  ruleFiles = [];                   # Additional rule files

  # Automatic integrations
  monitoring.enable = true;         # Self-monitoring
  logging.enable = true;            # Log collection
  proxy = {
    enable = true;
    subdomain = "prometheus";
    enableAuth = true;              # Admin interface needs protection
  };
};

Gatus Service

homelab.services.gatus = {
  enable = true;
  port = 8080;

  # Gatus-specific options
  ui = {
    title = "Homelab Status";
    header = "Homelab Services Status";
    link = "https://status.procopius.dk";
    buttons = [
      { name = "Grafana"; link = "https://grafana.procopius.dk"; }
      { name = "Prometheus"; link = "https://prometheus.procopius.dk"; }
    ];
  };

  alerting = {                      # Discord/Slack/etc notifications
    discord = {
      webhook-url = "https://discord.com/api/webhooks/...";
      default-alert = {
        enabled = true;
        failure-threshold = 3;
        success-threshold = 2;
      };
    };
  };

  storage = {                       # Storage backend
    type = "memory";                # or "postgres", "sqlite"
  };

  web.address = "0.0.0.0";
  extraConfig = {};                 # Additional Gatus configuration

  # Automatic integrations
  monitoring.enable = true;
  logging.enable = true;
  proxy = {
    enable = true;
    subdomain = "status";
    enableAuth = false;             # Status page should be public
  };
};

Global Aggregation System

The homelab system automatically aggregates configuration from all nodes in your fleet, so a single node can run centralized monitoring and management for every other node.

How Global Aggregation Works

  1. Local Configuration: Each node defines its own services and configurations
  2. Automatic Collection: The system automatically collects data from all nodes using the base.nix aggregator
  3. Enhancement: Each collected item is enhanced with node context (_nodeName, _nodeConfig, etc.)
  4. Global Exposure: Aggregated data is exposed in *.global.* options
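
The four steps above can be sketched in plain Nix. Here nodes is a hypothetical stand-in for however the fleet's evaluated configurations are made available, and collect and enhance are illustrative names rather than the functions in base.nix.

```nix
let
  # Step 1: each node defines its own configuration (made-up sample data).
  nodes = {
    monitor = {
      homelab.monitoring.allMetrics = [
        { name = "prometheus-main"; host = "monitor"; port = 9090; path = "/metrics"; }
      ];
    };
    "web-01" = {
      homelab.monitoring.allMetrics = [
        { name = "webapp"; host = "web-01"; port = 8080; path = "/metrics"; }
      ];
    };
  };

  # Step 3: attach node context to a collected item.
  enhance = nodeName: nodeConfig: item: item // {
    _nodeName = nodeName;
    _nodeConfig = nodeConfig;
    _fullAddress = "${item.host}:${toString item.port}";
    _metricsUrl = "http://${item.host}:${toString item.port}${item.path}";
  };

  # Step 2: collect one option from every node into a flat list.
  collect = getItems:
    builtins.concatLists (builtins.attrValues (builtins.mapAttrs
      (nodeName: nodeConfig: map (enhance nodeName nodeConfig) (getItems nodeConfig))
      nodes));

  allMetrics = collect (cfg: cfg.homelab.monitoring.allMetrics);
in
# Step 4: the kind of value that would be exposed under a *.global.* option.
{
  urls = map (m: m._metricsUrl) allMetrics;
  nodesCovered = builtins.attrNames nodes;
}
```

Passing the accessor as a function lets the same collect serve logging, backups, and proxy entries.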

Global Data Structure

# Available on every node; contains data from the entire fleet
homelab.monitoring.global = {
  allMetrics = [                    # All metrics from all nodes
    {
      name = "prometheus-main";
      host = "monitor";
      port = 9090;
      # ... other fields
      _nodeName = "monitor";        # Source node name
      _nodeConfig = {...};          # Source node config
      _fullAddress = "monitor:9090";
      _metricsUrl = "http://monitor:9090/metrics";
    }
    # ... more metrics from other nodes
  ];

  allHealthChecks = [...];          # All health checks from all nodes
  metricsByJobName = {              # Grouped by Prometheus job name
    "prometheus" = [...];
    "node" = [...];
  };
  healthChecksByGroup = {           # Grouped by health check group
    "services" = [...];
    "infrastructure" = [...];
  };
  summary = {
    totalMetrics = 42;
    totalHealthChecks = 15;
    nodesCovered = ["monitor" "web-01" "db-01"];
  };
};

homelab.logging.global = {
  allSources = [...];               # All log sources from all nodes
  sourcesByType = {
    "file" = [...];
    "journal" = [...];
  };
  summary = {...};
};

homelab.backups.global = {
  allJobs = [...];                  # All backup jobs from all nodes
  allBackends = [...];              # All backup backends from all nodes
  jobsByBackend = {...};
  summary = {...};
};

homelab.reverseProxy.global = {
  allEntries = [...];               # All proxy entries from all nodes
  entriesBySubdomain = {...};
  entriesWithAuth = [...];
  entriesWithoutAuth = [...];
  summary = {...};
};

Using Global Data

Services like Prometheus and Gatus automatically use global data:

# Prometheus automatically scrapes ALL metrics from the entire fleet
services.prometheus.scrapeConfigs = [...];   # Generated from homelab.monitoring.global.allMetrics

# Gatus automatically monitors ALL health checks from the entire fleet
services.gatus.settings.endpoints = [...];   # Generated from homelab.monitoring.global.allHealthChecks
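
As an illustration of what that generation could look like, the sketch below maps sample global data to Prometheus scrape configs and Gatus endpoints. The helpers toScrapeConfig and toGatusEndpoint are hypothetical. The sketch assumes that endpoints sharing a job name also share a path and interval, and it only handles HTTP-style checks with a non-null port.

```nix
let
  # Hypothetical sample of homelab.monitoring.global data.
  global = {
    allMetrics = [
      { jobName = "node"; path = "/metrics"; scrapeInterval = "30s"; labels = { }; _nodeName = "monitor"; _fullAddress = "monitor:9100"; }
      { jobName = "node"; path = "/metrics"; scrapeInterval = "30s"; labels = { }; _nodeName = "web-01"; _fullAddress = "web-01:9100"; }
    ];
    allHealthChecks = [
      { name = "web-service"; group = "web"; protocol = "http"; host = "web-01"; port = 80; path = "/health"; interval = "30s"; conditions = [ "[STATUS] == 200" ]; }
    ];
  };

  # One scrape config per job name, one static_config per endpoint.
  toScrapeConfig = jobName: endpoints:
    let first = builtins.head endpoints; in
    {
      job_name = jobName;
      scrape_interval = first.scrapeInterval;
      metrics_path = first.path;
      static_configs = map (m: {
        targets = [ m._fullAddress ];
        labels = m.labels // { node = m._nodeName; };
      }) endpoints;
    };

  # One Gatus endpoint per health check.
  toGatusEndpoint = c: {
    inherit (c) name group interval conditions;
    url = "${c.protocol}://${c.host}:${toString c.port}${c.path}";
  };
in
{
  scrapeConfigs = builtins.attrValues
    (builtins.mapAttrs toScrapeConfig (builtins.groupBy (m: m.jobName) global.allMetrics));
  endpoints = map toGatusEndpoint global.allHealthChecks;
}
```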

Integration Examples

Adding a New Service

  1. Create the service configuration:
homelab.services.myapp = {
  enable = true;
  port = 3000;
  monitoring.enable = true;
  logging.enable = true;
  proxy = {
    enable = true;
    subdomain = "myapp";
  };
};
  2. The system automatically:
    • Adds metrics endpoint to Prometheus (fleet-wide)
    • Adds health check to Gatus (fleet-wide)
    • Configures log collection to Loki
    • Sets up reverse proxy entry
    • Exposes the service globally for other nodes

Multi-Node Setup

# Node 1 (monitor.nix)
homelab = {
  hostname = "monitor";
  services.prometheus.enable = true;
  services.gatus.enable = true;
};

# Node 2 (web.nix)
homelab = {
  hostname = "web-01";
  services.nginx.enable = true;
  services.webapp.enable = true;
};

# Node 3 (database.nix)
homelab = {
  hostname = "db-01";
  services.postgresql.enable = true;
  services.redis.enable = true;
};

Result: The monitor node automatically discovers and monitors all services across all three nodes.

File Structure

homelab/
├── default.nix              # Main homelab options and imports
├── lib/
│   ├── systems/              # Core system modules
│   │   ├── monitoring.nix    # Monitoring aggregation
│   │   ├── logging.nix       # Logging aggregation
│   │   ├── backups.nix       # Backup aggregation
│   │   └── proxy.nix         # Reverse proxy aggregation
│   ├── features/             # Service feature modules
│   │   ├── monitoring.nix    # Service monitoring template
│   │   ├── logging.nix       # Service logging template
│   │   └── proxy.nix         # Service proxy template
│   └── aggregators/
│       └── base.nix          # Base aggregation functions
└── services/                 # Individual service implementations
    ├── prometheus.nix
    ├── gatus.nix
    └── ...

This architecture provides a scalable, consistent way to manage a homelab fleet with automatic service discovery, monitoring, and management across all nodes.