GreenArrow Email Software Documentation

High Availability Cluster Administration

Overview

This page provides information on how to perform common administrative tasks on a High Availability Cluster. Its target audience is experienced Linux systems administrators. Please contact GreenArrow technical support if you’d like assistance.

Note for Managed Customers

As part of our Managed Services, we perform the tasks described on this page for you. You’re still welcome to perform them yourself if you prefer.

Diagnostics

The following sequence can be used to verify that services are running normally on both servers.

  1. To verify that DRBD replication is working correctly and determine whether the server you’re logged into is acting as the Primary or Secondary, run the following:

    drbd-overview
    

    You should see one or more lines of output similar to the following:

    1:r0/0  Connected Primary/Secondary UpToDate/UpToDate /media/drbd1 xfs 1.1T 333G 729G 32%
    

    Each line represents a DRBD device which holds a portion of GreenArrow’s data. Here are the key portions of each line to look at:

    • Connected means that data mirroring is active. This is the normal state. If you see some other state, then review DRBD’s connection states documentation to find out its meaning.

    • Primary/Secondary means that you’re currently SSHed into the Primary server, and it sees the Secondary server. If you were SSHed into the Secondary server, then you would instead see Secondary/Primary.

    • UpToDate/UpToDate means that the two servers have the same data on their storage devices. If you see a different value, then review DRBD’s disk states documentation to find out its meaning.

  2. To verify that GreenArrow’s services are running correctly on the server that’s currently acting as the Primary, run:

    service greenarrow status
    

    Any services with an abnormal state will be shown in red. The hvmail-qmail-smtpd3 and hvmail-dnscache services are down by default. All other services should be reported as up unless you shut them down intentionally.

  3. To verify that the Secondary server is connected to the Primary, and replicating PostgreSQL updates, run the following command on the Primary:

    /var/hvmail/postgres/default/bin/psql -U postgres -q -t -A -c "SELECT COUNT(1) FROM pg_stat_replication WHERE state = 'streaming'"
    

    This command returns the number of servers that are streaming updates from the Primary. Normally the result is 1.

  4. To check how many seconds PostgreSQL replication is lagging behind the Primary, run the following command on the Secondary server:

    /var/hvmail/postgres/default/bin/psql -U postgres -q -t -A -c "SELECT EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())"
    

    Replication will normally be a fraction of a second behind. If you intend to set up monitoring, then we recommend allowing for some momentary spikes, since the completion of a long-running query could take a number of seconds to propagate. PostgreSQL backups can also cause replication delays while they’re running.
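
If you intend to automate the checks in steps 1, 3, and 4, the following is a minimal sketch of how their output could be classified. The function names, the OK/WARN output format, and any thresholds you pass in are hypothetical and not part of GreenArrow. The DRBD check assumes the whitespace-separated field order shown in the sample drbd-overview line above.

```shell
#!/bin/sh
# Sketch only: these helper names and the OK/WARN format are hypothetical.

# Classify one line of drbd-overview output (step 1).
# Assumed field order: resource, connection state, roles, disk states, ...
check_drbd_line() {
    printf '%s\n' "$1" | awk '{
        if ($2 != "Connected")         { print "WARN connection=" $2; exit }
        if ($4 != "UpToDate/UpToDate") { print "WARN disk=" $4; exit }
        split($3, role, "/")
        print "OK local_role=" role[1]
    }'
}

# Compare the streaming-replica count (step 3) with the expected count.
check_streaming() {
    if [ "$1" -eq "$2" ]; then
        echo "OK streaming=$1"
    else
        echo "WARN streaming=$1 expected=$2"
    fi
}

# Compare replication lag in seconds (step 4) with a maximum.
# awk is used because the lag value is usually fractional.
check_lag() {
    awk -v lag="$1" -v max="$2" 'BEGIN {
        if (lag == "")               { print "WARN lag=unknown" }
        else if (lag + 0 > max + 0)  { print "WARN lag=" lag "s" }
        else                         { print "OK lag=" lag "s" }
    }'
}

# Example wiring (commented out so the sketch has no side effects):
#   drbd-overview | while read -r line; do check_drbd_line "$line"; done
#   check_streaming "$count" 1   # $count captured from the step 3 command
#   check_lag "$lag" 30          # $lag from the step 4 command; 30 is an arbitrary maximum
```

Choose a lag maximum large enough to tolerate the momentary spikes described in step 4, or only alert when the value stays high across several consecutive checks.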

Cron Jobs

Place any cron jobs that should run on the server currently acting as Primary in the /etc/cron.d/greenarrow-active-node configuration file. Changes made to this file are automatically synchronized between servers.
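
As an illustration, an entry in this file might look like the fragment below. This assumes the file follows the usual /etc/cron.d format, which includes a user field between the schedule and the command. The schedule and the script path are hypothetical placeholders, not something shipped with GreenArrow.

```
# /etc/cron.d/greenarrow-active-node
# Hypothetical entry: run a site-specific script daily at 02:30 as root.
# Fields: minute hour day-of-month month day-of-week user command
30 2 * * * root /usr/local/bin/example-nightly-report.sh
```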

Backup and Restore Procedures

You can take backups using the Unmanaged Backups script and/or have GreenArrow manage backups.

Only GreenArrow can restore High Availability Cluster functionality. You’re welcome to use the backup restoration documentation, but please note that the resulting configuration will not include a High Availability Cluster.