Feb 17, 2012

Opsview with Heartbeat (Linux-HA) and DRBD on CentOS 5.5


After spending many hours reading about how to set up a fully functional Opsview instance on CentOS 5.5, I finally documented everything.


Opsview is a product of http://www.opsview.com


INSTALL CentOS
We have two servers with two NICs each. One NIC is for the heartbeat and the other connects to the network. We synchronize the database via MySQL replication. First you need to set up CentOS on both servers. We partitioned our 300 GB hard disk like this; you can keep the same layout.
Maybe you need more space on the /-partition. Change it as you like.
/boot        256MB
/            100 GB
/swap        8192MB

Keep the rest as free space. The free space is used afterwards for our DRBD and our Opsview installation. We keep it as free space because we create the DRBD partition afterwards; DRBD needs a partition that does not already contain a filesystem. Complete the CentOS installation with the "Server without GUI" option.
Configure the network interfaces during the installation. You can change the settings afterwards in these files:
/etc/sysconfig/network-scripts/ifcfg-eth0 and ifcfg-eth1
# ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
HWADDR=D8:23:42:E2:C4:48
IPADDR=10.20.20.1
NETMASK=255.255.255.0
ONBOOT=yes
DHCP_HOSTNAME=ops1

# ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
HWADDR=D8:23:42:E2:C4:4A
IPADDR=10.10.10.1
NETMASK=255.255.255.0
ONBOOT=yes
DHCP_HOSTNAME=ops1
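After changing these files you can apply the new settings without a reboot:
service network restart
# verify the addresses
ifconfig eth0
ifconfig eth1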

If you are behind a proxy you need to edit the yum.conf and wgetrc config files. Add the following line to /etc/yum.conf
proxy=http://proxyuser:proxypassword@www.yourproxy.net:8080

Add the following Lines to /etc/wgetrc
http_proxy = http://proxyuser:proxypassword@www.yourproxy.net:8080
use_proxy = on

Then we need to deactivate the automatic yum updates.
chkconfig --del yum-updatesd

Turn off iptables (at your own risk!).
You can also adapt the rules to fit your needs; a sketch follows below.
We just turned it off because we are behind a firewall.
service iptables save
service iptables stop
chkconfig iptables off
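If you prefer to keep iptables running, a minimal rule set for this setup could look like the following sketch (the ports are the ones used in this guide: heartbeat 694/udp, DRBD 7789/tcp, HTTP 80/tcp, MySQL 3306/tcp; the rules have to end up before any final REJECT rule, so adapt them to your existing chain):
# heartbeat and DRBD only on the dedicated link (eth1)
iptables -A INPUT -i eth1 -p udp --dport 694 -j ACCEPT
iptables -A INPUT -i eth1 -p tcp --dport 7789 -j ACCEPT
# Opsview web interface
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
# MySQL replication between the two nodes
iptables -A INPUT -p tcp -s 10.20.20.0/24 --dport 3306 -j ACCEPT
# make the rules persistent
service iptables save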

Disable SELinux at your own risk.
Open /etc/selinux/config with your editor and change the following line to
SELINUX=disabled
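If you prefer to do it from the shell, the following has the same effect (setenforce 0 switches SELinux off immediately; the config change takes effect at the next reboot):
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0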

Update all installed packages
yum update

To keep our time synchronized we install ntp.
yum install ntp

To start ntpd at boot, sync the time once now and start ntpd, type the following.
chkconfig ntpd on
ntpdate www.yourtimeserver.net
/etc/init.d/ntpd start

Edit the /etc/ntp.conf
server www.yourtimeserver.net
driftfile /var/lib/ntp/drift
keys /etc/ntp/keys
logfile /var/log/ntp.log
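You can check that ntpd is actually talking to your time server with:
# show the peers ntpd is synchronising against
ntpq -p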

We also set up a cron job to keep the hardware clock synced. Create this file and edit it: /etc/cron.hourly/timesync
#!/bin/sh
/sbin/hwclock --systohc > /dev/null 2>&1

Change the permissions of the cron shell script to make it executable.
chmod u+x /etc/cron.hourly/timesync

Edit the /etc/hosts file to tell the servers to use the heartbeat interface for heartbeat and DRBD.
127.0.0.1               localhost.localdomain localhost
10.10.10.1              ops1
10.10.10.2              ops2

Create the DRBD partition in the remaining free space. Change the device to your device name.
fdisk /dev/cciss/c0d0
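Inside fdisk the dialog looks roughly like this (a sketch: we pick partition number 4 because the DRBD configuration below references /dev/cciss/c0d0p4; accept the default start and end blocks so the partition fills the remaining free space):
n          # new partition
p          # primary partition
4          # partition number (matches /dev/cciss/c0d0p4 used below)
<Enter>    # first cylinder: accept the default
<Enter>    # last cylinder: accept the default, use all remaining free space
w          # write the partition table and exit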

Reboot to reread the new partition table and install DRBD.
yum --enablerepo=extras install drbd83.x86_64 kmod-drbd83.x86_64 
Edit the /etc/drbd.conf file. The host names (ops1 and ops2) have to match the output of uname -n! c0d0p4 is the newly created partition; edit this to fit your needs. This file must be the same on both machines.
# syncer rate
common { syncer { rate 100M; } }
# resource
resource r0 {
protocol C;
startup { wfc-timeout 60; degr-wfc-timeout     60; }
on ops1 {
device /dev/drbd0;
disk /dev/cciss/c0d0p4;
address 10.10.10.1:7789;
meta-disk internal;
}
on ops2 {
device /dev/drbd0;
disk /dev/cciss/c0d0p4;
address 10.10.10.2:7789;
meta-disk internal;
}
}
After this, load the drbd module and bring up the resource.
modprobe drbd
drbdadm create-md r0
drbdadm up r0 
Make one of the two servers the primary DRBD device. On this server issue the following command. The initial sync may take a while; for a 300 GB partition it took more than 9 hours.
drbdadm -- --overwrite-data-of-peer primary r0 
You can verify what is happening with this command.
cat /proc/drbd 
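The initial synchronisation runs in the background; to follow its progress continuously you can use watch (installed by default on CentOS):
watch -n2 cat /proc/drbd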
When the synchronisation is finished, the output of cat /proc/drbd should look like this. Primary/Secondary and UpToDate/UpToDate are the important parts.
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
ns:637594804 nr:156556 dw:637751360 dr:2061497 al:2582800 bm:56 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 
Now create a filesystem on the DRBD device (on the primary only; DRBD replicates it to the peer).
mkfs.ext3 /dev/drbd0 
Install heartbeat
yum --enablerepo=extras install heartbeat.x86_64 
Edit/Create the /etc/ha.d/ha.cf. Again, the host names must match the output of uname -n. The ping directive pings the gateway to see if the uplink is reachable. The ha.cf needs to be the same on both machines.
debug 0
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility     local0
keepalive       1
deadtime        10
warntime        5
initdead        60
udpport     694
auto_failback   off
node    ops1
node    ops2
ping    10.20.20.1
respawn hacluster /usr/lib64/heartbeat/ipfail
bcast   eth1
crm no 
Create a mount point for the DRBD partition.
mkdir /mnt/shared/ 
Edit/Create the /etc/ha.d/haresources file. First only add the httpd service. After the Opsview installation you have to add the Opsview services (opsview opsview-web opsview-agent). If you add a service which can't be started via /etc/init.d, heartbeat will fail. The IP address 10.20.20.3 is the shared virtual address: ops1 is 10.20.20.1, ops2 is 10.20.20.2 and the virtual IP shared by both is 10.20.20.3.
ops1 10.20.20.3 drbddisk::r0 Filesystem::/dev/drbd0::/mnt/shared::ext3 httpd 
Edit/Create /etc/ha.d/authkeys
auth 1
1 crc 
Set the permissions.
chmod 600 /etc/ha.d/authkeys 
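Note that crc only adds a checksum and no real authentication. If the heartbeat link is not a dedicated crossover cable, sha1 with a shared secret is the safer choice, for example:
auth 1
1 sha1 yoursharedsecret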
Start heartbeat to mount the DRBD device.
/etc/init.d/heartbeat start 
Check the output of /var/log/ha-debug to verify heartbeat is working. Check if the virtual IP appears in the ifconfig output and if you see the DRBD device in the df output. Then your heartbeat configuration is working.
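On the active node the checks could look like this (10.20.20.3 is the shared address from haresources):
# the virtual IP should show up as an alias (eth0:0)
ifconfig | grep 10.20.20.3
# the DRBD device should be mounted
df -h /mnt/shared
# and this node should be Primary
cat /proc/drbd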

INSTALL Opsview

Only start the installation (on both machines) once heartbeat is working and the DRBD device is properly mounted on your primary machine. Otherwise the installation will work, but you will have problems later. First create the directories and symlinks.
mkdir /mnt/shared/nagios
ln -s /mnt/shared/nagios /usr/local/nagios
mkdir /mnt/shared/nagios-home
ln -s /mnt/shared/nagios-home /var/log/nagios 
Install RPMforge
wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm
rpm -Uhv rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm 
Configure/Create the repo file for Opsview: /etc/yum.repos.d/opsview.repo
[opsview]
name = Opsview
baseurl = http://downloads.opsera.com/opsview-community/latest/yum/centos/$releasever/$basearch
enabled = 1
protect = 0
gpgcheck = 0 
Now install Opsview via yum.
yum install opsview 
Add this line to the .bash_profile file in root's home directory.
test -f /usr/local/nagios/bin/profile && . /usr/local/nagios/bin/profile 
Change to the nagios user and enter the following lines.
su nagios
cp /usr/local/nagios/etc/opsview.defaults  /usr/local/nagios/etc/opsview.conf
Edit the opsview.conf file and change the default passwords.
$dbpasswd = "passwd";
$odw_dbpasswd = "passwd";
$runtime_dbpasswd = "passwd";
$reports_dbpasswd = "passwd";

As root, enable the MySQL service at boot, start it and change the MySQL root password.
chkconfig mysqld on
/etc/init.d/mysqld start
/usr/bin/mysqladmin -u root password 'yourmysqlrootpassword'

Run the Opsview Database installation scripts
/usr/local/nagios/bin/db_mysql -u root -p
/usr/local/nagios/bin/db_opsview db_install
/usr/local/nagios/bin/db_runtime db_install
/usr/local/nagios/bin/db_odw db_install
/usr/local/nagios/bin/db_reports db_install

Again as the nagios user, generate the first configuration.
/usr/local/nagios/bin/rc.opsview gen_config

After you have installed Opsview you can edit the /etc/ha.d/haresources file, add the Opsview services and restart heartbeat.
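With the Opsview services added, the haresources line could look like this sketch (heartbeat starts the resources from left to right and stops them in reverse order; adjust the order to your needs), followed by a heartbeat restart on both nodes:
ops1 10.20.20.3 drbddisk::r0 Filesystem::/dev/drbd0::/mnt/shared::ext3 opsview-agent opsview opsview-web httpd
/etc/init.d/heartbeat restart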

MySQL Replication

Create the replication user and grant access.
mysql -u root -p
mysql>GRANT REPLICATION SLAVE ON *.* TO 'repl'@'ops1' IDENTIFIED BY 'yourslavepass';
mysql>GRANT REPLICATION SLAVE ON *.* TO 'repl'@'ops2' IDENTIFIED BY 'yourslavepass';

Add the following lines to the [mysqld] section of your /etc/my.cnf
log-bin=mysql-bin
expire_logs_days=3
server-id=1        # use 1 on ops1 and 2 on ops2
innodb_flush_log_at_trx_commit=1
sync_binlog=1

After editing, restart the MySQL service. To synchronize the two databases for the first time (afterwards the replication does this automatically), do the following.
Get a mysqldump from your primary machine.
mysqldump -p --all-databases --master-data > opsview_dump.sql

Move this data via scp to the other machine.
scp opsview_dump.sql ops2:/root

Load the dump on the second machine and start the slave.
mysql -p < opsview_dump.sql
mysql>CHANGE MASTER TO
MASTER_HOST='ops1',
MASTER_USER='repl',
MASTER_PASSWORD='slavepass';
mysql>START SLAVE;
Verify the MySQL replication on the slave with:
mysql>show processlist;
mysql>show slave status \G

It should look similar to this (in a healthy setup both Slave_IO_Running and Slave_SQL_Running should say Yes).
mysql> show processlist;
+--------+-------------+-----------+------+---------+--------+----------------------------------+------------------+
| Id     | User        | Host      | db   | Command | Time   | State                            | Info             |
+--------+-------------+-----------+------+---------+--------+----------------------------------+------------------+
|      2 | system user |           | NULL | Connect | 600278 | Waiting for master to send event | NULL             |
| 133423 | root        | localhost | NULL | Query   |      0 | NULL                             | show processlist |
+--------+-------------+-----------+------+---------+--------+----------------------------------+------------------+
mysql> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: ops1
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000216
Read_Master_Log_Pos: 571652841
Relay_Log_File: mysqld-relay-bin.000274
Relay_Log_Pos: 907145962
Relay_Master_Log_File: mysql-bin.000195
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Seconds_Behind_Master: NULL

After a failover you just need to create a dump of the database on the old slave host.
When the old master is back again, transfer the dump, load it into the old master and make it the slave.
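As a sketch, the failback could look like this (host names and the replication user are the ones used above):
# on the current master (ops2): dump with the binlog position included
mysqldump -p --all-databases --master-data > failback_dump.sql
scp failback_dump.sql ops1:/root

# on the old master (ops1): load the dump and make it the slave of ops2
mysql -p < failback_dump.sql
mysql -p -e "CHANGE MASTER TO MASTER_HOST='ops2', MASTER_USER='repl', MASTER_PASSWORD='yourslavepass'; START SLAVE;"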
