As a Database Administrator, Monitoring of database is the most important part. As all we know Prevention is better than cure,
So. I am in this article comes with Replication Monitoring.
In replication monitoring, mainly we need to look into three things “sql_thread_running”, “io_thread_running” and the lag between master and slave.
The below script will help you in replication monitoring.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
#!/bin/bash #change this in order to indicate which slave you are monitoring now server_name="server1@example.com" #email of the database administrator to recieve notification admin_email='dba@example.com' admin_email1='manager@example.com' # Add mysql credentials in the following commands. sql_thread_running=$(mysql -uuser -ppassword -e "show slave status\G" | awk -F":" '/Slave_SQL_Running/ { print $2 }' | tr -d ' ') io_thread_running=$(mysql -uuser -ppassword -e "show slave status\G" | awk -F":" '/Slave_IO_Running/ { print $2 }' | tr -d ' ') seconds_late=$(mysql -uuser -ppassword -e "show slave status\G" | awk -F":" '/Seconds_Behind_Master/ { print $2 }' | tr -d ' ') seconds_late=$(($seconds_late+0)) if [ "$sql_thread_running" = "No" ] || [ "$io_thread_running" = "No" ] || [ $seconds_late -gt 3600 ]; then log_file="/tmp/log_slave_status_$(date +%m-%d-%Y-%H:%M)" echo "Replication is stopped between master and (Slave) on $(date +%m-%d-%Y-%H:%M)" >> $log_file if [ "$sql_thread_running" = "No" ]; then echo "-SQL Thread not running" >> $log_file fi if [ "$io_thread_running" = "No" ]; then echo "-IO thread not running" >> $log_file fi if [ $seconds_late -gt 3600 ]; then #formattting how the latency of the slave behind master should be displayed display_late="$seconds_late seconds" if [ $seconds_late -gt 60 ]; then display_late="$display_late = $(($seconds_late/60)) minutes" fi if [ $seconds_late -gt 3600 ]; then display_late="$display_late = $(($seconds_late/3600)) hours" fi echo "slave is behind master by $display_late" >> $log_file fi #echo send alerts to the admin mail -s "[Replication stop ALERT] Replication is stopped between master and Slave on $(date +%m-%d-%Y-%H:%M)" $admin_email $admin_email1 < $log_file echo "Slave not running, alerts sent to the admins..." else echo "slave is running normally, no problem detected :)" fi |
Thank You.