Last month, one of my customers experienced a hard stop in their running application. This exception was unhandled and resulted in the system not processing data for approximately 12 hours. Fortunately, the rest of the system is properly decoupled and there was no loss of any data flowing in from the web.
Once I entered the application screen, I quickly saw the problem: the message broker, one of the application's most important underlying dependencies, had stopped responding. Without the broker running, no messages could be passed, the workers could not receive tasks to execute, the queues could not connect over AMQP, and the system stopped.
I immediately restarted the Ubuntu instance where this application had failed and, once the system was back up, restarted the application.
The system went back to work and processed all of the data that was waiting.
The next step was to find a solution to monitor my message broker.
Then I read several posts about Monit, started to read the Monit manual and got started right away in the home lab.
HomeLab Sidebar: My Intel NUC7I3BNH is running ESXi 6.5 Hypervisor. I seriously love this fan-less architecture. The 32GB of memory and a 1 TB 970EVO M.2 SSD keep this system humming. I deploy Ubuntu, Mint or Debian instances with ease.
I really like the embedded web interface that each Monit instance runs locally, but I manage dozens of servers. I need a central repository for my monitoring, especially in the cloud.
Ok, let's get started... My machine is running...
craig@precision-5810:~$ uname -a Linux precision-5810 4.15.0-38-generic #41~16.04.1-Ubuntu...
First, I installed Monit on my Precision 5520 and 5810, two Ubuntu instances running on the ESXi hypervisor, and one of my many Raspberry Pi 3s, which runs my bitcoin mining software, cgminer.
craig@precision-5810:~$ sudo apt install monit
Next, I downloaded the M/Monit archive from the M/Monit site and extracted it to my Apps folder. I then created an entry in Startup Apps so that it starts at login.
This fires up the M/Monit web admin on localhost. Let's check our network stack.
craig@precision-5810:~$ netstat -tulpn
This shows us our network status.
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
... ...
tcp        0      0 127.0.0.1:2812          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      4300/mmonit
... ...
Great. I see mmonit (PID 4300) listening on port 8080.
Point your browser to http://localhost:8080/. Default install login is admin/swordfish. Change your admin password.
Next, configure your message server and authentication details. If necessary, add a new user. I'm leaving the default settings.
Turn your attention now to the BEST free and open source Linux monitoring tool available (imho)...
Monit is easily configurable with a simple text-based config file. That's it.
Let's open the current config file.
craig@precision-5810:~$ sudo nano /etc/monit/monitrc
This gist contains my default Monit config.
For this example, I'm going to monitor several standard Linux services on my Precision 5810. Those are: RabbitMQ, MySQL, Apache2 and Redis. I am including MongoDB in my example config because I will be monitoring MongoDB on several cloud server instances in AWS.
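For readers who want a starting point, here is a minimal sketch of what one such service check might look like. It is not the config from my gist. The Redis pidfile path, the `service` wrapper path, and the `monit:monit` collector credentials are assumptions based on stock Ubuntu and M/Monit installs, so verify them on your own system. The script only writes the fragment to a scratch file for review; it does not touch /etc/monit/monitrc.

```shell
#!/bin/sh
# Sketch only: write an example Monit fragment to a scratch file for review.
# Paths and credentials below are assumptions, not values from the article.
service_fragment="$(mktemp)"

cat > "$service_fragment" <<'EOF'
# Report status and events to M/Monit (placeholder credentials and host)
set mmonit http://monit:monit@localhost:8080/collector

# Redis: restart if the port stops answering
check process redis with pidfile /var/run/redis/redis-server.pid
    start program = "/usr/sbin/service redis-server start"
    stop program  = "/usr/sbin/service redis-server stop"
    if failed host 127.0.0.1 port 6379 then restart
EOF

echo "Wrote example fragment to $service_fragment"
```

After reviewing the fragment, paste it into /etc/monit/monitrc and repeat the `check process` block for each service you want to watch.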
To apply the changes in the updated Monit config file, first check the syntax of the config file.
craig@precision-5810:~$ sudo monit -t
Control syntax OK
craig@precision-5810:~$ sudo service monit restart
craig@precision-5810:~$ sudo monit restart all
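If you edit the config often, the check-then-restart sequence can be wrapped so that a broken config never reaches the running daemon. This is a small sketch, not something Monit ships. `apply_monit_config`, `MONIT` and `SERVICE` are hypothetical names I made up for illustration; the two variables exist only so the function can be dry-run without sudo.

```shell
#!/bin/sh
# Restart Monit only if the control file passes the syntax check.
# MONIT and SERVICE are hypothetical overrides used for dry runs.
apply_monit_config() {
    if ${MONIT:-sudo monit} -t; then
        # Syntax OK: safe to restart the daemon with the new config
        ${SERVICE:-sudo service} monit restart
    else
        echo "monitrc has syntax errors; leaving the running Monit untouched" >&2
        return 1
    fi
}
```

Calling `apply_monit_config` with no overrides runs `sudo monit -t` followed by `sudo service monit restart`, the same two commands shown above.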
Back in the M/Monit admin:
Log in with your username and password.
Click the bars and select Status.
Detailed Precision 5810 Monit status:
Looks great. Events are being sent to my email. Every event is being logged.
What makes Monit really shine is that these monitored services are now self-healing. If a service fails, you are alerted and Monit automatically restarts the service, up to a limited threshold that you configure in the Monit config file. Had I been using Monit last month when the message broker failed, I would have been alerted, the service would have been restarted, and the 12-hour stoppage would not have occurred.
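To make the restart threshold concrete, here is a hedged sketch of a RabbitMQ check with a restart limit. The pidfile path and the 3-in-5 limit are assumptions chosen for illustration, not values from my production config. As before, the script only writes the fragment to a scratch file.

```shell
#!/bin/sh
# Sketch only: a self-healing check with a cap on restart attempts.
# The pidfile path and the thresholds are assumptions for illustration.
threshold_fragment="$(mktemp)"

cat > "$threshold_fragment" <<'EOF'
check process rabbitmq with pidfile /var/run/rabbitmq/pid
    start program = "/usr/sbin/service rabbitmq-server start"
    stop program  = "/usr/sbin/service rabbitmq-server stop"
    # AMQP listener not answering: restart the broker
    if failed port 5672 type tcp then restart
    # Stop trying after repeated restarts and wait for a human
    if 3 restarts within 5 cycles then unmonitor
EOF

echo "Wrote example fragment to $threshold_fragment"
```

The last rule keeps Monit from restarting a service in an endless loop when the underlying problem needs manual attention.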
Time to deploy my M/Monit and Monit to my client's instances.
SSH in to each of these instances one at a time. Update and install Monit. Paste the config for each service type into /etc/monit/monitrc.
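With more than a handful of instances, the install step can be looped. The sketch below is one possible approach under stated assumptions: the host names, the `ubuntu` login and the key path are placeholders, and `deploy_all` and `RUN` are hypothetical names. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical inventory and paths; replace with your own.
HOSTS="app1.example.com app2.example.com"
KEYFILE="/path/to/keyfile.pem"

# Prints each ssh command by default (dry run).
# Set RUN to an empty string (RUN= deploy_all) to really execute them.
deploy_all() {
    runner="${RUN-echo}"
    for host in $HOSTS; do
        $runner ssh -i "$KEYFILE" "ubuntu@$host" \
            "sudo apt update && sudo apt install -y monit"
    done
}

deploy_all
```

The config still has to be pasted into /etc/monit/monitrc on each host afterwards; this loop covers only the update and install step.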
I deploy M/Monit to one of the servers in the group. Fire up the M/Monit webserver.
Since this instance is now running on the Internet, we have to secure the M/Monit Web Admin with an SSH tunnel (which will encrypt all of our data).
Never expose user credentials to unsecure websites.
So true... so let's make sure we do that first. I like to run my SSH tunnels in a screen session. You already know I'm a huge fan of terminal multiplexing.
craig@precision-5810:~$ screen -dmS monit ssh -i /path/to/keyfile.pem -L 8088:localhost:8080 username@server_ip_address
If I run netstat again, I can see my newly created SSH tunnel on localhost port 8088 with pid 420.
craig@precision-5810:~$ netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
... ...
tcp        0      0 127.0.0.1:2812          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:8088            0.0.0.0:*               LISTEN      420/ssh
... ...
Awesome. I'm now forwarding my local port 8088 to the remote server's localhost port 8080, where my cloud M/Monit webserver instance is running.
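Rather than reading the netstat listing by eye each time, the same check can be scripted. This is a rough sketch, and `has_listener` is a hypothetical helper name. It assumes the classic `netstat -tln` column layout shown above (local address in column 4, state in column 6), so it will not work on `ss` output.

```shell
#!/bin/sh
# Succeeds if the netstat listing on stdin shows a TCP listener on port $1.
# Assumes netstat's column order: Proto Recv-Q Send-Q Local Foreign State.
has_listener() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $4 ~ (":" port "$") && $6 == "LISTEN" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage:
#   netstat -tln | has_listener 8088 && echo "tunnel is up"
```

A check like this is handy in a cron job or login script that recreates the screen session when the tunnel has dropped.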
Fire up your browser and point to http://localhost:8088/index.csp
Log in and configure your server as before. This session now runs over the encrypted SSH tunnel, so the username and password are not exposed in plain text on the Internet.
Set up messaging servers, message content, configure options and set up any necessary users.
Check out M/Monit's built in graphs and reporting tools!
The free and open source Monit is a great tool for monitoring your Linux instance's services, processes, filesystems, files, hosts and so much more. Configure it to alert on memory usage, or create filesystem alerts for when your instance runs low on disk space.
Finally, I am also using Monit's check host feature. I may even stop using Monitis in the near future, now that I can monitor the uptime of frontend websites myself. It's built into Monit.
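As a hedged illustration of those resource alerts, the sketch below writes a fragment with a memory rule and a disk space rule. The 80% and 90% thresholds and the `rootfs` service name are arbitrary choices for the example, so tune them to your instances.

```shell
#!/bin/sh
# Sketch only: resource alerts for memory and disk space.
# Thresholds and the "rootfs" name are illustrative assumptions.
resource_fragment="$(mktemp)"

cat > "$resource_fragment" <<'EOF'
# Whole-machine checks; $HOST is expanded by Monit, not by the shell
check system $HOST
    if memory usage > 80% for 5 cycles then alert

# Root filesystem: warn before the disk fills up
check filesystem rootfs with path /
    if space usage > 90% then alert
EOF

echo "Wrote example fragment to $resource_fragment"
```

The quoted heredoc delimiter keeps the shell from expanding `$HOST`, so Monit sees the variable literally and substitutes the host name itself.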
... to check a website, append to any config...
check host hostname with address x.x.x.x
    if failed port 443 protocol https
        and request /about with content = "SimplePythonFoo()"
    then alert
Don't let uncaught and unhandled exceptions in your running applications ruin your morning. Grab a coffee and sit back and let Monit + M/Monit do the work.