News: Downtime on 14/01/2024 - Cause and Our Follow-Up
Dear JimatHosting user,
Our vendor received an update request from CloudLinux (our security provider) to patch a vulnerability. This also involved a kernel update, which requires a reboot at a later time.
Our vendor asked for our permission first, and we gave the green light.
The update was applied to all 15 servers. On 14 of them, both the update and the reboot completed normally.
On our last server, an unexpected issue came up during the reboot: one hard disk was not detected.
As our attempts to connect through IPMI failed, we dispatched our engineer to the data center.
On arrival, our engineer found that the server was not up.
They shut the server down for 5 minutes and then attempted to start it again.
The usual slot-out-slot-in procedure (removing and reseating the disk) was also carried out to make sure the hard disk connection was working correctly.
None of the methods above worked.
Our engineer then had to restore that particular hard disk and restore GRUB.
The hard disk restoration completed, but the cPanel settings had all been reset, so our users could log in to cPanel but no website was up.
This prompted further checks.
Our engineer noticed that the /home folders for our users were jumbled up: some folders that were supposed to be in /home5 ended up in /home6.
This meant that our fstab was not mapping the mounts correctly.
After all the fixes, the server is now up.
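For readers who manage their own servers, here is a minimal Python sketch of one way to spot fstab entries that could cause this kind of mix-up. It assumes, and this is only an assumption, not a confirmed cause of our incident, that entries referring to raw device names such as /dev/sdf1 are the risky ones, since such names are not guaranteed to stay the same across reboots, while UUID= or LABEL= entries are. The function name and the sample entries are hypothetical and are not taken from our server.

```python
def find_unstable_mounts(fstab_text):
    """Return (device, mount_point) pairs that use a raw kernel device name."""
    # Assumption: these prefixes cover the raw names we want to flag.
    raw_prefixes = ("/dev/sd", "/dev/hd", "/dev/vd")
    risky = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        if device.startswith(raw_prefixes):
            risky.append((device, mount_point))
    return risky


# Hypothetical sample, not the real server's fstab.
SAMPLE_FSTAB = """\
# <device>      <mount>  <type>  <options>  <dump> <pass>
UUID=1111-aaaa  /home5   ext4    defaults   0      2
/dev/sdf1       /home6   ext4    defaults   0      2
"""

print(find_unstable_mounts(SAMPLE_FSTAB))
```

In this sample only the /home6 entry is flagged, because it is the only one that names a raw device instead of a UUID.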
Our Follow-Up Solution
We believe this is the second time this issue has happened on the same server within 1 month, which is not acceptable. We are in the middle of purchasing a new server and will migrate all our clients to it within 3-4 weeks.
Meanwhile, we will monitor the server closely.