When you try to access the MGR GUI (SMSR), it is unreachable or returns an internal server error.
Some possible reasons for the MGR GUI being unavailable are a lack of disk space or high memory usage. To start investigating, run the following commands on the MGR node to check whether the MGR and web server (httpd) processes are running:
ps -ef | grep -i mgr
ps -ef | grep -i httpd
The output of the df -kh command shows whether there is free disk space left on the node.
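A quick way to script this check is sketched below. The /data mount point and the 90% threshold are assumptions for illustration, not product defaults:

```shell
#!/bin/sh
# Sketch: report whether a filesystem has crossed a usage threshold.
# Arguments: mount point (default /data) and threshold percentage (default 90).
check_disk() {
  mount_point=${1:-/data}
  threshold=${2:-90}
  # df -kP guarantees one POSIX-format line per filesystem; field 5 is "Use%".
  usage=$(df -kP "$mount_point" | awk 'NR==2 { sub("%", "", $5); print $5 }')
  if [ "$usage" -ge "$threshold" ]; then
    echo "WARNING: $mount_point is ${usage}% full"
  else
    echo "OK: $mount_point is ${usage}% full"
  fi
}

check_disk / 90
```

On the MGR node you would point it at the data filesystem, e.g. check_disk /data 90.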
If the disk has indeed run out of space, proceed with the steps in the next section. Otherwise, please create a support ticket and include the output of the commands you have run so that the support team can guide you on the solution.
MGR Node Disk Has Run Out Of Space
It is possible that the ibdata1 file is using most of the disk space. This happens because MGR uses the InnoDB engine for MySQL, and the ibdata1 file holds the data for all tables in a single shared tablespace. As more data is entered into the database, the ibdata1 file grows. When data is deleted from the DB, the ibdata1 file does not shrink; instead, it keeps the freed space internally and reuses it when new data is added.
The steps to follow in cases like this are:
- Check if the stv_db_aggregate and stv_db_archive processes are enabled on the MGR system (they should run regularly via cron). If they are not, see the section "What To Do When stv_db_aggregate and stv_db_archive Processes Are Disabled" below.
- Back up (dump) all MySQL databases (use mysqldump and store the dumps in another location). Check below for the list of databases.
- Drop the existing databases (delete them from MySQL).
- Stop MySQL.
- Delete the ibdata1 and iblog files.
- Start MySQL, causing the ibdata1 file to be recreated with the default size of 10 MB. If MySQL fails to start, see the section "What To Do When MySQL Does Not Start" below instead of proceeding with the following steps.
- Recreate the databases.
- Restore the backed-up databases (load the dumps back into MySQL).
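The cycle above can be sketched as a shell session. Database names, the /backup path, and the "service mysqld" commands are placeholders for your environment; the script echoes what it would do, and setting DRY_RUN to empty would execute the commands for real:

```shell
#!/bin/sh
# Dry-run sketch of the ibdata1 rebuild cycle. All names are illustrative.
DRY_RUN=${DRY_RUN:-yes}
run() {
  if [ "$DRY_RUN" = "yes" ]; then echo "would run: $*"; else "$@"; fi
}

rebuild_ibdata() {
  db_list=$1                                            # space-separated database names
  backup=/backup/all_databases.sql
  run mysqldump --all-databases --result-file="$backup" # dump everything first
  for db in $db_list; do
    run mysql -e "DROP DATABASE $db"                    # drop the existing databases
  done
  run service mysqld stop                               # stop MySQL
  run rm /data/mysqldata/ibdata1 /data/mysqldata/ib_logfile0 /data/mysqldata/ib_logfile1
  run service mysqld start                              # ibdata1 recreated at ~10 MB
  for db in $db_list; do
    run mysql -e "CREATE DATABASE $db"                  # recreate the databases
  done
  run mysql -e "source $backup"                         # restore the dumps
}

rebuild_ibdata "mgr_db stv_statistics"                  # hypothetical database names
```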
The databases that must go through this process are:
What To Do When stv_db_aggregate and stv_db_archive Processes Are Disabled
If these processes have not been running regularly, there could be a large backlog of stv_statistics data to aggregate. Depending on the amount of data, the aggregation could take a very long time to complete.
Two options are available in this case:
- Drop all stv_* databases and recreate them in MySQL. This restores MGR quickly, but all STV statistics collected to date are lost.
- Process the existing statistics before proceeding with the remaining steps. To do that:
- Stop the STV data collection procedure so that no new data is stored in the database during processing.
- Change the polling interval from 1 minute to 1 hour to slow down the collection process and reduce overhead.
- Reduce data retention period for all statistics.
- Start the aggregation process and monitor it. This step can take an extremely long time.
- Start the archive process and monitor.
- Once aggregation and archive are completed, dump the databases using mysqldump.
- Change the settings to point MySQL back to the original /data directory.
- Rebuild the MySQL databases in the /data directory and restore the backed-up (reduced) data.
To help you decide between the two options: with 300 GB of data to aggregate, option 1 takes a few hours, while option 2 can take two weeks or more.
What To Do When MySQL Does Not Start
MySQL does not start when the disk space on /data is 100% used. If the ibdata1 file occupies almost all of that space, use the following workaround so that MySQL can start:
- Copy the /data/mysqldata/ directory to another node with enough disk space.
- Create the MGR and STV databases on that node.
- Mount an external USB disk to that node.
- Take a mysqldump of all MGR and STV databases onto the external disk.
- Delete the ibdata1 and iblog files from the production MGR server.
- Start MySQL on the production MGR server; a new ibdata1 file is created automatically.
- Connect the external disk (which contains the backup of all databases) to the MGR server.
- Restore all databases to the MGR server.
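The workaround can be sketched as the dry-run shell session below. Host, device, and mount-point names ("mgr-node", "backup-node", /dev/sdb1, /mnt/usb) are placeholders, not product defaults; the run wrapper only echoes the commands:

```shell
#!/bin/sh
# Dry-run sketch of the full-disk workaround; all names are illustrative.
run() { echo "would run: $*"; }   # change the body to "$@" to execute for real

# On the backup node (with enough free space):
run rsync -a mgr-node:/data/mysqldata/ /spare/mysqldata/  # copy the data directory
run mount /dev/sdb1 /mnt/usb                              # attach the external USB disk
run mysqldump --all-databases --result-file=/mnt/usb/mgr_stv_backup.sql

# Back on the production MGR server:
run rm /data/mysqldata/ibdata1 /data/mysqldata/ib_logfile0 /data/mysqldata/ib_logfile1
run service mysqld start                   # a fresh ibdata1 is created on start-up
run mount /dev/sdb1 /mnt/usb               # external disk now moved to the MGR server
run mysql -e "source /mnt/usb/mgr_stv_backup.sql"         # restore all databases
```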