Overnight, my lovely hosting company decided that it was time to move me to a new server on their own infrastructure. That’s fine; they’d sent me an email a few weeks earlier outlining the move and assuring me that nothing bad would happen.
“During the migration, your VPS will be temporarily shut down to avoid any data loss and then started once the migration has been completed.”
A message like that makes it too easy: basically, don’t worry. So, I didn’t. Anyway, the move happened and the VPS came back up… everything except the database.
After chasing my tail for a few hours, I’ve managed to bring it back up. So, the question is, what went wrong, and why?
Running the command # systemctl status mariadb.service gave me the following results:
[root@server x]# systemctl status mariadb.service
● mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2022-01-13 10:40:40 AEDT; 22s ago
  Process: 4515 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=1/FAILURE)
  Process: 4514 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 4479 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS)
 Main PID: 4514 (code=exited, status=0/SUCCESS)

Jan 13 10:40:39 server.x.com systemd: Starting MariaDB database server...
Jan 13 10:40:39 server.x.com mariadb-prepare-db-dir: Database MariaDB is probably initialized in /var/lib/mysql already, nothing is done.
Jan 13 10:40:39 server.x.com mariadb-prepare-db-dir: If this is not the case, make sure the /var/lib/mysql is empty before running mariadb-prepare-db-dir.
Jan 13 10:40:39 server.x.com mysqld_safe: 220113 10:40:39 mysqld_safe Logging to '/var/log/mariadb/mariadb.log'.
Jan 13 10:40:39 server.x.com mysqld_safe: 220113 10:40:39 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Jan 13 10:40:40 server.x.com systemd: mariadb.service: control process exited, code=exited status=1
Jan 13 10:40:40 server.x.com systemd: Failed to start MariaDB database server.
Jan 13 10:40:40 server.x.com systemd: Unit mariadb.service entered failed state.
Jan 13 10:40:40 server.x.com systemd: mariadb.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
This wasn’t very helpful, as it didn’t tell me why MySQL (MariaDB) just would not start.
So, I backed up my database tables first. As it turns out, my database is massive, and it’s stored in the InnoDB format. That’s not something I’m totally across, but I may change that in the future.
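For context, the backups themselves were taken as SQL dumps. A minimal sketch of that step (the database name and backup path here are hypothetical, not from my actual setup):

```shell
# Hypothetical database name and backup location.
DB_NAME="mydb"
BACKUP_DIR="/tmp/db-backups"
STAMP="$(date +%Y%m%d-%H%M%S)"
OUTFILE="${BACKUP_DIR}/${DB_NAME}-${STAMP}.sql.gz"

mkdir -p "$BACKUP_DIR"

# --single-transaction takes a consistent snapshot of InnoDB tables
# without locking them for the duration of the dump.
mysqldump --single-transaction --routines -u root -p "$DB_NAME" | gzip > "$OUTFILE"
```

Compressing on the fly matters when the dataset runs into the hundreds of gigabytes.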
Interestingly, I thought the VPS would have been a lot faster at its local copying than it actually was.
But, I continued to investigate.
Some people suggested uninstalling and starting again, removing the database files and restoring from backup. That wasn’t an option for me yet.
There were also suggestions of running a force recovery and so on, but I didn’t want to touch that yet either, as I wanted the data preserved.
I did, however, come across a fantastic description of the ibdata1 file and how it works: https://serverfault.com/a/487534/331406. It helped me learn more about how table data is actually stored in the DB, rather than just in each database directory under the /var/lib/mysql/* path.
As per that post:
The file ibdata1 is the system tablespace for the InnoDB infrastructure.
It contains several classes of information vital to InnoDB:
- Table Data Pages
- Table Index Pages
- Data Dictionary
- MVCC Control Data
  - Undo Space
  - Rollback Segments
- Double Write Buffer (Pages Written in the Background to avoid OS caching)
- Insert Buffer (Changes to Secondary Indexes)
Anyway, on to the bigger problem still, and how to solve it…
Here are the log entries from the mariadb.log file. As you can see, everything went according to plan. The host issued the shutdown command, and it went fine. About a minute and a half later, it came back up again and began the startup procedure.
Then, it all went wrong… and the server failed to start.
220113  0:39:59 [Note] /usr/libexec/mysqld: Normal shutdown
220113  0:39:59 [Note] Event Scheduler: Purging the queue. 0 events
220113  0:39:59  InnoDB: Starting shutdown...
220113  0:40:04  InnoDB: Shutdown completed; log sequence number 36329006249
220113  0:40:04 [Note] /usr/libexec/mysqld: Shutdown complete
220113 00:40:04 mysqld_safe mysqld from pid file /var/run/mariadb/mariadb.pid ended
220113 00:41:38 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
220113  0:41:38 [Note] /usr/libexec/mysqld (mysqld 5.5.68-MariaDB) starting as process 607 ...
220113  0:41:38 InnoDB: The InnoDB memory heap is disabled
220113  0:41:38 InnoDB: Mutexes and rw_locks use GCC atomic builtins
220113  0:41:38 InnoDB: Compressed tables use zlib 1.2.7
220113  0:41:38 InnoDB: Using Linux native AIO
220113  0:41:38 InnoDB: Initializing buffer pool, size = 128.0M
220113  0:41:38 InnoDB: Completed initialization of buffer pool
220113  0:41:38 InnoDB: highest supported file format is Barracuda.
220113  0:41:38 InnoDB: Waiting for the background threads to start
220113  0:41:39 Percona XtraDB (http://www.percona.com) 5.5.61-MariaDB-38.13 started; log sequence number 36329006249
220113  0:41:39 [Note] Plugin 'FEEDBACK' is disabled.
220113  0:41:39 [Note] Server socket created on IP: '0.0.0.0'.
220113  0:41:39 [Note] Event Scheduler: Loaded 0 events
220113  0:41:39 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.68-MariaDB'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
220113 00:51:09 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
220113  0:51:11 [Note] /usr/libexec/mysqld (mysqld 5.5.68-MariaDB) starting as process 663 ...
220113  0:51:11 InnoDB: The InnoDB memory heap is disabled
220113  0:51:11 InnoDB: Mutexes and rw_locks use GCC atomic builtins
220113  0:51:11 InnoDB: Compressed tables use zlib 1.2.7
220113  0:51:11 InnoDB: Using Linux native AIO
220113  0:51:12 InnoDB: Initializing buffer pool, size = 128.0M
220113  0:51:12 InnoDB: Completed initialization of buffer pool
220113  0:51:12 InnoDB: highest supported file format is Barracuda.
220113  0:51:12  InnoDB: Starting crash recovery from checkpoint LSN=36329088581
InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
220113  0:51:12  InnoDB: Error: page 212993 log sequence number 36329093879
InnoDB: is in the future! Current system log sequence number 36329089957.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: for more information.
The last section of the log shows the real problem: a page carried a log sequence number ahead of the system’s current one, meaning the data files and the InnoDB log files were out of sync.
This is what gave me the solution I needed:
There is inconsistency between the data files and the ib_logfile[0|1]-s. For me this can happen if I didn’t run --apply-log when using innobackupex. Try shut down the server, move the ib_logfiles into a different directory, and restart process. New logfiles will be created. (If it fails again, you can move back the original files)
So I moved the two log files (ib_logfile0 and ib_logfile1) to a temporary location, restarted the service, and this time, it worked.
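For the record, the sequence looked roughly like this. A sketch: it assumes the default datadir and uses /tmp as the temporary parking spot, so adjust paths to taste.

```shell
# Default MariaDB datadir; adjust if yours differs.
DATADIR="/var/lib/mysql"
SAFE_DIR="/tmp/ib_logfile-backup"

# Stop the server before touching the redo logs.
systemctl stop mariadb 2>/dev/null || true

mkdir -p "$SAFE_DIR"

# Move the InnoDB redo logs aside; MariaDB recreates them at startup.
# If startup fails again, they can be moved back.
for f in ib_logfile0 ib_logfile1; do
  if [ -e "$DATADIR/$f" ]; then
    mv "$DATADIR/$f" "$SAFE_DIR/"
  fi
done

systemctl start mariadb 2>/dev/null || true
```

Moving rather than deleting is the important part: the originals stay recoverable.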
Update: 2022-01-13 14:44
After the log file fix, the server started up. But it only stayed up momentarily, and the problems were back. Instead, I have resorted to the InnoDB recovery modes: https://mariadb.com/kb/en/innodb-recovery-modes/
So, how do you fix it? It’s still taking me a while to make it work.
That page offered the following advice, which applied to me: if the server hangs on ‘InnoDB: Waiting for the background threads to start’, you can try adding an innodb_force_recovery setting to your my.cnf.
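As a sketch of what that looks like (the value here is illustrative; 1 is the safest starting point, values above 4 risk permanent data loss, and 6 leaves the server read-only):

```ini
# /etc/my.cnf (or a file under /etc/my.cnf.d/) - temporary, remove once recovery is done
[mysqld]
innodb_force_recovery = 1
```

Increase the value one step at a time only if the server still won’t start.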
I presently have innodb_force_recovery = 6 and the purge command above in place, and the SQL server has restarted successfully. However, the website hasn’t come back up yet, probably because the MySQL server is still repairing itself, as shown by the ‘top’ command.
Strangely enough, after copying all of the backed-up data to a local machine, I can confirm that the data is still safe.
I’m going to let this process run and see how long it takes. So far, it’s a long time, as my InnoDB data is sitting at about 758GB.
Further to this update:
In short, these different recovery modes sadly didn’t work for me and let me continue without doing a full restore. However, I was still able to recover all of the data from before the database went down.
I decided to bite the bullet, kill the entire /var/lib/mysql path, and then rebuild and restore it.
After running the command below, I was able to get the DB to a state where it would start properly.
mysql_install_db --user=mysql --basedir=/usr --datadir=/var/lib/mysql
I also had to update the root password as well with the following:
'/usr/bin/mysqladmin' -u root password 'new-password'
I was then able to recreate the missing databases and load the dumps back in through the command line.
mysql -u username -p database_name < file.sql
I am now in the process of restoring the data from the recovered gzipped SQL dump files. A long way around, but so far so good.
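Streaming the gzipped dumps straight into the server avoids unpacking them to disk first. A sketch, with hypothetical database and file names:

```shell
# Hypothetical database and dump names.
DB_NAME="mydb"
DUMP="/tmp/db-backups/mydb.sql.gz"

# Recreate the empty database, then pipe the dump in without
# writing an uncompressed copy to disk.
mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS \`${DB_NAME}\`"
gunzip -c "$DUMP" | mysql -u root -p "$DB_NAME"
```

With a dataset this size, not needing double the disk space for an uncompressed copy is a real saving.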
After letting it run all night, the SQL import timed out, and it got halfway through the dataset.
But, I’ve been able to at least get the website functioning again, and will load historical data next.
The data is loading again, and now I have some values to gauge how well it’s going. It’s pretty slow, but I’m assuming that could be because it’s creating index values as it goes, though it’s only a guess.
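If index maintenance really is the bottleneck, one common trick (an assumption on my part, not something I’ve done yet) is to relax the session’s consistency checks while loading a trusted dump, then commit once at the end:

```shell
# Hypothetical dump path; only do this with a dump you trust.
DUMP="/tmp/db-backups/mydb.sql.gz"
PREAMBLE='SET foreign_key_checks=0;
SET unique_checks=0;
SET autocommit=0;'

# Prepend the relaxed settings, stream the dump through, and commit
# once at the end instead of after every statement.
{
  printf '%s\n' "$PREAMBLE"
  gunzip -c "$DUMP"
  echo "COMMIT;"
} | mysql -u root -p mydb
```

The checks come back automatically when the session ends, since they were only changed at session scope.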
So, in the end, I managed to get one DB back up successfully.
I will try for the second major one soon now too.