What checks should I perform before restarting the NuoDB database?
NuoDB is designed to keep running through single failures, so restarting a database should be a last resort, reserved for critical problems.
Checks to perform before bouncing a database:
Check that all nodes are up and connected:
- Run nuocmd show domain
- If a node is listed as UNREACHABLE, fix that problem first, if at all possible.
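The check above can be scripted. This is a minimal sketch, assuming only that an unreachable node shows the literal string UNREACHABLE somewhere on its line in the `nuocmd show domain` output. The helper name `has_unreachable` is hypothetical and not part of NuoDB.

```shell
#!/bin/sh
# Hypothetical helper: reads `nuocmd show domain` output on stdin and prints
# every line that mentions UNREACHABLE. Exit status is 0 if at least one such
# line exists, non-zero otherwise (this is grep's own exit status).
has_unreachable() {
    grep 'UNREACHABLE'
}

# Run against the live domain only when nuocmd is on the PATH.
if command -v nuocmd >/dev/null 2>&1; then
    if nuocmd show domain | has_unreachable; then
        echo "Unreachable nodes found; fix them before restarting." >&2
    fi
fi
```

Keeping the filter in a function that reads stdin means it can also be run against saved `nuocmd show domain` output.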
Check that there are no hung NuoDB processes:
- Run ps -ef | grep nuo
Check that any inconsistencies in the domain state are addressed:
- For example, any processes that are listed in the domain state but do not appear in the output of ps -ef | grep nuo
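One way to spot such inconsistencies is to compare the process IDs reported in the domain state with the processes actually running on the host. The sketch below assumes the `nuocmd show domain` output contains tokens of the form `pid = N`; check this against the output of your NuoDB version. The helper names `pid_alive` and `missing_pids` are hypothetical. PIDs belonging to processes on other hosts will also be reported as missing, so run this on each host and filter the input to that host's lines first.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a process with the given PID exists on
# this host. Tries `ps -p` first and falls back to /proc (Linux).
pid_alive() {
    ps -p "$1" >/dev/null 2>&1 || [ -d "/proc/$1" ]
}

# Hypothetical helper: reads domain state text on stdin, extracts every
# "pid = N" value (ASSUMED output format), and prints the PIDs that have
# no matching local process.
missing_pids() {
    sed -n 's/.*pid = \([0-9][0-9]*\).*/\1/p' | while read -r pid; do
        pid_alive "$pid" || echo "$pid"
    done
}

# Run against the live domain only when nuocmd is on the PATH.
if command -v nuocmd >/dev/null 2>&1; then
    nuocmd show domain | missing_pids
fi
```

Any PID printed is listed in the domain state but has no process on this host, and should be investigated before the restart.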
Check for any reported errors which would stop the database from starting correctly:
- For example, are there any errors reported by SMs regarding the archives?
Once the database is down, check the archive for consistency:
- Use the nuoarchive command with the --repair option. See the NuoDB documentation for nuoarchive for more information.
- This can be done on an offline copy of the archive if the database needs to be brought up immediately.