How to solve the "Unable to accept reconnecting process EngineNode" error in nuoadmin.log?

{question}When deploying NuoDB on Kubernetes with Helm Charts, I found that:

  • The SM/TE process is not started at all, although its pod/container is still running.
  • The running pod/container is not associated with any process visible in nuocmd show domain, and the admin process refuses to accept reconnections from it.
  • The following error message appears repeatedly in nuoadmin.log:
    2021-07-07T14:29:56.679+0000 WARN com.nuodb.host.ProcessManager tagServerExecutor31-thread-30 Unable to accept reconnecting process EngineNode{databaseName=xxxx, address=xxxx, port=48006, type=SM, pid=46, state=RUNNING, nodeId=3, version=4.1.2-5, ipAddress=xxxx, hostname=xxx}: Expected to find durable process with startId=xxx

{question}
{answer}
The root cause:

  • The SM pod associated with the process startId=xxx has been forcibly evicted from the domain state by the admin process.
  • The process is no longer part of the domain because it has been disconnected from the admin process, but it has not actually terminated.
  • Because the SM pod is still running, it prevents itself and other SM pods from being rescheduled.
  • From the admin process's perspective, there is no running SM for the TE to join through, and it cannot schedule a new SM pod while the unknown SM pod is still running.

Workaround:

  • Set evictUnknownProcesses: true in admin/values.yaml, the values file that the NuoDB Helm Charts use to deploy the admin process.
  • This forces any process that is unknown from the admin process's perspective to be evicted and terminated, which allows other SM/TE pods to be scheduled and started.{answer}

Note: Starting with NuoDB Helm Charts v3.3, evictUnknownProcesses is set to true by default.
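
To confirm which processes are affected before changing the setting, the warning can be picked out of nuoadmin.log with a short script. The following is a minimal sketch in Python, not part of NuoDB tooling. It assumes the log line layout matches the single sample shown in the question, and the function name find_rejected_reconnects is hypothetical.

```python
import re

# Matches the WARN line quoted in the question. The field layout is assumed
# from that one sample and may differ in other NuoDB versions.
PATTERN = re.compile(
    r"Unable to accept reconnecting process EngineNode\{(?P<fields>[^}]*)\}"
    r": Expected to find durable process with startId=(?P<start_id>\S+)"
)


def find_rejected_reconnects(lines):
    """Yield one dict per matching warning: the EngineNode fields plus startId."""
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        # Split "key=value, key=value" pairs from inside EngineNode{...}.
        fields = dict(
            item.split("=", 1)
            for item in match.group("fields").split(", ")
            if "=" in item
        )
        fields["startId"] = match.group("start_id")
        yield fields
```

Passing an open log file, for example find_rejected_reconnects(open("nuoadmin.log")), yields the type, hostname, pid and startId of each rejected process as strings, which can then be compared with the output of nuocmd show domain.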
