Oracle on Windows: ASM instance terminated by LMON / ORA-27300 IPC_TCPConnectCheck failed with status -1

Recently I had an issue with a two-node Grid Infrastructure cluster on Windows Server 2012 R2. After a cluster restart caused by the infrastructure (irresponsible SAN hardware patching :) ), everything was running on Node 2, and Node 1 could no longer join the cluster.

There was no easy solution: at the CSSD level there was no issue (network and disk heartbeat worked, according to ocssd.log). It turned out that the ASM instance on Node 1 started, but its LMON could not communicate with the already-running ASM instance on Node 2: Instance terminated by LMON. There were no really telling ORA error messages in its alert log.

But on the working Node 2, the ASM alert log showed:
ORA-27300: OS system dependent operation:IPC_TCPConnectCheck failed with status: -1

Guessing from the module name, I started thinking about the network, and yes, somebody had activated the Windows Firewall on Node 1. It is strange that the errors did not show up on the node causing the problem, but I was glad to have found the culprit.

How to check the Windows Firewall state:
netsh advfirewall show currentprofile
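
If you want to run this check on every node from a script instead of by eye, here is a minimal sketch in Python. It assumes an English-language Windows where the netsh output contains a line of the form "State ON" or "State OFF"; the function names parse_firewall_state and current_profile_firewall_state are my own, not part of any Oracle or Microsoft tooling.

```python
import re
import subprocess
from typing import Optional


def parse_firewall_state(netsh_output: str) -> Optional[bool]:
    """Return True if the profile state is ON, False if OFF, None if no State line is found.

    Assumption: English output with a line such as 'State    ON'.
    """
    for line in netsh_output.splitlines():
        match = re.match(r"^\s*State\s+(ON|OFF)\s*$", line, re.IGNORECASE)
        if match:
            return match.group(1).upper() == "ON"
    return None


def current_profile_firewall_state() -> Optional[bool]:
    """Run netsh on the local Windows machine and parse its output."""
    result = subprocess.run(
        ["netsh", "advfirewall", "show", "currentprofile"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_firewall_state(result.stdout)
```

Calling current_profile_firewall_state() on each cluster node and comparing the results would have shown the odd one out right away. On a localized Windows the "State" label may be translated, in which case the function returns None rather than guessing.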

Syntax that will always help with annoying firewalls, but has to be cleared with your security team first:
netsh advfirewall set currentprofile state off

Lessons learned:

  1. People tend to introduce new problems while fixing others (in this case, messing with the Windows Firewall configuration while looking for a SAN problem), so DBAs, adapt your thinking to that.
  2. Obvious, but easy to forget: When diagnosing RAC / Clusterware issues, look into the logs on all nodes (or build a central ADR).
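
To make the second lesson a habit, a small script can pull the ORA- lines from the alert logs of all nodes into one view. The following is only a sketch: the helper names ora_lines and scan_alert_logs are hypothetical, and the log paths you pass in depend entirely on your own ADR layout (for example, copies or shares of each node's ASM alert log).

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List

# Matches Oracle error codes such as ORA-27300.
ORA_PATTERN = re.compile(r"\bORA-\d{5}\b")


def ora_lines(lines: Iterable[str]) -> List[str]:
    """Return every line that contains an ORA-nnnnn error code."""
    return [line.rstrip("\r\n") for line in lines if ORA_PATTERN.search(line)]


def scan_alert_logs(logs: Dict[str, Path]) -> Dict[str, List[str]]:
    """Map each node name to the ORA- lines found in its alert log.

    'logs' maps a node label of your choice to the path of that node's
    alert log; both are supplied by the caller.
    """
    hits: Dict[str, List[str]] = {}
    for node, path in logs.items():
        with open(path, encoding="utf-8", errors="replace") as handle:
            hits[node] = ora_lines(handle)
    return hits
```

In my case, such a scan would have returned nothing useful for Node 1, but the ORA-27300 line for Node 2, which is exactly the hint that was easy to miss.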

As usual, take care and think about the (other) box
Martin