Oracle Clusterware 11.2: ASM crashes at startup

Recently, one ASM instance in a customer's two-node Oracle Clusterware crashed at every startup.

More facts:

  • It was not possible to start it manually, either.
  • The CSSD was running.
  • For obvious reasons, CRSD did not start.
  • The other ASM instance in the cluster recognized CLUSTER RECONFIGURATION for a short period of time.

The ASM alert log looked like this:

Sun Nov 13 13:44:08 2011
 MMNL started with pid=21, OS id=7783
 lmon registered with NM - instance number 2 (internal mem no 1)
 Sun Nov 13 13:46:05 2011
 System state dump requested by (instance=2, osid=7684 (PMON)),
         summary=[abnormal instance termination].
 System State dumped to trace file /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_diag_7706.trc
 Sun Nov 13 13:46:05 2011
 PMON (ospid: 7684): terminating the instance due to error 481
 Dumping diagnostic data in directory=[cdmp_20111113134605], requested by (instance=2, osid=7684 (PMON)),
         summary=[abnormal instance termination].
 Instance terminated by PMON, pid = 7684

Strange problem. Checking device permissions, running read/write tests, and rebooting the cluster in a downtime window turned up nothing.

To make a long story short: the NTP daemon was running but was not actually synchronising its time. Because an NTP daemon was present, CTSS stayed in observer mode and did not correct the clocks itself, so the server times drifted apart. Fixing NTP fixed the cluster.
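A running NTP daemon is not the same as a synchronised one, so it is worth checking the peer list and not just the process. The following is a minimal sketch in Python, assuming the usual `ntpq -p` layout in which the peer currently selected for synchronisation is marked with a leading `*`. The function name and both sample outputs are made up for illustration; they are not captured from the customer system.

```python
def ntp_is_synchronised(ntpq_peers_output: str) -> bool:
    """Return True if `ntpq -p` style output shows a selected system peer.

    Assumption: ntpq prefixes the peer the daemon is currently
    synchronised to with a '*' tally character. If no line starts
    with '*', the daemon may be running but has no sync source.
    """
    for line in ntpq_peers_output.splitlines():
        if line.startswith("*"):
            return True
    return False


# Illustrative samples only (hypothetical host names and values).
SAMPLE_UNSYNCED = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ntp1.example.com .INIT.          16 u    -   64    0    0.000    0.000   0.000
"""

SAMPLE_SYNCED = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp1.example.com 192.0.2.10       2 u   33   64  377    0.412    0.105   0.021
"""

if __name__ == "__main__":
    print(ntp_is_synchronised(SAMPLE_UNSYNCED))
    print(ntp_is_synchronised(SAMPLE_SYNCED))
```

In practice the text would come from running `ntpq -p` on each node. On the Clusterware side, `crsctl check ctss` should report whether CTSS is in observer or active mode.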

Nota bene
Martin



