Archive for January, 2013

Oracle Grid Infrastructure / ASM – ERROR: failed to update diskgroup resource ora.DATA.dg

Recently, while I was setting up a database on a Grid Infrastructure installation, dbca failed with the message that it could not establish the dependency between the database and the Grid Infrastructure resource of the ASM disk group DATA, which I was using for my data files. That resource would have been named “ora.DATA.dg”. (ERROR: failed to establish dependency between database MYDB and diskgroup resource ora.DATA.dg)

The Problem

A look at the output of

crsctl stat res -t

showed me that there was no “ora.DATA.dg” resource, so no wonder dbca failed. But a SELECT on v$asm_diskgroup confirmed that the diskgroup WAS there, and it was even mounted!
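The comparison above (disk groups known to ASM versus disk group resources known to the clusterware) can be sketched as a small shell helper. This is only an illustration: the function name missing_dg_resources is made up for this post, and it assumes the disk group names and the resource names have already been collected into newline-separated lists.

```shell
#!/bin/sh
# missing_dg_resources DISKGROUPS RESOURCES   (hypothetical helper, not an Oracle tool)
#   DISKGROUPS: newline-separated disk group names, e.g. from v$asm_diskgroup
#   RESOURCES:  newline-separated resource names, e.g. the NAME= lines of "crsctl stat res"
# Prints the expected resource name ora.<DG>.dg for every disk group without one.
missing_dg_resources() {
    printf '%s\n' "$1" | while IFS= read -r dg; do
        [ -n "$dg" ] || continue                      # skip empty lines
        if ! printf '%s\n' "$2" | grep -qxF "ora.${dg}.dg"; then
            printf '%s\n' "ora.${dg}.dg"
        fi
    done
}

# Possible usage on a cluster node (commented out, needs a running GI stack):
# dgs=$(printf 'set heading off feedback off pagesize 0\nselect name from v$asm_diskgroup;\n' | sqlplus -s / as sysasm)
# res=$(crsctl stat res | sed -n 's/^NAME=//p')
# missing_dg_resources "$dgs" "$res"
```

In my situation such a check would have printed ora.DATA.dg, because the disk group existed in ASM while the clusterware had no matching resource.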

Exploring possibilities

Okay, whatever might have gone wrong when the diskgroup was created, let’s just create the resource, and off we go. But Oracle refused to cooperate. For a diskgroup, SRVCTL only knows the operations START, STOP, STATUS, ENABLE, DISABLE and REMOVE. I was not able to find anything in the official documentation that describes definitively how a resource for a diskgroup is created. All sources pointed to something automagic. When testing with GI and ASM on my test cluster, it became clear that the resource ora.DATA.dg is added to the resource list when the disk group DATA is mounted by ASM for the very first time. (Some folks on the net claim that this happens when the RDBMS uses the DG for the first time. They are mistaken, and are maybe confused by the fact that the DB access triggers ASM to mount the diskgroup.) But knowing this did not help at first, because on my new production box this DID NOT happen, and nobody seemed to know why.

This was my manual mount command:

(+ASM1)$ sqlplus / as sysasm
<...>
SQL> alter diskgroup DATA mount;
Diskgroup altered.
SQL>

No error, but as expected, no such disk group resource showed up in crsctl either. Now let’s look into the alert log of my first ASM instance (alert_+ASM1.log):

(+ASM1)$ tail -1000f $ORACLE_BASE/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
<...>
NOTE: diskgroup resource ora.DATA.dg is online
ERROR: failed to update diskgroup resource ora.DATA.dg

Ah, here we go. But no trace file shed any light on this; in fact, a grep for ora.DATA.dg in my trace directory returned zero results. Still, this error message leaves two options open for consideration: a) ASM tried to switch the state of an existing resource and failed because the resource is not there, or b) ASM tried to create a resource and failed for an unknown reason. Option a) just leads back to the original question of why the resource is missing, so what about b)?
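The alert log check can be narrowed down to the messages about one disk group resource. The following is a minimal sketch; the function names are invented for illustration, and it assumes the message wording seen in my alert log ("diskgroup resource ora.<DG>.dg"), which may differ between versions.

```shell
#!/bin/sh
# dg_resource_events LOGFILE DG   (hypothetical helper)
# Prints all alert log lines that mention the resource of disk group DG.
dg_resource_events() {
    grep -F "diskgroup resource ora.${2}.dg" "$1"
}

# dg_resource_errors LOGFILE DG   (hypothetical helper)
# Prints only the ERROR lines among those messages.
dg_resource_errors() {
    dg_resource_events "$1" "$2" | grep '^ERROR:'
}

# Possible usage, with ALERT_LOG set to the path of the ASM alert log:
# dg_resource_errors "$ALERT_LOG" DATA
```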

A Theory

Facts:

  • All ASM resources depend on ASM, ASM depends on a listener, and a listener depends on the network. Oracle Grid Infrastructure usually checks for possibly failing dependencies BEFORE doing anything, to avoid causing needless outages.
  • All ASM processes run as the OS user oracle (in my setup).
  • My manually created admin VIP “admin.vip1.res” was created by root (to bring along enough permissions for ifconfig), and oracle has no rights there (not even READ).
    (+ASM1)$ crsctl getperm resource admin.vip1.res
    Name: admin.vip1.res
    owner:root:rwx,pgrp:root:r-x,other::r--,user:root:r-x

Maybe GI tries to look up something and can’t…?
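To test a theory like this, the owner of a resource can be pulled out of the ACL line that crsctl getperm prints and compared with the OS user that runs ASM. This is a sketch under assumptions: the function names are hypothetical, and the ACL format is taken from the output shown above. It only reports ownership and proves nothing about the cause of the error.

```shell
#!/bin/sh
# acl_owner ACL   (hypothetical helper)
# Prints the owner from an ACL string such as
#   owner:root:rwx,pgrp:root:r-x,other::r--,user:root:r-x
acl_owner() {
    printf '%s\n' "$1" | tr ',' '\n' |
        sed -n 's/^[[:space:]]*owner:\([^:]*\):.*/\1/p'
}

# owned_by_other EXPECTED_USER ACL   (hypothetical helper)
# Succeeds if the resource owner is NOT the expected OS user.
owned_by_other() {
    [ "$(acl_owner "$2")" != "$1" ]
}

# Possible usage on a cluster node (commented out, needs a running GI stack):
# acl=$(crsctl getperm resource admin.vip1.res | grep 'owner:')
# owned_by_other oracle "$acl" && echo "admin.vip1.res is not owned by oracle"
```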

By usn in Oracle