Archive for the 'Linux / Unix' Category

New German Linux Forum (forum.linux-talk.de)

Over the last few weeks, some folks have been busy building a new German Linux forum, “forum.linux-talk.de”, since its predecessor was systematically ruined by its commercial owners.

Jean (wdp) and Hendrik (Nilpferd) in particular invested a lot of time and money in building the new environment. As a result, the new forum is completely free of ads, and the content is quality-checked by a team of experienced Linux admins acting as moderators.

Please hang out there, and help us to (re)build a cool community.

Cheers
Martin Klier (Usn)



Oracle on Linux: How to hide your password when using a wrapper script

Sometimes, a DBA has to write a wrapper script that is called externally and wraps one of the Oracle-supplied commands that accept a password. A prominent and simple example is SQL*Plus (sqlplus).

The Problem

The process list shows all parameters of a command that’s currently executed.

wrapper1.sh

Accepts all connection information on the command line:

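#!/bin/bash
# $1 = user, $2 = password, $3 = connect identifier
cmdstring="sqlplus ${1}/${2}@${3}"
echo "Executed command: $cmdstring"
# Keep the process alive so it can be inspected with ps
sleep 999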

Called by:

$ ./wrapper1.sh system manager mydb
Executed command: sqlplus system/manager@mydb

But now, the password is visible in the process list:

$ ps aux | grep wrapper1.sh | grep -v grep
1000     20769  0.0  0.0  13808  1444 pts/1    S+   15:19   0:00 /bin/bash ./wrapper1.sh system manager mydb
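
One common way around this is to stop passing the password as an argument at all. The following is only a minimal sketch under that assumption: the function name run_sql_hidden and the SQLPLUS_CMD override are made up for illustration. sqlplus is started with /nolog and receives the CONNECT statement through a pipe, so the password never appears in any argument list.

```shell
#!/bin/bash
# Sketch: user and connect identifier come from the command line,
# the password is read from stdin.
# SQLPLUS_CMD is a hypothetical override so the function can be tried
# without an Oracle client installed.

run_sql_hidden() {
    local user="$1" db="$2" password
    # Read the password from stdin, not from "$@", so ps cannot show it.
    IFS= read -r password
    # printf is a shell builtin: the password is written into the pipe
    # and does not become an argument of a separate process.
    printf 'connect %s/%s@%s\nexit\n' "$user" "$password" "$db" \
        | ${SQLPLUS_CMD:-sqlplus -s /nolog}
}
```

A wrapper built around this function could be called as "./wrapper2.sh system mydb < password_file", where wrapper2.sh and password_file are again just placeholder names. The process list then shows only the user and the connect identifier.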


Oracle Clusterware root.sh issue: USM driver install actions failed (oracleoks.ko)

As I already said in my last post, “Can’t install ohasd service”, setting up Oracle Clusterware 11.2.0.4 on SuSE Linux Enterprise Server (SLES) 11 SP2 should work flawlessly, but sometimes it does not. :) This time, it was about the USM drivers.

USM driver install actions failed
/u01/app/grid/11.2.0/perl/bin/perl -I/u01/app/grid/11.2.0/perl/lib -I/u01/app/grid/11.2.0/crs/install /u01/app/grid/11.2.0/crs/install/rootcrs.pl execution failed

USM drivers are components (kernel object files, extension .ko) that enable ACFS. I don’t use ACFS on this system, but root.sh (in fact, rootcrs.pl) needs a proper directory structure matching the Linux kernel version. Again, the log file “$GRID_HOME/cfgtoollogs/crsconfig/rootcrs_<hostname>.log” was my friend: it revealed that the problem was related to loading oracleoks.ko. This file lives in the directory “$GRID_HOME/install/usm/Novell/SLES11/x86_64/<your-kernel-version>/default/bin”. The trouble is that good old SLES 11 SP2 has a kernel that was not foreseen by the Oracle folks who implemented this piece of software.
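
To see the mismatch, it helps to put the running kernel next to the kernel directories that the USM driver tree actually ships. A minimal sketch, assuming the directory layout quoted above. The function name list_usm_kernels is hypothetical:

```shell
#!/bin/sh
# Sketch: print the running kernel and the kernel version directories
# found below the USM driver tree of a given Grid home.
list_usm_kernels() {
    grid_home="$1"
    usm_dir="$grid_home/install/usm/Novell/SLES11/x86_64"
    echo "Running kernel: $(uname -r)"
    echo "Kernel directories below $usm_dir:"
    # One directory name per line
    ls -1 "$usm_dir"
}
```

Called as list_usm_kernels "$GRID_HOME", it only lists what is there. How the directory names relate to the output of uname -r has to be judged by eye.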




Oracle Clusterware root.sh fails: Can’t install ohasd service: Inappropriate ioctl for device crsconfig_lib.pm line 5427

Setting up Oracle Clusterware 11.2.0.4 on SuSE Linux Enterprise Server (SLES) 11 SP2 should work flawlessly, but sometimes it does not. :) It turned out that this would become a pair of blog entries; the second one is about “USM driver install actions failed (oracleoks.ko)”. But one step at a time. On Saturday morning, root.sh failed with the following error:

Failed to install ohasd startup script, error: Can’t install ohasd service: Inappropriate ioctl for device

Can’t install ohasd service: Inappropriate ioctl for device at /u01/app/grid/11.2.0/crs/install/crsconfig_lib.pm line 5427.

/u01/app/grid/11.2.0/perl/bin/perl -I/u01/app/grid/11.2.0/perl/lib -I/u01/app/grid/11.2.0/crs/install /u01/app/grid/11.2.0/crs/install/rootcrs.pl execution failed

There are several My Oracle Support (MOS) entries (bug notes and documents) about root.sh failing in crsconfig_lib.pm, but none for line 5427, and the line really matters! This script does a lot, and usually different things in different lines. :)

Whenever you deal with root.sh malfunctions, the rootcrs log file ($GRID_HOME/cfgtoollogs/crsconfig/rootcrs_<hostname>.log) is your best friend. It is written in a not-too-verbose style, and whenever rootcrs.pl invokes OS or third-party commands, it quotes their output in a useful way. Bravo Zulu to the Oracle scripters here.

In my particular case, the problem was related to Linux’s insserv command, which is used to integrate ohasd into the System V startup script structure. My IBM Storage Manager Agent (service SMagent) and Oracle’s Trace File Analyzer (service init.tfa) had a dependency loop (dumbass SMagent depends on $all, /*NO COMMENT*/). I happily removed the $all dependency, and off it went.
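
If you suspect a similar loop, a quick look at which init scripts request the $all facility in their LSB header narrows things down. A minimal sketch, assuming the scripts live in /etc/init.d and carry standard "Required-Start:" or "Should-Start:" header lines. The function name find_all_dependents is hypothetical:

```shell
#!/bin/sh
# Sketch: list init scripts whose LSB header asks for the $all facility.
find_all_dependents() {
    dir="${1:-/etc/init.d}"
    # -l prints matching file names only, -E enables the alternation;
    # errors for subdirectories are discarded.
    grep -lE '^#[[:space:]]*(Required|Should)-Start:.*\$all' "$dir"/* 2>/dev/null
}
```

Running find_all_dependents without arguments prints the candidates, one file per line. Whether removing the dependency is safe is a decision for each service.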

Good luck with your GI
Martin



Oracle on AIX: How to find out the process memory usage

Calculating memory usage on Unix is a tricky business, especially when complex software like Oracle Database uses shared memory segments such as the SGA and the code area.

One might be tempted to use the following construction to calculate the overall memory footprint of the Oracle processes running on a machine:

ps -elf |egrep " oracle* | ora_.*_* " | grep -v egrep \
| awk '{sum += $10} END {print sum/1024/1024}'

But that’s bad, since the sum is based on the SZ column of the “ps -elf” command. Unfortunately, SZ displays the full core image, but most of it is shared (remember the Oracle Code Area from the architecture diagram). So we greatly overestimate the memory footprint this way.

[Figure: AIX memory calculation]

When you use “ps v” for a given PID, you get more detail: SIZE is the non-shared data part, TSIZE the shared text component of the image. Added together, they roughly equal SZ.
(All units are in KB.)

I tried to find a solution. This is the original, overestimated version:

# ps -elf |egrep " oracle* | ora_.*_* " | grep -v egrep \
| awk '{sum += $10} END {print sum/1024/1024}'
19.0745
(GB)

This one extracts the PIDs from “ps -ef”, runs “ps v” for each of them and adds up the SIZE values. The greps might be a bit ugly, but it works for Oracle. :)

# for X in $(ps -ef | egrep " oracle* | ora_.*_*  " | grep -v egrep | awk '{print $2}'); \
do ps v $X | grep ora | awk '{print $6}'; done \
| awk '{sizesum += $1} END {print sizesum/1024/1024}'
1.57206
(GB)

I ran both commands on the same prod database system within the same second, so the difference should be realistic.
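
The same idea can be split into two small helpers, so that the column is found by its header name instead of a fixed position. This is only a sketch: the function names are made up, and it assumes that the header line of “ps v” contains a column called SIZE with values in KB.

```shell
#!/bin/sh
# Sketch: helper names are hypothetical.

# Print the SIZE value (KB) from "ps v <pid>" output read on stdin.
# The column is located via the header line.
size_from_ps_v() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "SIZE") col = i; next }
         col     { print $col }'
}

# Sum KB values (one per line) and print the total in GB.
sum_kb_as_gb() {
    awk '{ sum += $1 } END { printf "%.5f\n", sum / 1024 / 1024 }'
}
```

Used in a loop over the Oracle PIDs, this becomes: for X in <pids>; do ps v $X | size_from_ps_v; done | sum_kb_as_gb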

Stay safe
Martin

Thanks to Maxym’s old blog entry for great impressions!

Additional reading:
https://www.ibm.com/developerworks/community/blogs/aixpert/entry/aix_memory_usage_or_who_is_using_the_memory_and_how20?lang=en



Speaking at COLLABORATE 14: “YOUR machine and MY database – a performing relationship!?”

I’m excited to announce that IOUG accepted my talk

“YOUR machine and MY database – a performing relationship!?”

for COLLABORATE 14 in Las Vegas.


I’d love to see you there – for tech talk, gossip and meeting old and new friends!

Abstract:

Databases affect machines, machines affect databases. Optimizing one is pointless without knowing the other. System administrators and database administrators will not necessarily have the same opinion – often because they know little about the opposite’s needs. This lecture was made to promote understanding – showing how the database can stress the server, and how the server can limit the database. And why two admins sometimes don’t speak the same language, not even with a developer as an interpreter.

  • Recall the different needs of different technical layers underneath a database system.
  • Understand the technical collaboration of hardware, operating system and database.
  • Plot ways to avoid collisions, competition and concurrency.
  • Promote collaboration!

Date, time and location:

Thu, Apr 10, 2014
01:00 p.m. – 02:00 p.m.

Level 3, Lido 3003

The Venetian and Sands Expo Center
201 Sands Ave
Las Vegas, NV 89169
USA

Presentation and papers

2014_141_Klier_odp_v1
2014_141_Klier_v1_doc



Martin Klier now on twitter

After ignoring the little bird telling things for quite a while, I decided to join the tweeters. Twitter might bring more color into my daily reading. :)

If you feel like it, just follow me: @MartinKlierDBA



Oracle: RHEL6 and Oracle Enterprise Linux 6 certified for Database 11.2.0.3

Hi Linux-DBAs,
Red Hat Enterprise Linux 6 and Oracle Enterprise Linux 6 are now certified for Oracle Database 11.2.0.3.

Please see this link:
https://support.oracle.com/CSP/main/article?cmd=show&id=1441282.1&type=NOT

Best regards
Martin



DOAG Conference 2011 – Impressions and Look-at’s

Once again this year, the German Oracle Users Group is holding its annual conference and exhibition in Nuremberg (DOAG Konferenz und Ausstellung 2011, Nürnberg). Being there is nearly a must for Oracle guys in German-speaking countries.

As usual, here is my unordered, incomplete and ad-hoc list of things I wrote down during the talks I attended and want to take a closer look at over the next year.

Day 1 (Tue 15.11.2011)

  • AVG_ROW_LENGTH of a table vs. Blocksize
  • CHAIN_CNT
  • analyze table X validate structure cascade
  • Linux: Transcendent Memory
  • Linux: CleanCache and zcache
  • Linux: Cgroups
  • Linux: Transparent Huge Pages (wow!)
  • Linux: DTrace
  • Linux proprietary: Ksplice
  • View: registry$history for the REAL version number
  • Rolling Upgradable patches mean minimal downtime on Single Instance DBs (in combination with Out-of-Place Upgrade)
  • Bug 10187168 in PSU 11.2.0.2.2 (_cursor_features_enabled=1026)
  • Character set conversions: CSSCAN and DMU utilities
  • AIX patch following note 1246995 (Memory Footprint)




Oracle Clusterware 11.2: ASM crashes at startup

Recently, a customer’s Oracle Clusterware (2 nodes) crashed one ASM instance at every startup.

More Facts:

  • It was not possible to start it manually either.
  • The CSSD was running.
  • For obvious reasons, CRSD did not start.
  • The other ASM instance in the cluster recognized CLUSTER RECONFIGURATION for a short period of time.

The ASM alert log looked like this:

Sun Nov 13 13:44:08 2011
MMNL started with pid=21, OS id=7783
lmon registered with NM - instance number 2 (internal mem no 1)
Sun Nov 13 13:46:05 2011
System state dump requested by (instance=2, osid=7684 (PMON)),
        summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_diag_7706.trc
Sun Nov 13 13:46:05 2011
PMON (ospid: 7684): terminating the instance due to error 481
Dumping diagnostic data in directory=[cdmp_20111113134605], requested by (instance=2, osid=7684 (PMON)),
        summary=[abnormal instance termination].
Instance terminated by PMON, pid = 7684

Strange problem. Checking device permissions, running read/write tests, rebooting the cluster in a downtime window: nothing helped.

To make a long story short: the NTP daemon was running, but it did not get its time synchronisation. Thus, CTSS was in observer mode, and the server clocks started drifting apart. Fixing NTP fixed the cluster.
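
A running ntpd is not the same as a synchronised ntpd, so it is worth checking for a selected peer. A minimal sketch, assuming the classic ntpd with its ntpq tool, which marks the peer it is synchronised to with a leading “*” in the output of “ntpq -p”. The function name ntp_has_sync_peer is made up:

```shell
#!/bin/sh
# Sketch: succeeds if "ntpq -p" output on stdin contains a selected
# sync peer (a line starting with "*"), fails otherwise.
ntp_has_sync_peer() {
    grep -q '^\*'
}
```

On each cluster node, something like "ntpq -pn | ntp_has_sync_peer || echo 'ntpd is running, but not synchronised'" would have pointed to the problem. On the Clusterware side, "crsctl check ctss" reports whether CTSS is in observer or active mode.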

Nota bene
Martin



