Investigating database crashes

Steps to find the root cause of database crashes

To investigate these issues, you need to be able to distinguish a database
crash from an application crash.

Determine if the database instance has crashed:

To determine whether the database instance has crashed, examine the entries logged at various probe points in the administration notification log. In this file, you may see an entry like the following:

2005-01-29-03.13.07.166360 Instance:db2inst1 Node:000
PID:1310914(db2agent (BIDB) 0) TID:1 Appid:*N0.db2inst2.050128190001
oper system services sqloEDUCodeTrapHandler Probe:10 Database:SAMPLE
ADM0503C An unexpected internal processing error has occurred. ALL DB2
PROCESSES ASSOCIATED WITH THIS INSTANCE HAVE BEEN SHUTDOWN. Diagnostic
information has been recorded.
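
The check described above can be automated. The following is a minimal Python sketch, assuming the notification log is plain text with entries separated by blank lines. The function name find_crash_entries and the second sample entry are hypothetical and exist only for illustration.

```python
import re

# Message ID that the log excerpt above associates with an instance crash.
CRASH_MESSAGE_ID = "ADM0503C"

def find_crash_entries(log_text):
    """Return the blank-line-separated log entries that mention ADM0503C."""
    entries = re.split(r"\n\s*\n", log_text.strip())
    return [entry for entry in entries if CRASH_MESSAGE_ID in entry]

sample_log = """\
2005-01-29-03.13.07.166360 Instance:db2inst1 Node:000
PID:1310914(db2agent (BIDB) 0) TID:1 Appid:*N0.db2inst2.050128190001
oper system services sqloEDUCodeTrapHandler Probe:10 Database:SAMPLE
ADM0503C An unexpected internal processing error has occurred. ALL DB2
PROCESSES ASSOCIATED WITH THIS INSTANCE HAVE BEEN SHUTDOWN. Diagnostic
information has been recorded.

2005-01-29-03.14.00.000000 Instance:db2inst1 Node:000
A hypothetical unrelated entry, added only for illustration.
"""

crash_entries = find_crash_entries(sample_log)
print(len(crash_entries))
```

In practice, sample_log would be replaced by the contents of the instance's notification log file.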


The db2diag.log file shows additional information, as follows:

2005-01-20-11.15.04.507340-360 E5806002A512 LEVEL: Severe
PID : 282632 TID : 1 PROC : db2agntp 0
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, oper system services, sqloEDUCodeTrapHandler,
probe:10
MESSAGE : ADM0503C An unexpected internal processing error has
occurred. ALL DB2 PROCESSES ASSOCIATED WITH THIS INSTANCE HAVE BEEN
SHUTDOWN. Diagnostic information has been recorded. Contact IBM Support for
further assistance.

2005-01-20-11.15.04.508711-360 E5806515A645 LEVEL: Severe
PID : 282632 TID : 1 PROC : db2agntp 0
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, oper system services, sqloEDUCodeTrapHandler,
probe:20
DATA #1 : Signal Number Recieved, 4 bytes
11

DATA #2 : Siginfo, 64 bytes
0x0FFFFFFFFFFFE050 : 0000 000B 0000 0000 0000 0032 0000 0000 ...........2....
0x0FFFFFFFFFFFE060 : 0000 0000 0000 0000 0000 0001 1170 A370 .............p.p
0x0FFFFFFFFFFFE070 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0FFFFFFFFFFFE080 : 0000 0000 0000 0000 0000 0000 0000 0000 ................


In this example, the second db2diag.log entry shows that the function
sqloEDUCodeTrapHandler reported signal number 11.
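
Reading the signal number out of such an entry can be scripted. The sketch below assumes the number sits on the line immediately after the "Signal Number" data header, as it does in the excerpt above. The helper name signal_number is hypothetical.

```python
import re

# The probe:20 entry from the db2diag.log excerpt, without the Siginfo dump.
diag_entry = """\
2005-01-20-11.15.04.508711-360 E5806515A645 LEVEL: Severe
PID : 282632 TID : 1 PROC : db2agntp 0
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, oper system services, sqloEDUCodeTrapHandler,
probe:20
DATA #1 : Signal Number Recieved, 4 bytes
11
"""

def signal_number(entry):
    """Return the integer on the line after the 'Signal Number' header, or None."""
    match = re.search(r"Signal Number \w+, \d+ bytes\s*\n\s*(\d+)", entry)
    return int(match.group(1)) if match else None

print(signal_number(diag_entry))
```

The pattern matches the header text exactly as it appears in the log, including its spelling of "Recieved".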

This means that the DB2 signal handler caught signal 11. To identify the signal, look it up in the signal.h header file, which on UNIX platforms is usually located in /usr/include/sys.

In this example, the header file shows that signal 11 is a segmentation violation (SIGSEGV):

Extract of the signal.h header file

...
#define SIGBUS 10 /* (*) bus error (specification exception) */
#define SIGSEGV 11 /* (*) segmentation violation */
#define SIGSYS 12 /* (*) bad argument to system call */
...
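
As an alternative to reading the header file by hand, the lookup can be done with Python's standard signal module. This is a sketch under one important assumption: it must run on the same operating system as the DB2 server, because signal numbers can differ between platforms. The helper name describe_signal is hypothetical.

```python
import signal

def describe_signal(number):
    """Return the symbolic name for a signal number, or None if undefined here."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # This platform does not define a signal with that number.
        return None

print(describe_signal(11))
```

The result reflects the signal table of the machine running the script, so it is only meaningful for a log produced on the same platform.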


This is the first indication that the database has indeed crashed because of a
segmentation violation and that the DB2 signal handler caught the signal.


The next step is to determine the process ID (PID) of the process that crashed.


Return to the db2diag.log file to find the abnormally terminated process:

2005-01-20-11.15.04.558786-360 I5807161A433 LEVEL: Severe
PID : 2027586 TID : 1 PROC : db2gds 0
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, oper system services, sqloEDUSIGCHLDHandler, probe:50
DATA #1 : String, 160 bytes
Detected the death of an EDU with process id 282632
The signal number that terminated this process was 11
Look for trap files (t282632.*) in the dump directory


The function sqloEDUSIGCHLDHandler at probe 50 provides the process ID of the
problematic EDU and the name pattern of the trap files to look for. In this example,
you will find a file called t282632.000 in the DIAGPATH directory. On some platforms,
such as AIX, a core file may be generated as well.
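
The two steps above (extract the PID, then look for matching trap files) can be sketched in Python. The helper names dead_edu_pid and trap_files are hypothetical. The temporary directory only stands in for DIAGPATH so that the example runs anywhere, and the files created in it are empty placeholders, not real trap files.

```python
import re
import tempfile
from pathlib import Path

# Message text from the sqloEDUSIGCHLDHandler entry shown above.
diag_message = """\
Detected the death of an EDU with process id 282632
The signal number that terminated this process was 11
Look for trap files (t282632.*) in the dump directory
"""

def dead_edu_pid(message):
    """Extract the PID from the 'death of an EDU' line, or None if absent."""
    match = re.search(r"death of an EDU with process id (\d+)", message)
    return int(match.group(1)) if match else None

def trap_files(diagpath, pid):
    """List files named t<pid>.* directly under diagpath, sorted by name."""
    return sorted(Path(diagpath).glob(f"t{pid}.*"))

pid = dead_edu_pid(diag_message)
print(pid)

# Demonstration only: a temporary directory plays the role of DIAGPATH.
with tempfile.TemporaryDirectory() as diagpath:
    (Path(diagpath) / f"t{pid}.000").touch()
    (Path(diagpath) / "db2diag.log").touch()
    found = [path.name for path in trap_files(diagpath, pid)]
print(found)
```

On a real system, trap_files would be called with the instance's actual DIAGPATH value instead of the temporary directory.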

   
The trap file contains a stack traceback, that is, the functions that were on the
call stack of the process when it crashed.
