Tuesday, March 6, 2018

ORA-00742: Log read detects lost write in thread %d sequence %d block %d

During real-time apply on one of my physical standby RAC databases, the managed recovery process (MRP) crashed with this error. The following entries appeared in the alert log:
CORRUPTION DETECTED: In redo blocks starting at block 169592count 142 for thread 4 sequence 157
Sat Jul 02 19:12:25 2016
MRP0: Background Media Recovery terminated with error 742

Errors in file /u01/app/oracle/diag/rdbms/mydg/mydg1/trace/mydg1_pr00_68237.trc:
ORA-00742: Log read detects lost write in thread %d sequence %d block %d
ORA-00312: online log 46 thread 4: '/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/srl6.f'
Sat Jul 02 19:12:25 2016

The cause appears to be that this standby redo log was on a local file system rather than on shared storage, as the path in the ORA-00312 line above shows. Managed recovery was running on instance 1, whereas this standby log belonged to instance 4 (thread 4) and resided on the local storage of instance 4. Instance 1 therefore could not access it, which caused the managed recovery process to crash.
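To spot a misplaced standby redo log like this one, the standby log groups can be listed together with their member paths. This is only a sketch of one possible check, using the standard V$STANDBY_LOG and V$LOGFILE views; any member whose path points to a local directory instead of shared storage is a candidate for the problem described above.

```sql
-- List every standby redo log group with its thread and member path.
-- A member under a local directory (such as .../dbs/srl6.f above)
-- is not visible to the other RAC instances.
SELECT s.thread#, s.group#, s.status, f.member
FROM   v$standby_log s
JOIN   v$logfile f ON f.group# = s.group#
ORDER  BY s.thread#, s.group#;
```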

To fix this, I simply dropped this standby redo log group, because the remaining standby redo log groups of all instances were already on shared storage. I then added a new standby redo log group with all of its members on shared storage.
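The drop and re-create steps might look roughly like the following. The group number (46) and thread (4) come from the alert log above. The '+DATA' disk group and the 512M size are placeholders I am assuming for illustration; use your own shared location and the same size as your online redo logs.

```sql
-- Stop managed recovery first. Skip this if MRP is not running
-- (in my case it had already crashed).
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

-- Drop the standby redo log group that sits on local storage.
ALTER DATABASE DROP STANDBY LOGFILE GROUP 46;

-- Re-create the group for thread 4 on shared storage.
-- '+DATA' and 512M are assumed placeholder values.
ALTER DATABASE ADD STANDBY LOGFILE THREAD 4 GROUP 46 ('+DATA') SIZE 512M;

-- Restart real-time apply.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;
```

If the drop fails because the group is still in use, it may need to be cleared or released first; that did not come up in the case described here.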

Note that in a RAC environment, all controlfiles, datafiles, redo log files and archived log files should be stored on shared storage accessible to every instance of the RAC database. If even one of these files is not accessible to all instances, you may run into various problems. For example, if a datafile is created on the local file system of one RAC node, the other instances cannot write to it, and the application receives an error. This can result in an instance crash if the datafile belongs to the SYSTEM, SYSAUX or current undo tablespace. Having controlfile(s) on the local file system of one node would prevent the other instances from starting. Similar issues arise if redo log files and archived logs are not stored on shared storage.
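As a rough sanity check, the database files can be scanned for names that do not look like they are on shared storage. The sketch below assumes that shared storage means ASM, so that every shared file name starts with '+'. If you use a cluster file system instead, replace the filter with a pattern that matches your shared mount point.

```sql
-- Report controlfiles, datafiles, tempfiles and redo log members
-- whose names do not start with '+' (assumption: shared storage is ASM).
SELECT 'CONTROLFILE' AS file_type, name AS file_name
FROM   v$controlfile
WHERE  name NOT LIKE '+%'
UNION ALL
SELECT 'DATAFILE', name
FROM   v$datafile
WHERE  name NOT LIKE '+%'
UNION ALL
SELECT 'TEMPFILE', name
FROM   v$tempfile
WHERE  name NOT LIKE '+%'
UNION ALL
SELECT 'LOGFILE (' || type || ')', member
FROM   v$logfile
WHERE  member NOT LIKE '+%';
```

An empty result suggests that these files are all in ASM. Archived log destinations are not covered by this query and need to be checked separately.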


