Thursday, 5 June 2014

Mysterious problem with archivelog replication

The starting point of this short story is as follows:
  • we have a system in an active-passive configuration
  • recently one of us performed a failover to the other node
  • the configuration had not actually changed and had been in use for some time; we checked it several times anyway and found no errors
  • the standby was recreated
  • applying archivelogs locally on the standby worked

The problem was that the primary database did not send archivelogs to the standby.
The entry in V$ARCHIVE_DEST_STATUS indicated a wrong unique name in the configuration. I am not sure I recall correctly, but I think ORA-16053 was listed there. Several of us went through the configuration more than once and did not spot anything wrong.
One of the ARCH processes on the primary was hanging while trying to send an archivelog from the incarnation before the failover, so we suspected that might be the cause, but killing it did not change anything.
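
For reference, the destination status mentioned above can be checked with a query along these lines. This is only a sketch, not the exact statement we ran: the helper name dest_status_query is made up for illustration, the 1..31 range assumes 11g Release 2 or later, and the code only builds the SQL text, which would then be run on the primary from SQL*Plus or any other client.

```python
def dest_status_query(dest_ids):
    """Build a query against V$ARCHIVE_DEST_STATUS for the given destinations.

    Hypothetical helper: it returns SQL text only and never connects
    to a database.
    """
    # Deduplicate and sort so the generated statement is stable.
    ids = sorted({int(d) for d in dest_ids})
    if not ids:
        raise ValueError("at least one destination number is required")
    # Assumption: LOG_ARCHIVE_DEST_1..31, as in 11g Release 2 and later.
    if any(d < 1 or d > 31 for d in ids):
        raise ValueError("destination numbers must be between 1 and 31")
    id_list = ", ".join(str(d) for d in ids)
    return (
        "SELECT dest_id, status, db_unique_name, error "
        "FROM v$archive_dest_status "
        f"WHERE dest_id IN ({id_list}) "
        "ORDER BY dest_id"
    )


if __name__ == "__main__":
    print(dest_status_query([2, 3]))
```

The ERROR column is where a message such as ORA-16053 would show up for a failing destination.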

However, even though the real cause remains a mystery, the solution was quite obvious and straightforward. A colleague of mine configured another archive destination: we disabled number 2 and enabled number 3, and the system returned to working properly.
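
The switch from destination 2 to destination 3 can be sketched roughly as below. Treat it as an illustration under assumptions rather than a copy of what we ran: the service name and unique name "stby" are placeholders, the attribute list of the new destination (ASYNC, VALID_FOR) is a typical one and not necessarily ours, SCOPE=BOTH assumes the instance uses an spfile, and switch_destination is a made-up helper that only produces SQL text.

```python
def switch_destination(old_id, new_id, service, db_unique_name):
    """Return ALTER SYSTEM statements that move redo shipping to a new destination.

    Hypothetical helper: the statements are returned as strings for review
    and are not executed.
    """
    if old_id == new_id:
        raise ValueError("old and new destination numbers must differ")
    # Assumed attribute list; adjust to match the real configuration.
    dest = (
        f"SERVICE={service} ASYNC "
        "VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) "
        f"DB_UNIQUE_NAME={db_unique_name}"
    )
    return [
        # 1. Define the new destination.
        f"ALTER SYSTEM SET log_archive_dest_{new_id}='{dest}' SCOPE=BOTH",
        # 2. Stop shipping through the old destination.
        f"ALTER SYSTEM SET log_archive_dest_state_{old_id}=DEFER SCOPE=BOTH",
        # 3. Make sure the new destination is enabled.
        f"ALTER SYSTEM SET log_archive_dest_state_{new_id}=ENABLE SCOPE=BOTH",
    ]


if __name__ == "__main__":
    for stmt in switch_destination(2, 3, "stby", "stby"):
        print(stmt + ";")
```

After running such statements on the primary, a log switch followed by another look at V$ARCHIVE_DEST_STATUS should show whether the new destination is shipping.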
