Monday, 31 August 2015

Error on auto execute of job SYS.ORA$_AT_SA_SPC_SY (external link)

Today I've got a ticket to service with the content more or less like the following:
ORA-12012: error on auto execute of job "SYS"."ORA$AT_SA_SPC_SY_1202"
ORA-20000: ORU-10027: buffer overflow, limit of 20000 bytes
ORA-06512: at "SYS.DBMS_ADVISOR", line 201
ORA-06512: at "SYS.DBMS_SPACE", line 2480
ORA-06512: at "SYS.DBMS_SPACE", line 2553

The issue is not a big deal for me - a problem with space advisor is not a critical thing IMHO, but by the way I found short nice notice on the subject

Tuesday, 16 June 2015

Restore points

Below You may find a 'session' from enabling restore point technology.
-- first try old a hacker method - just start do it ;-)
SQL> create restore point rp1 guarantee flashback database;
create restore point rp1 guarantee flashback database
ERROR at line 1:
ORA-38784: Cannot create restore point 'RP1'.
ORA-38785: Media recovery must be enabled for guaranteed restore point.
-- thus I see one needs to set db into ARCHIVELOG mode
-- despite that we need also settings for db_recovery_file_dest
-- though we do not need to enable flashback mode

-- creating pfile in order to edit settings
SQL> create pfile from spfile;

File created.

-- create directory for fast recovery area (FRA)
SQL> !mkdir [path]

-- add settings for FRA
SQL> !vim $ORACLE_HOME/dbs/initcdw.ora
SQL> create spfile from pfile;

File created.

-- now enabling ARCHIVELOG mode
SQL> startup nomount

ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size      2228200 bytes
Variable Size    503316504 bytes
Database Buffers  1627389952 bytes
Redo Buffers      4952064 bytes

-- create directory for archive logs
SQL> !mkdir /ora_csdb/oradata/CDW/arch

-- set local destination for archive logs to newly created directory
-- of course it was possible to add it to pfile when editing
-- in case of not setting a custom location the archive logs are stored
-- in $ORACLE_HOME/dbs
SQL> alter system set 
SQL> log_archive_dest_1= 'LOCATION=/ora_csdb/oradata/CDW/arch' 
SQL> ;

System altered.

-- switching to ARCHIVELOG mode
SQL> alter database mount;

Database altered.

SQL> alter database archivelog;

Database altered.

SQL> alter database open;

Database altered.

-- there is no switch of flashback on - so as You see this is quite minimal scenario
-- enabling flashback is not required but at a price - one cannot restore to
-- point in time between snapshots
-- recovery writer RVWR is started
-- as You may see, the technology makes use of flashback fascilites

-- creating a restore point
-- as You see now all works properly 
SQL> create  restore point rp1 guarantee flashback database;
Wed Jun 10 12:44:47 2015
Starting background process RVWR
Wed Jun 10 12:44:47 2015
RVWR started with pid=28, OS id=17240 
Allocated 8388608 bytes in shared pool for flashback generation buffer
Wed Jun 10 12:45:27 2015
Created guaranteed restore point RP1
Restore point created.

-- now preparations for a short test
SQL> conn / as sysdba
SQL> alter user rems identified by simple1;

User altered.

SQL> conn rems/simple1
SQL> create table t1 (a number)
  2  ;

Table created.

SQL> insert into t1 select object_id from dba_objects;

473175 rows created.

SQL> commit;

Commit complete.

-- now we return to the state marked by RP1
SQL> conn / as sysdba

-- we need to stop the database and mount it
SQL> flashback database to restore point rp1;
flashback database to restore point rp1
ERROR at line 1:
ORA-38757: Database must be mounted and not open to FLASHBACK.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size      2228200 bytes
Variable Size    503316504 bytes
Database Buffers  1627389952 bytes
Redo Buffers      4952064 bytes
Database mounted.
SQL> flashback database to restore point rp1;
Wed Jun 10 13:12:29 2015
Flashback Restore Start
Wed Jun 10 13:13:05 2015
Flashback Restore Complete
Flashback Media Recovery Start
 started logmerger process
Wed Jun 10 13:15:29 2015
db_recovery_file_dest_size of 10240 MB is 0.30% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Wed Jun 10 13:17:42 2015
Parallel Media Recovery started with 2 slaves
Wed Jun 10 13:17:43 2015
Recovery of Online Redo Log: Thread 1 Group 7 Seq 20393 Reading mem 0
  Mem# 0: /ora_csdb/oradata/CDW/onlinelog/o1_mf_7_97grvlnx_.log
  Mem# 1: /ora_csdb/oradata/CDW/onlinelog/o1_mf_7_97grvp66_.log
Incomplete Recovery applied until change 117062103544 time 06/10/2015 12:45:28
Flashback Media Recovery Complete
Wed Jun 10 13:20:53 2015
Completed: flashback database to restore point rp1
Flashback complete.

-- after flashback we have to open with RESETLOGS
SQL> alter database open
alter database open
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open

SQL> alter database open resetlogs;    

SQL> alter system set log_archive_format='cdw_%d_%r_%T_%S.arc';

-- %r, %T/%t, %S/%s have to be contained within the archive log format

Wednesday, 10 June 2015

Today's action


During the night we've got some messages from one of the databases' hosts with warning with message about storage space threshold has been crossed. This was a resource on which we put archivelogs, so quite important.


One of my colleagues started with launching manually standard script for backup. Normally such script is called by cron in scheduled moments. Unfortunately and unexpectedly the database has produced a massive number of archivelogs, on which the system has not been prepared at all - thus we saw that apparently there is something wrong with application processing. But then it went to the point the backup script has not been working fast enough to free storage space for further logs and due to some other circumstances jammed itself and backup system administrator has to kill it.

The diagnosis was done with the following tools:
  • OEM - with this we were able to find the malicious application and call for its restart
  • an ad hoc script showed that the problematic transactions frequently finished with rollback, which was the direct cause of massive archivelogs growth. The script allows for realizing the status of a transaction and its progress in the form of used_ublk (i.e. undo blocks used in the transaction). Subsequent calls show the change in number of used blocks - in case of rollbacks the value diminishes of course.
    set linesize 300 pagesize 1000
    col rollback for a25
    SELECT s.username,
           round(r.rssize/1024/1024) rs_sz_mb,
           when bitand(t.flag,power(2,7)) > 0 
           then 'rollback in progress'
           else 'other'
           end as "F Status"
    FROM   v$transaction t,
           v$session s,
           v$rollstat r,
           dba_rollback_segs rs
    WHERE  s.saddr = t.ses_addr
    AND    t.xidusn = r.usn
    AND   rs.segment_id = t.xidusn
    ORDER BY t.used_ublk DESC
Finally the application has been restarted, and the application behaviour returned to its normal pace.


Yet during the diagnosis phase I have started again the backup script to free the space, but the script finished with errors due to earlier move of some archivelogs to another location. I did not know at that time what happened with those logs and unable to contact my colleague, so crosschecked the archivelogs and run backup again. This time all has been working properly. In the meantime sysadmins added some 20G to the archivelog storage resource because still the archivelog production has outpaced the backup process. After the application restart there has been some 20 minutes when some transactions still has been rolled back, but after this time the used storage space has dropped to usual state.
The last part of clean-up was to synchronize rman catalog, register the moved archivelogs and backup them as well.
resync catalog ;
catalog start with '[path]';
backup archivelog sequence between 40556 and 40999 delete all input;

Thursday, 7 May 2015

Source of great tech papers (external link)

Cary Millsap's company Method-R published some tech papers - really interesting thing.

Wednesday, 6 May 2015

Asynch descriptor resize (external link)

Excellent analysis of the event in cases it is seen high in query time by Tanel Poder.
We have experienced this behaviour on some databases, but usually it is bound with a particular session, so we used to kill and repeat the action.

Thursday, 2 April 2015

Zabbix and Oracle DRCP

Here short note on possible usage of database resident connection pool feature with Zabbix. This feature allows for pooling database server processes (so they stay ready to use in opposition to standard dedicated processes, which are created at the connection start and destroyed at the end). While at the beginning it seemed to be a interesting issue, it has limited impact on the Zabbix performance.
The main reason for such situation is the fact the Zabbix server keeps persistent connections to a database - in such case DRCP would be a source of problems rather than a helper. Thus only the GUI part may be a subject to such configuration. Furthermore in future versions of Zabbix the database logic is planned to be located in the API, which in turn will be a part of the Zabbix server as well (while the GUI will call the API), and with such architecture the DRCP will be no longer an option.
Anyway this is still interesting feature assuming one makes an extensive use of the GUI, while on version below 3.X
-- this call creates a pool on a database side - first argument will 
-- become a pool name in the future (currently there is only 1 pool)
    null, minsize=5, maxsize=20, inactivity_timeout=>300, max_think_time=>600);
-- this call starts the pool
exec dbms_connection_pool.start_pool();
-- this call stops the pool
exec dbms_connection_pool.stop_pool();
-- calls like the one below allows pool parameters change
    null, param_name=>'session_cached_cursors', param_value=>'50');