Wednesday, 7 August 2013

The problem with rebuilding a chain job

One of our ETL processes is based upon chain jobs and once we had a problem with it after moving to a test environment.
The part of the solution is a procedure, which completely erases the whole chain and builds it from scratch and one would think this should work no matter what. Yet we have got the ORA-24155 exception (rule string.string not in rule set string.string) during the trial to rebuild this chain.

Firstly I tried to add such a rule to the ruleset, but then there was another exception in another step, and another exception... and so on. Despite that the procedure worked longer and longer and at some point it simply run forever (from the standpoint of human being). Apparently there was something more in this.
One thing to realize is that rules have more general character, so they are used in few places in the Oracle RDBMS - with the special place in the Streams technology. The packages for the rules manipulation are mainly DBMS_RULE and DBMS_RULE_ADM, which may be not so obvious assuming all the job for chains is done with calls to DBMS_SCHEDULER.
The relationships between the catalog views regarding rules and rule sets are also not too clear as there are DBA_RULE_SETS and DBA_RULESETS for example. Yet by and large this is organized in the following way:
  • for every chain there is a rule set (or ruleset ;-)) defined, which gathers all the rules for this chain (DBA_SCHEDULER_CHAINS)
  • there are also rules for which there is a defined chain (DBA_SCHEDULER_CHAIN_RULES)
  • despite that there are also the dependencies defined between rule sets and rules (visible in DBA_RULE_SET_RULES)
  • also rule sets are enlisted in DBA_RULE_SETS and DBA_RULESETS (with slightly different information)
What is crucial here is that both rule set and rules are bound with a chain. So what is the problem core? The chain drop operation. Apparently there are situations when not all the rules are dropped with dropped chain, while our code does exactly this - calls DBMS_SCHEDULER.DROP_CHAIN assuming all the dependent objects will be gone.
Now if in next step the procedure tries to build the chain anew, it creates a new rule set bound to the chain and new rules. It seems to me however, that in case of rules it is acceptable for the DBMS_SCHEDULER code to find the already existing rules. But then they have a different rule set than the one created anew, so we get the ORA-24155 exception.

And the solution? In this case I just removed all the rules defined by the chain now and in the past with calls to DBMS_RULE_ADM and run the code again. This time all went well and the chain was created.

No comments: